Opened 7 years ago
Closed 6 years ago
#16127 closed defect (fixed)
Fix comparison of PARI objects
Reported by: | pbruin | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | sage-6.4 |
Component: | interfaces | Keywords: | pari comparison |
Cc: | jdemeyer, was | Merged in: | |
Authors: | Peter Bruin | Reviewers: | Jeroen Demeyer |
Report Upstream: | N/A | Work issues: | |
Branch: | 8e5feb1 (Commits) | Commit: | 8e5feb10a893e872f463a6e15aa64f17de203720 |
Dependencies: | Stopgaps: |
Description (last modified by )
Comparison of PARI rationals is seriously broken:
sage: pari(1/2) < pari(1/3) True sage: pari(1) < pari(1/2) True
Furthermore,the following PARI series and p-adics are incorrectly considered to be non-zero by ==
and !=
:
sage: a = pari('O(x)'); a, a.type() (O(x), 't_SER') sage: b = pari('O(2)'); b, b.type() (O(2), 't_PADIC') sage: a.gequal(0) True sage: b.gequal(0) True sage: a == 0 False # should be True sage: b == 0 False # should be True
These bugs are caused by __cmp__()
calling the function gcmp_sage()
in sage/libs/pari/misc.h
(custom replacement for PARI's gcmp()
), which uses string comparison for all types other than t_INT
and t_REAL
.
A possible solution is to implement gen._richcmp_c_impl()
and to fix gen._cmp_c_impl()
to use the standard PARI gcmp()
instead of gcmp_sage()
. This requires catching PARI "forbidden comparison" errors. There is actually a non-negligible speed-up if no such error occurs, but a substantial penalty if an error does occur.
Change History (33)
comment:1 Changed 7 years ago by
- Milestone changed from sage-6.2 to sage-6.3
comment:2 Changed 6 years ago by
- Milestone changed from sage-6.3 to sage-6.4
comment:3 Changed 6 years ago by
- Branch set to u/pbruin/16127-gcmp
- Cc jdemeyer added
- Commit set to 9f3f4ea5223abfcedd5aae984129d4f27f68e1d0
- Description modified (diff)
- Priority changed from minor to critical
- Summary changed from Comparison of PARI series and p-adics is broken to Fix comparison of PARI objects
comment:4 Changed 6 years ago by
- Commit changed from 9f3f4ea5223abfcedd5aae984129d4f27f68e1d0 to a119b00099787f5cb8c3cc8477119184e16806ee
comment:5 follow-ups: ↓ 6 ↓ 7 Changed 6 years ago by
- Cc was added
Here are some timings for this experimental branch:
a = pari(1) b = pari(1/2) c = pari(3.0) d = pari('x') e = pari('O(x)') f = pari('O(2)') k.<t> = FiniteField(29^10) u = t^2 Timings with %timeit -c -r 1 cmp cmp == == < < before after before after before after (a, a) 189 ns 173 ns 151 ns 144 ns 156 ns 154 ns (a, b) 1.38 µs 292 ns 568 ns 284 ns 608 ns 208 ns (a, c) 328 ns 292 ns 171 ns 276 ns 166 ns 204 ns (a, d) 1.06 µs 22.4 µs 500 ns 312 ns 572 ns 14.4 µs (a, e) 968 ns 16.6 µs 512 ns 352 ns 456 ns 13.7 µs (a, f) 1.23 µs 15.6 µs 580 ns 448 ns 532 ns 15.4 µs (b, b) 175 ns 174 ns 748 ns 153 ns 864 ns 216 ns (b, c) 3.96 µs 340 ns 1.59 µs 356 ns 1.42 µs 224 ns (b, d) 1.44 µs 26.4 µs 748 ns 284 ns 696 ns 13.8 µs (b, e) 1.47 µs 23.2 µs 796 ns 296 ns 624 ns 15.6 µs (b, f) 1.61 µs 26 µs 824 ns 432 ns 696 ns 21.2 µs (c, c) 176 ns 178 ns 153 ns 171 ns 140 ns 161 ns (c, d) 3.56 µs 26 µs 1.19 µs 280 ns 1.33 µs 21.2 µs (c, e) 3.04 µs 27.6 µs 1.24 µs 248 ns 1.18 µs 19.6 µs (c, f) 3.12 µs 22.4 µs 1.83 µs 360 ns 1.5 µs 13.7 µs (d, d) 182 ns 180 ns 556 ns 158 ns 492 ns 13.4 µs (d, e) 1.96 µs 19.2 µs 548 ns 236 ns 540 ns 21.2 µs (d, f) 2.28 µs 18.8 µs 672 ns 284 ns 640 ns 22 µs (e, e) 169 ns 174 ns 496 ns 148 ns 532 ns 23.2 µs (e, f) 2.16 µs 19.6 µs 620 ns 340 ns 548 ns 17.2 µs (f, f) 179 ns 166 ns 656 ns 160 ns 652 ns 22.8 µs (t, t) 166 ns 179 ns 568 ns 588 ns 708 ns 632 ns (t, u) 1.48 µs 1.36 µs 648 ns 632 ns 648 ns 656 ns
There is a nice speedup in most cases that do not involve a PariError
, including practically all equality testings. However, the slowdown in the cases where it does occur is probably not acceptable. Maybe it is better to write a variant of gcmp_sage()
that calls gcmp()
and catches errors using PARI's pari_CATCH
and pari_TRY
macros so that we don't have to raise Python exceptions.
comment:6 in reply to: ↑ 5 Changed 6 years ago by
Replying to pbruin:
so that we don't have to raise Python exceptions.
Is it really raising the exception or it is handling the exception which takes a long time? If it's the second case, we might try some Cython machinery to speed up handling the exception.
comment:7 in reply to: ↑ 5 ; follow-up: ↓ 8 Changed 6 years ago by
Replying to pbruin:
Maybe it is better to write a variant of
gcmp_sage()
that callsgcmp()
and catches errors using PARI'spari_CATCH
andpari_TRY
macros so that we don't have to raise Python exceptions.
If you do this, I would prefer to see this function in the PARI library (call it gcmp_any()
for example). Hopefully, this can be done in a way which is acceptable for upstream.
comment:8 in reply to: ↑ 7 Changed 6 years ago by
Replying to jdemeyer:
Replying to pbruin:
Maybe it is better to write a variant of
gcmp_sage()
that callsgcmp()
and catches errors using PARI'spari_CATCH
andpari_TRY
macros so that we don't have to raise Python exceptions.If you do this, I would prefer to see this function in the PARI library (call it
gcmp_any()
for example). Hopefully, this can be done in a way which is acceptable for upstream.
I ended up practically copying the existing (static) gen2.c:gequal_try()
and calling it gcmp_try()
. PARI already has another (somewhat different) universal comparison function cmp_universal()
; I don't think the PARI library currently has a use for this gcmp_try()
.
New branch coming soon, but here are the new timings (same notation as in comment:5), which leave nothing more to wish for in my opinion:
cmp cmp == == < < before after before after before after (a, a) 189 ns 176 ns 151 ns 142 ns 156 ns 156 ns (a, b) 1.38 µs 324 ns 568 ns 520 ns 608 ns 212 ns (a, c) 328 ns 276 ns 171 ns 272 ns 166 ns 178 ns (a, d) 1.06 µs 984 ns 500 ns 284 ns 572 ns 300 ns (a, e) 968 ns 1.27 µs 512 ns 224 ns 456 ns 332 ns (a, f) 1.23 µs 1.24 µs 580 ns 408 ns 532 ns 336 ns (b, b) 175 ns 164 ns 748 ns 157 ns 864 ns 216 ns (b, c) 3.96 µs 348 ns 1.59 µs 348 ns 1.42 µs 232 ns (b, d) 1.44 µs 1.38 µs 748 ns 336 ns 696 ns 352 ns (b, e) 1.47 µs 1 µs 796 ns 264 ns 624 ns 336 ns (b, f) 1.61 µs 1.24 µs 824 ns 404 ns 696 ns 480 ns (c, c) 176 ns 180 ns 153 ns 156 ns 140 ns 186 ns (c, d) 3.56 µs 2.52 µs 1.19 µs 288 ns 1.33 µs 348 ns (c, e) 3.04 µs 2.72 µs 1.24 µs 216 ns 1.18 µs 324 ns (c, f) 3.12 µs 2.52 µs 1.83 µs 288 ns 1.5 µs 292 ns (d, d) 182 ns 175 ns 556 ns 156 ns 492 ns 296 ns (d, e) 1.96 µs 1 µs 548 ns 232 ns 540 ns 328 ns (d, f) 2.28 µs 1.29 µs 672 ns 288 ns 640 ns 312 ns (e, e) 169 ns 176 ns 496 ns 153 ns 532 ns 328 ns (e, f) 2.16 µs 992 ns 620 ns 296 ns 548 ns 296 ns (f, f) 179 ns 181 ns 656 ns 158 ns 652 ns 328 ns (t, t) 166 ns 180 ns 568 ns 688 ns 708 ns 752 ns (t, u) 1.48 µs 1.48 µs 648 ns 788 ns 648 ns 772 ns
comment:9 follow-up: ↓ 11 Changed 6 years ago by
If you're adding patches to PARI, please be aware of #16997 and make them compatible with PARI-2.8.
comment:10 Changed 6 years ago by
- Branch changed from u/pbruin/16127-gcmp to u/pbruin/16127-pari_comparison
- Commit changed from a119b00099787f5cb8c3cc8477119184e16806ee to 381b2fc91b003d69ba16931953ba0f9dc50a6205
- Status changed from new to needs_review
comment:11 in reply to: ↑ 9 Changed 6 years ago by
Replying to jdemeyer:
If you're adding patches to PARI, please be aware of #16997 and make them compatible with PARI-2.8.
It seemed to be best to add the new functions gcmp_try()
and gcmp_string()
as inline functions to sage/libs/pari/misc.h
, replacing gcmp_sage()
. There should be no problem with newer PARI versions, since I adapted gcmp_try()
from gequal_try()
in a recent development version.
comment:12 Changed 6 years ago by
- Reviewers set to Jeroen Demeyer
comment:13 Changed 6 years ago by
Reviewer patch ready, currently running make ptestlong
...
comment:14 Changed 6 years ago by
- Branch changed from u/pbruin/16127-pari_comparison to u/jdemeyer/ticket/16127
- Created changed from 04/11/14 00:03:03 to 04/11/14 00:03:03
- Modified changed from 09/19/14 19:02:15 to 09/19/14 19:02:15
comment:15 Changed 6 years ago by
- Commit changed from 381b2fc91b003d69ba16931953ba0f9dc50a6205 to 782b83a7ac3a06a1d7de779a3f72de40fbc73d0b
New commits:
782b83a | Make <= and >= return True for equal PARI objects
|
comment:16 Changed 6 years ago by
- Commit changed from 782b83a7ac3a06a1d7de779a3f72de40fbc73d0b to 12fbcad65c6438fe0495c3b2424468f24b49a9ee
Branch pushed to git repo; I updated commit sha1. New commits:
12fbcad | Change doctest
|
comment:17 Changed 6 years ago by
All doctests pass now. If you're happy with my changes, you can set this ticket to positive_review.
comment:18 follow-up: ↓ 22 Changed 6 years ago by
Why do you make <=
and >=
return True
for equal objects? To me it doesn't make a lot of sense. I can see that you might want that a == b
implies a <= b
, but in my opinion asking whether a <= b
only makes sense if a
and b
are in some partially ordered set (like the real numbers with plus and minus infinity).
Neither PARI nor Sage returns true for [0] <= 0
(while they disagree on [0] == 0
):
gp > [0] <= 0 *** at top-level: [0]<=0 *** ^--- *** _<=_: forbidden comparison t_VEC (1 elts) , t_INT. gp > [0] == 0 %1 = 1
sage: [0] <= 0 False sage: [0] == 0 False
Note also the following from the Python documentation: "There are no implied relationships among the comparison operators. The truth of x==y
does not imply that x!=y
is false." Likewise, I don't think we should insist that the truth of x==y
implies that of x<=y
.
comment:19 Changed 6 years ago by
By the way, I think that we should in the end (but on a different ticket) get rid of gcmp_string()
and make _cmp_c_impl()
a wrapper for PARI's cmp_universal()
. This is faster and defines a total ordering, while being mathematically no more or less sensible than string comparison. As far as I know, once you have the rich comparison operators, the only real remaining use for cmp()
is sorting arbitrary lists; there is probably no strong reason why it should be compatible with the individual comparison operators.
comment:20 follow-up: ↓ 21 Changed 6 years ago by
My knowledge of various C standards is less than optimal, but is the following (variable declaration after statement) generally accepted?
sig_block(); char *a = GENtostr(x);
comment:21 in reply to: ↑ 20 Changed 6 years ago by
Replying to pbruin:
My knowledge of various C standards is less than optimal, but is the following (variable declaration after statement) generally accepted?
Yes, that's valid C89.
comment:22 in reply to: ↑ 18 ; follow-up: ↓ 23 Changed 6 years ago by
- Status changed from needs_review to needs_info
Replying to pbruin:
in my opinion asking whether
a <= b
only makes sense ifa
andb
are in some partially ordered set (like the real numbers with plus and minus infinity).
In that case, we should give an exception when asking for a <= b
. I think returning False is more wrong than returning True when the objects are equal.
Note also the following from the Python documentation: "There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false." Likewise, I don't think we should insist that the truth of x==y implies that of x<=y.
Sure, there is no implication, but I find it strange to have 2 objects such that x == y
is True but x <= y
is False.
comment:23 in reply to: ↑ 22 Changed 6 years ago by
Replying to jdemeyer:
Replying to pbruin:
in my opinion asking whether
a <= b
only makes sense ifa
andb
are in some partially ordered set (like the real numbers with plus and minus infinity).In that case, we should give an exception when asking for
a <= b
. I think returning False is more wrong than returning True when the objects are equal.
OK, I think raising an error is conceptually and technically the right solution. Earlier I wrongly convinced myself that comparison operators should never fail, but since they can, it makes sense to keep the behaviour of gen
objects under comparison as close as possible to what PARI does, if only for efficiency reasons.
It turns out that there are only a few places where Sage tries to sort unordered PARI objects (polynomials, in fact). We can basically use list.sort(cmp=cmp)
to force the use of __cmp__()
instead of the rich comparison operators. In two cases (enumerate_totallyreal_fields_*
) it is a bit more tricky, but after the necessary modifications the code actually becomes cleaner in my opinion (and not measurably slower).
Now testing if this approach works, and if it keeps working after experimentally replacing string comparison by PARI's cmp_universal()
.
comment:24 Changed 6 years ago by
- Branch changed from u/jdemeyer/ticket/16127 to u/pbruin/16127-pari_comparison
- Commit changed from 12fbcad65c6438fe0495c3b2424468f24b49a9ee to 0e0b4da393fa63c54e6f8c3ade0309ec198c2ffe
- Status changed from needs_info to needs_review
(comment:23 explains the last commit)
comment:25 follow-up: ↓ 26 Changed 6 years ago by
I'm slightly worried about the fact that sorting using cmp
is not supported in Python 3. It seems the Python 3 way is to define either a sorting key (which isn't applicable here) or just use the rich comparison operators. This would be a reason to keep richcmp
and cmp
compatible.
comment:26 in reply to: ↑ 25 Changed 6 years ago by
Replying to jdemeyer:
I'm slightly worried about the fact that sorting using
cmp
is not supported in Python 3.
I find this a strange choice, but I guess we'll have to live with it...
It seems the Python 3 way is to define either a sorting key (which isn't applicable here) or just use the rich comparison operators.
There are various solutions:
- Python (3, at least) has a function
cmp_to_key
that converts acmp
function into akey
function, so instead ofcmp=cmpfunc
you can specifykey=cmp_to_key(cmpfunc)
. (The key returned bycmp_to_key(cmpfunc)
for a given objectobj
is a new object wrappingobj
with comparison operators implemented usingcmpfunc
.) This looks artificial and inefficient. - When sorting polynomials (which are the only kind of non-ordered PARI objects that is currently sorted somewhere in the Sage library), one can specify
key=lambda f: f.list()
. - Any list of PARI objects can be sorted by converting it into a
t_VEC
and using one of several built-in PARI sorting functions.
I don't think the disappearance of the cmp
keyword for sorting is serious enough that it forces us to keep __cmp__()
and __richcmp__()
consistent. (In principle, if we adopt 2. or 3. above, we could even think about removing gen._cmp_c_impl()
, so the generic Element
code automatically falls back to using rich comparison to implement __cmp__()
. I'm not sure if this is a good idea, though.)
[Edit: in fact, Python 3 completely removes the built-in function `cmp()` and the special method `__cmp__()`.]
comment:27 follow-up: ↓ 28 Changed 6 years ago by
Agreed with your last comment.
In the code for totally real fields, I wonder whether it is really needed to sort on "polynomials". Sorting by discriminant seems sensible (and that's also what is documented), but since there is no meaningful comparison for polynomials, why bother in the first place? So my proposal would be to replace
S.sort(cmp=lambda x, y: cmp(x[0], y[0]) or cmp(x[1], y[1]))
by
S.sort(key=lambda x: x[0])
comment:28 in reply to: ↑ 27 Changed 6 years ago by
Replying to jdemeyer:
In the code for totally real fields, I wonder whether it is really needed to sort on "polynomials".
Well, I just tried to change the existing code (which does the sorting in this way) as little as possible. Also, after #17026 (and to a certain extent already when using string comparison), comparison of polynomials isn't totally meaningless either: they are first sorted by degree and then lexicographically.
comment:29 Changed 6 years ago by
- Status changed from needs_review to positive_review
I don't agree completely with everything you said, but it's fine for me.
comment:30 Changed 6 years ago by
- Status changed from positive_review to needs_work
Doctest failure on 32-bit:
sage -t --warn-long 61.0 src/sage/libs/pari/gen.pyx ********************************************************************** File "src/sage/libs/pari/gen.pyx", line 5961, in sage.libs.pari.gen.gen.elleta Failed example: w1*eta2 - w2*eta1 - pari(2*pi*I) == 0 Expected: True Got: False **********************************************************************
I think it's best to just remove that test.
comment:31 Changed 6 years ago by
- Commit changed from 0e0b4da393fa63c54e6f8c3ade0309ec198c2ffe to 8e5feb10a893e872f463a6e15aa64f17de203720
Branch pushed to git repo; I updated commit sha1. New commits:
8e5feb1 | Trac 16127: remove doctest
|
comment:32 Changed 6 years ago by
- Status changed from needs_work to positive_review
comment:33 Changed 6 years ago by
- Branch changed from u/pbruin/16127-pari_comparison to 8e5feb10a893e872f463a6e15aa64f17de203720
- Resolution set to fixed
- Status changed from positive_review to closed
The attached branch is just a first attempt; it needs documentation, and we have to time some important comparisons and make sure that there is no significant speed penalty in realistic use scenarios.