Opened 7 years ago

Closed 6 years ago

#16127 closed defect (fixed)

Fix comparison of PARI objects

Reported by: pbruin Owned by:
Priority: critical Milestone: sage-6.4
Component: interfaces Keywords: pari comparison
Cc: jdemeyer, was Merged in:
Authors: Peter Bruin Reviewers: Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: 8e5feb1 (Commits) Commit: 8e5feb10a893e872f463a6e15aa64f17de203720
Dependencies: Stopgaps:

Description (last modified by pbruin)

Comparison of PARI rationals is seriously broken:

sage: pari(1/2) < pari(1/3)
True
sage: pari(1) < pari(1/2)
True

Furthermore,the following PARI series and p-adics are incorrectly considered to be non-zero by == and !=:

sage: a = pari('O(x)'); a, a.type()
(O(x), 't_SER')
sage: b = pari('O(2)'); b, b.type()
(O(2), 't_PADIC')
sage: a.gequal(0)
True
sage: b.gequal(0)
True
sage: a == 0
False  # should be True
sage: b == 0
False  # should be True

These bugs are caused by __cmp__() calling the function gcmp_sage() in sage/libs/pari/misc.h (custom replacement for PARI's gcmp()), which uses string comparison for all types other than t_INT and t_REAL.

A possible solution is to implement gen._richcmp_c_impl() and to fix gen._cmp_c_impl() to use the standard PARI gcmp() instead of gcmp_sage(). This requires catching PARI "forbidden comparison" errors. There is actually a non-negligible speed-up if no such error occurs, but a substantial penalty if an error does occur.

Change History (33)

comment:1 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:2 Changed 6 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:3 Changed 6 years ago by pbruin

  • Branch set to u/pbruin/16127-gcmp
  • Cc jdemeyer added
  • Commit set to 9f3f4ea5223abfcedd5aae984129d4f27f68e1d0
  • Description modified (diff)
  • Priority changed from minor to critical
  • Summary changed from Comparison of PARI series and p-adics is broken to Fix comparison of PARI objects

The attached branch is just a first attempt; it needs documentation, and we have to time some important comparisons and make sure that there is no significant speed penalty in realistic use scenarios.

comment:4 Changed 6 years ago by git

  • Commit changed from 9f3f4ea5223abfcedd5aae984129d4f27f68e1d0 to a119b00099787f5cb8c3cc8477119184e16806ee

Branch pushed to git repo; I updated commit sha1. New commits:

594c546Trac 16127: remove gcmp_sage()
a119b00Trac 16127: speed up string comparison

comment:5 follow-ups: Changed 6 years ago by pbruin

  • Cc was added

Here are some timings for this experimental branch:

a = pari(1)
b = pari(1/2)
c = pari(3.0)
d = pari('x')
e = pari('O(x)')
f = pari('O(2)')
k.<t> = FiniteField(29^10)
u = t^2

Timings with %timeit -c -r 1

              cmp      cmp         ==       ==          <        <
           before    after     before    after     before    after

(a, a)     189 ns   173 ns     151 ns   144 ns     156 ns   154 ns
(a, b)    1.38 µs   292 ns     568 ns   284 ns     608 ns   208 ns
(a, c)     328 ns   292 ns     171 ns   276 ns     166 ns   204 ns
(a, d)    1.06 µs  22.4 µs     500 ns   312 ns     572 ns  14.4 µs
(a, e)     968 ns  16.6 µs     512 ns   352 ns     456 ns  13.7 µs
(a, f)    1.23 µs  15.6 µs     580 ns   448 ns     532 ns  15.4 µs
(b, b)     175 ns   174 ns     748 ns   153 ns     864 ns   216 ns
(b, c)    3.96 µs   340 ns    1.59 µs   356 ns    1.42 µs   224 ns
(b, d)    1.44 µs  26.4 µs     748 ns   284 ns     696 ns  13.8 µs
(b, e)    1.47 µs  23.2 µs     796 ns   296 ns     624 ns  15.6 µs
(b, f)    1.61 µs    26 µs     824 ns   432 ns     696 ns  21.2 µs
(c, c)     176 ns   178 ns     153 ns   171 ns     140 ns   161 ns
(c, d)    3.56 µs    26 µs    1.19 µs   280 ns    1.33 µs  21.2 µs
(c, e)    3.04 µs  27.6 µs    1.24 µs   248 ns    1.18 µs  19.6 µs
(c, f)    3.12 µs  22.4 µs    1.83 µs   360 ns     1.5 µs  13.7 µs
(d, d)     182 ns   180 ns     556 ns   158 ns     492 ns  13.4 µs
(d, e)    1.96 µs  19.2 µs     548 ns   236 ns     540 ns  21.2 µs
(d, f)    2.28 µs  18.8 µs     672 ns   284 ns     640 ns    22 µs
(e, e)     169 ns   174 ns     496 ns   148 ns     532 ns  23.2 µs
(e, f)    2.16 µs  19.6 µs     620 ns   340 ns     548 ns  17.2 µs
(f, f)     179 ns   166 ns     656 ns   160 ns     652 ns  22.8 µs
(t, t)     166 ns   179 ns     568 ns   588 ns     708 ns   632 ns
(t, u)    1.48 µs  1.36 µs     648 ns   632 ns     648 ns   656 ns

There is a nice speedup in most cases that do not involve a PariError, including practically all equality testings. However, the slowdown in the cases where it does occur is probably not acceptable. Maybe it is better to write a variant of gcmp_sage() that calls gcmp() and catches errors using PARI's pari_CATCH and pari_TRY macros so that we don't have to raise Python exceptions.

comment:6 in reply to: ↑ 5 Changed 6 years ago by jdemeyer

Replying to pbruin:

so that we don't have to raise Python exceptions.

Is it really raising the exception or it is handling the exception which takes a long time? If it's the second case, we might try some Cython machinery to speed up handling the exception.

comment:7 in reply to: ↑ 5 ; follow-up: Changed 6 years ago by jdemeyer

Replying to pbruin:

Maybe it is better to write a variant of gcmp_sage() that calls gcmp() and catches errors using PARI's pari_CATCH and pari_TRY macros so that we don't have to raise Python exceptions.

If you do this, I would prefer to see this function in the PARI library (call it gcmp_any() for example). Hopefully, this can be done in a way which is acceptable for upstream.

comment:8 in reply to: ↑ 7 Changed 6 years ago by pbruin

Replying to jdemeyer:

Replying to pbruin:

Maybe it is better to write a variant of gcmp_sage() that calls gcmp() and catches errors using PARI's pari_CATCH and pari_TRY macros so that we don't have to raise Python exceptions.

If you do this, I would prefer to see this function in the PARI library (call it gcmp_any() for example). Hopefully, this can be done in a way which is acceptable for upstream.

I ended up practically copying the existing (static) gen2.c:gequal_try() and calling it gcmp_try(). PARI already has another (somewhat different) universal comparison function cmp_universal(); I don't think the PARI library currently has a use for this gcmp_try().

New branch coming soon, but here are the new timings (same notation as in comment:5), which leave nothing more to wish for in my opinion:

              cmp      cmp         ==       ==          <        <
           before    after     before    after     before    after

(a, a)     189 ns   176 ns     151 ns   142 ns     156 ns   156 ns
(a, b)    1.38 µs   324 ns     568 ns   520 ns     608 ns   212 ns
(a, c)     328 ns   276 ns     171 ns   272 ns     166 ns   178 ns
(a, d)    1.06 µs   984 ns     500 ns   284 ns     572 ns   300 ns
(a, e)     968 ns  1.27 µs     512 ns   224 ns     456 ns   332 ns
(a, f)    1.23 µs  1.24 µs     580 ns   408 ns     532 ns   336 ns
(b, b)     175 ns   164 ns     748 ns   157 ns     864 ns   216 ns
(b, c)    3.96 µs   348 ns    1.59 µs   348 ns    1.42 µs   232 ns
(b, d)    1.44 µs  1.38 µs     748 ns   336 ns     696 ns   352 ns
(b, e)    1.47 µs     1 µs     796 ns   264 ns     624 ns   336 ns
(b, f)    1.61 µs  1.24 µs     824 ns   404 ns     696 ns   480 ns
(c, c)     176 ns   180 ns     153 ns   156 ns     140 ns   186 ns
(c, d)    3.56 µs  2.52 µs    1.19 µs   288 ns    1.33 µs   348 ns
(c, e)    3.04 µs  2.72 µs    1.24 µs   216 ns    1.18 µs   324 ns
(c, f)    3.12 µs  2.52 µs    1.83 µs   288 ns     1.5 µs   292 ns
(d, d)     182 ns   175 ns     556 ns   156 ns     492 ns   296 ns
(d, e)    1.96 µs     1 µs     548 ns   232 ns     540 ns   328 ns
(d, f)    2.28 µs  1.29 µs     672 ns   288 ns     640 ns   312 ns
(e, e)     169 ns   176 ns     496 ns   153 ns     532 ns   328 ns
(e, f)    2.16 µs   992 ns     620 ns   296 ns     548 ns   296 ns
(f, f)     179 ns   181 ns     656 ns   158 ns     652 ns   328 ns
(t, t)     166 ns   180 ns     568 ns   688 ns     708 ns   752 ns
(t, u)    1.48 µs  1.48 µs     648 ns   788 ns     648 ns   772 ns

comment:9 follow-up: Changed 6 years ago by jdemeyer

If you're adding patches to PARI, please be aware of #16997 and make them compatible with PARI-2.8.

comment:10 Changed 6 years ago by pbruin

  • Authors set to Peter Bruin
  • Branch changed from u/pbruin/16127-gcmp to u/pbruin/16127-pari_comparison
  • Commit changed from a119b00099787f5cb8c3cc8477119184e16806ee to 381b2fc91b003d69ba16931953ba0f9dc50a6205
  • Status changed from new to needs_review

comment:11 in reply to: ↑ 9 Changed 6 years ago by pbruin

Replying to jdemeyer:

If you're adding patches to PARI, please be aware of #16997 and make them compatible with PARI-2.8.

It seemed to be best to add the new functions gcmp_try() and gcmp_string() as inline functions to sage/libs/pari/misc.h, replacing gcmp_sage(). There should be no problem with newer PARI versions, since I adapted gcmp_try() from gequal_try() in a recent development version.

comment:12 Changed 6 years ago by jdemeyer

  • Reviewers set to Jeroen Demeyer

comment:13 Changed 6 years ago by jdemeyer

Reviewer patch ready, currently running make ptestlong...

comment:14 Changed 6 years ago by jdemeyer

  • Branch changed from u/pbruin/16127-pari_comparison to u/jdemeyer/ticket/16127
  • Created changed from 04/11/14 00:03:03 to 04/11/14 00:03:03
  • Modified changed from 09/19/14 19:02:15 to 09/19/14 19:02:15

comment:15 Changed 6 years ago by jdemeyer

  • Commit changed from 381b2fc91b003d69ba16931953ba0f9dc50a6205 to 782b83a7ac3a06a1d7de779a3f72de40fbc73d0b

New commits:

782b83aMake <= and >= return True for equal PARI objects

comment:16 Changed 6 years ago by git

  • Commit changed from 782b83a7ac3a06a1d7de779a3f72de40fbc73d0b to 12fbcad65c6438fe0495c3b2424468f24b49a9ee

Branch pushed to git repo; I updated commit sha1. New commits:

12fbcadChange doctest

comment:17 Changed 6 years ago by jdemeyer

All doctests pass now. If you're happy with my changes, you can set this ticket to positive_review.

comment:18 follow-up: Changed 6 years ago by pbruin

Why do you make <= and >= return True for equal objects? To me it doesn't make a lot of sense. I can see that you might want that a == b implies a <= b, but in my opinion asking whether a <= b only makes sense if a and b are in some partially ordered set (like the real numbers with plus and minus infinity).

Neither PARI nor Sage returns true for [0] <= 0 (while they disagree on [0] == 0):

gp > [0] <= 0
  ***   at top-level: [0]<=0
  ***                    ^---
  *** _<=_: forbidden comparison t_VEC (1 elts) , t_INT.
gp > [0] == 0
%1 = 1
sage: [0] <= 0
False
sage: [0] == 0
False

Note also the following from the Python documentation: "There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false." Likewise, I don't think we should insist that the truth of x==y implies that of x<=y.

comment:19 Changed 6 years ago by pbruin

By the way, I think that we should in the end (but on a different ticket) get rid of gcmp_string() and make _cmp_c_impl() a wrapper for PARI's cmp_universal(). This is faster and defines a total ordering, while being mathematically no more or less sensible than string comparison. As far as I know, once you have the rich comparison operators, the only real remaining use for cmp() is sorting arbitrary lists; there is probably no strong reason why it should be compatible with the individual comparison operators.

comment:20 follow-up: Changed 6 years ago by pbruin

My knowledge of various C standards is less than optimal, but is the following (variable declaration after statement) generally accepted?

sig_block();
char *a = GENtostr(x);

comment:21 in reply to: ↑ 20 Changed 6 years ago by jdemeyer

Replying to pbruin:

My knowledge of various C standards is less than optimal, but is the following (variable declaration after statement) generally accepted?

Yes, that's valid C89.

comment:22 in reply to: ↑ 18 ; follow-up: Changed 6 years ago by jdemeyer

  • Status changed from needs_review to needs_info

Replying to pbruin:

in my opinion asking whether a <= b only makes sense if a and b are in some partially ordered set (like the real numbers with plus and minus infinity).

In that case, we should give an exception when asking for a <= b. I think returning False is more wrong than returning True when the objects are equal.

Note also the following from the ​Python documentation: "There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false." Likewise, I don't think we should insist that the truth of x==y implies that of x<=y.

Sure, there is no implication, but I find it strange to have 2 objects such that x == y is True but x <= y is False.

comment:23 in reply to: ↑ 22 Changed 6 years ago by pbruin

Replying to jdemeyer:

Replying to pbruin:

in my opinion asking whether a <= b only makes sense if a and b are in some partially ordered set (like the real numbers with plus and minus infinity).

In that case, we should give an exception when asking for a <= b. I think returning False is more wrong than returning True when the objects are equal.

OK, I think raising an error is conceptually and technically the right solution. Earlier I wrongly convinced myself that comparison operators should never fail, but since they can, it makes sense to keep the behaviour of gen objects under comparison as close as possible to what PARI does, if only for efficiency reasons.

It turns out that there are only a few places where Sage tries to sort unordered PARI objects (polynomials, in fact). We can basically use list.sort(cmp=cmp) to force the use of __cmp__() instead of the rich comparison operators. In two cases (enumerate_totallyreal_fields_*) it is a bit more tricky, but after the necessary modifications the code actually becomes cleaner in my opinion (and not measurably slower).

Now testing if this approach works, and if it keeps working after experimentally replacing string comparison by PARI's cmp_universal().

comment:24 Changed 6 years ago by pbruin

  • Branch changed from u/jdemeyer/ticket/16127 to u/pbruin/16127-pari_comparison
  • Commit changed from 12fbcad65c6438fe0495c3b2424468f24b49a9ee to 0e0b4da393fa63c54e6f8c3ade0309ec198c2ffe
  • Status changed from needs_info to needs_review

(comment:23 explains the last commit)

Last edited 6 years ago by pbruin (previous) (diff)

comment:25 follow-up: Changed 6 years ago by jdemeyer

I'm slightly worried about the fact that sorting using cmp is not supported in Python 3. It seems the Python 3 way is to define either a sorting key (which isn't applicable here) or just use the rich comparison operators. This would be a reason to keep richcmp and cmp compatible.

comment:26 in reply to: ↑ 25 Changed 6 years ago by pbruin

Replying to jdemeyer:

I'm slightly worried about the fact that sorting using cmp is not supported in Python 3.

I find this a strange choice, but I guess we'll have to live with it...

It seems the Python 3 way is to define either a sorting key (which isn't applicable here) or just use the rich comparison operators.

There are various solutions:

  1. Python (3, at least) has a function cmp_to_key that converts a cmp function into a key function, so instead of cmp=cmpfunc you can specify key=cmp_to_key(cmpfunc). (The key returned by cmp_to_key(cmpfunc) for a given object obj is a new object wrapping obj with comparison operators implemented using cmpfunc.) This looks artificial and inefficient.
  2. When sorting polynomials (which are the only kind of non-ordered PARI objects that is currently sorted somewhere in the Sage library), one can specify key=lambda f: f.list().
  3. Any list of PARI objects can be sorted by converting it into a t_VEC and using one of several built-in PARI sorting functions.

I don't think the disappearance of the cmp keyword for sorting is serious enough that it forces us to keep __cmp__() and __richcmp__() consistent. (In principle, if we adopt 2. or 3. above, we could even think about removing gen._cmp_c_impl(), so the generic Element code automatically falls back to using rich comparison to implement __cmp__(). I'm not sure if this is a good idea, though.)

[Edit: in fact, Python 3 completely removes the built-in function `cmp()` and the special method `__cmp__()`.]

Last edited 6 years ago by pbruin (previous) (diff)

comment:27 follow-up: Changed 6 years ago by jdemeyer

Agreed with your last comment.

In the code for totally real fields, I wonder whether it is really needed to sort on "polynomials". Sorting by discriminant seems sensible (and that's also what is documented), but since there is no meaningful comparison for polynomials, why bother in the first place? So my proposal would be to replace

S.sort(cmp=lambda x, y: cmp(x[0], y[0]) or cmp(x[1], y[1]))

by

S.sort(key=lambda x: x[0])

comment:28 in reply to: ↑ 27 Changed 6 years ago by pbruin

Replying to jdemeyer:

In the code for totally real fields, I wonder whether it is really needed to sort on "polynomials".

Well, I just tried to change the existing code (which does the sorting in this way) as little as possible. Also, after #17026 (and to a certain extent already when using string comparison), comparison of polynomials isn't totally meaningless either: they are first sorted by degree and then lexicographically.

comment:29 Changed 6 years ago by jdemeyer

  • Status changed from needs_review to positive_review

I don't agree completely with everything you said, but it's fine for me.

comment:30 Changed 6 years ago by jdemeyer

  • Status changed from positive_review to needs_work

Doctest failure on 32-bit:

sage -t --warn-long 61.0 src/sage/libs/pari/gen.pyx
**********************************************************************
File "src/sage/libs/pari/gen.pyx", line 5961, in sage.libs.pari.gen.gen.elleta
Failed example:
    w1*eta2 - w2*eta1 - pari(2*pi*I) == 0
Expected:
    True
Got:  
    False
**********************************************************************

I think it's best to just remove that test.

comment:31 Changed 6 years ago by git

  • Commit changed from 0e0b4da393fa63c54e6f8c3ade0309ec198c2ffe to 8e5feb10a893e872f463a6e15aa64f17de203720

Branch pushed to git repo; I updated commit sha1. New commits:

8e5feb1Trac 16127: remove doctest

comment:32 Changed 6 years ago by pbruin

  • Status changed from needs_work to positive_review

comment:33 Changed 6 years ago by vbraun

  • Branch changed from u/pbruin/16127-pari_comparison to 8e5feb10a893e872f463a6e15aa64f17de203720
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.