Opened 5 years ago

Closed 5 years ago

#23925 closed defect (fixed)

Bulk fix of signal handling in symbolics

Reported by: Ralf Stephan Owned by:
Priority: critical Milestone: sage-8.1
Component: cython Keywords:
Cc: Merged in:
Authors: Ralf Stephan Reviewers: Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: b4a620d (Commits, GitHub, GitLab) Commit: b4a620d01d122b254b119dd4fa0b9573c598f8f0
Dependencies: Stopgaps:

Status badges

Description (last modified by Jeroen Demeyer)

In Sage-8.1.beta6 this crashes:

sage: (x/(x^1431655765-1)).normalize()
Exception (FLINT memory_manager). Unable to allocate memory.
------------------------------------------------------------------------
/usr/local/src/sage-config/local/lib/python2.7/site-packages/cysignals/signals.so(+0x5db8)[0x7fb72168edb8]
/usr/local/src/sage-config/local/lib/python2.7/site-packages/cysignals/signals.so(+0x5e25)[0x7fb72168ee25]
/usr/local/src/sage-config/local/lib/python2.7/site-packages/cysignals/signals.so(+0x8aba)[0x7fb721691aba]
/lib64/libpthread.so.0(+0x10860)[0x7fb729d0a860]
/lib64/libc.so.6(gsignal+0x37)[0x7fb729299fa7]
/lib64/libc.so.6(abort+0x16a)[0x7fb72929b30a]
/usr/local/src/sage-config/local/lib/libflint.so.13(+0x4d104)[0x7fb7148a1104]
/usr/local/src/sage-config/local/lib/libflint.so.13(+0x534b8)[0x7fb7148a74b8]
/usr/local/src/sage-config/local/lib/libflint.so.13(fmpz_poly_init2+0x3d)[0x7fb7148f435d]
/usr/local/src/sage-config/local/lib/libfactory-4.1.0.so(_Z24convertFacCF2Fmpz_poly_tP16fmpz_poly_structRK13CanonicalForm+0x3a)[0x7fb68fe03b7a]
/usr/local/src/sage-config/local/lib/libfactory-4.1.0.so(_Z17gcd_univar_flint0RK13CanonicalFormS1_+0x37)[0x7fb68fd4f817]
/usr/local/src/sage-config/local/lib/libfactory-4.1.0.so(_Z11subResGCD_0RK13CanonicalFormS1_+0x611)[0x7fb68fd4f5d1]
/usr/local/src/sage-config/local/lib/libfactory-4.1.0.so(_Z8gcd_polyRK13CanonicalFormS1_+0x128)[0x7fb68fcfaaa8]
/usr/local/src/sage-config/local/lib/libfactory-4.1.0.so(_Z3gcdRK13CanonicalFormS1_+0x3c4)[0x7fb68fcfb0d4]
/usr/local/src/sage-config/local/lib/libpynac.so.14(_ZN5GiNaC7gcdpolyERKNS_2exES2_PS0_S3_b+0x259)[0x7fb68915f159]
/usr/local/src/sage-config/local/lib/libpynac.so.14(+0x120312)[0x7fb689176312]
/usr/local/src/sage-config/local/lib/libpynac.so.14(_ZNK5GiNaC3mul6normalERSt3mapINS_2exES2_NS_10ex_is_lessESaISt4pairIKS2_S2_EEES9_ij+0x417)[0x7fb689177587]
/usr/local/src/sage-config/local/lib/libpynac.so.14(_ZNK5GiNaC2ex6normalEibb+0xd6)[0x7fb689171446]
/usr/local/src/sage-config/local/lib/python2.7/site-packages/sage/symbolic/expression.so(+0xd4daf)[0x7fb688da5daf]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x84cc)[0x7fb72a028cdc]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x19)[0x7fb72a02a609]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x82cc)[0x7fb72a028adc]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7f9f)[0x7fb72a0287af]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7f9f)[0x7fb72a0287af]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7f9f)[0x7fb72a0287af]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7f9f)[0x7fb72a0287af]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7f9f)[0x7fb72a0287af]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7f9f)[0x7fb72a0287af]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x830)[0x7fb72a02a4e0]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x19)[0x7fb72a02a609]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyRun_FileExFlags+0x8a)[0x7fb72a04f4da]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0xe7)[0x7fb72a050a17]
/usr/local/src/sage-config/local/lib/libpython2.7.so.1.0(Py_Main+0xc7a)[0x7fb72a06760a]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fb7292876a5]
python(_start+0x29)[0x400769]
------------------------------------------------------------------------
Unhandled SIGABRT: An abort() occurred.
This probably occurred because a *compiled* module has a bug
in it and is not properly wrapped with sig_on(), sig_off().
Python will now terminate.
------------------------------------------------------------------------

while the underlying operation only aborts with RuntimeError:

gcd(x,x^1431655765-1)
Exception (FLINT memory_manager). Unable to allocate memory.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-468b83720c84> in <module>()
      1 
----> 2 gcd(x,x**Integer(1431655765)-Integer(1))

/usr/local/src/sage-config/local/lib/python2.7/site-packages/sage/arith/misc.pyc in gcd(a, b, **kwargs)
   1576     if b is not None:
   1577         try:
-> 1578             return a.gcd(b, **kwargs)
   1579         except (AttributeError, TypeError):
   1580             pass

/usr/local/src/sage-config/src/sage/symbolic/expression.pyx in sage.symbolic.expression.Expression.gcd (build/cythonized/sage/symbolic/expression.cpp:41191)()
   6888         cdef Expression r = self.coerce_in(b)
   6889         cdef GEx x
-> 6890         sig_on()
   6891         try:
   6892             x = g_gcd(self._gobj, r._gobj)

RuntimeError: Aborted

The normalize member function is one of many where signals from underlying layers are not caught. Let's do a sweeping fix of them in this ticket.

Change History (42)

comment:1 Changed 5 years ago by Jeroen Demeyer

Component: symbolicscython
Description: modified (diff)

In the future, please remember to POST TRACEBACKS such that people reading this ticket can actually understand what it is about.

comment:2 Changed 5 years ago by Jeroen Demeyer

Description: modified (diff)

comment:3 Changed 5 years ago by Ralf Stephan

Branch: u/rws/bulk_fix_of_signal_handling_in_symbolics

comment:4 Changed 5 years ago by Ralf Stephan

Authors: Ralf Stephan
Commit: 8238d61741a6d2f85f2c5eacf77c8b6952437120
Status: newneeds_review

New commits:

8238d6123925: Bulk fix of signal handling in symbolics

comment:5 Changed 5 years ago by Jeroen Demeyer

Status: needs_reviewneeds_work

Don't put Python code inside sig_on()/sig_off() blocks.

What you are doing is essentially

sig_on()
try:
    initialize()
    res = c_lib_call()
    return process(res)
finally:
    sig_off

Instead, it should be

initialize()
sig_on()
res = c_lib_call()
sig_off()
return process(res)

comment:6 Changed 5 years ago by Jeroen Demeyer

...or, if C++ exceptions can occur, it should be:

initialize()
sig_on()
try:
    res = c_lib_call()
finally:
    sig_off()
return process(res)

comment:7 Changed 5 years ago by Jeroen Demeyer

After thinking about it a bit more... pynac is partially implemented in Python, right? So maybe these should be added upstream in Pynac, not in Sage.

comment:8 Changed 5 years ago by Ralf Stephan

Pynac is completely C++ but calls some Python C API functions. Is there a C function sig_on?

comment:9 Changed 5 years ago by Jeroen Demeyer

In theory, cysignals can be interfaced via C/C++ but I don't think that this has been done already.

comment:10 Changed 5 years ago by Ralf Stephan

I don't think it belongs in Pynac. Several functions are called that are used by Pynac itself. Rather, if you insist on adding a code layer, it should go in the Pynac interface, e.g. libs/pynac/wrap.h.

comment:11 in reply to:  10 ; Changed 5 years ago by Jeroen Demeyer

Replying to rws:

if you insist on adding a code layer

I'm not. I'm saying to put sig_on() at the correct layer (which is whatever layer is calling C library functions), not to add a new layer.

comment:12 Changed 5 years ago by Ralf Stephan

If it goes to correctness:

"...it is possible to declare a function as potentially raising an C++ exception and converting it into a Python exception. For example,

cdef extern from "some_file.h":
    cdef int foo() except +

This will translate try and the C++ error into an appropriate Python exception (currently an IndexError? on std::out_of_range and a RuntimeError? otherwise (preserving the what() message)."

See https://github.com/cython/cython/wiki/WrappingCPlusPlus#advanced-c-features

The relevant methods all have their "except +" in libs/pynac/pynac.pxd so why do we need the sig_on/off at all?

comment:13 in reply to:  12 ; Changed 5 years ago by Jeroen Demeyer

Replying to rws:

why do we need the sig_on/off at all?

  1. To allow interrupting a long-running operation.
  1. For error handling from low-level C libraries like FLINT (which is what this ticket is about).

None of these reasons have anything to do with C++ exceptions, so I wonder why you brought that up.

What is true is that cysignals was not written with C++ in mind. Maybe one could reimplement parts of cysignals to be more C++-friendly, for example by raising C++ exceptions (which are then converted to Python exceptions by Cython).

comment:14 in reply to:  13 ; Changed 5 years ago by Ralf Stephan

Replying to jdemeyer:

None of these reasons have anything to do with C++ exceptions, so I wonder why you brought that up.

It's the same layer and doing it automatically (maybe if a similar directive to except + is given) prevents tickets such as this.

comment:15 in reply to:  14 ; Changed 5 years ago by Jeroen Demeyer

Replying to rws:

It's the same layer and doing it automatically prevents tickets such as this.

Doing what automatically?

comment:16 in reply to:  15 Changed 5 years ago by Ralf Stephan

More exactly. There are two possibilities, a signal handler is set when leaving Python (like sig_on/off) or one is set when leaving C++ and calling FLINT. The latter would be in Singular not Pynac however. With this I tend to call it a Singular/libfactory bug.

comment:17 Changed 5 years ago by Jeroen Demeyer

It should be set when leaving Python. However, Pynac is a difficult case because it mixes C++ and Python. So there is no clear Python/C++ boundary.

comment:18 Changed 5 years ago by Ralf Stephan

In any case cysignals has a tested implementation. Is it possible at all to reuse it from C++?

comment:19 Changed 5 years ago by Ralf Stephan

Ah I see, src/sage/rings/padics/transcendantal.c (sic) uses cysignals.

comment:20 Changed 5 years ago by Ralf Stephan

Branch: u/rws/bulk_fix_of_signal_handling_in_symbolicsu/rws/23925-test

comment:21 Changed 5 years ago by Ralf Stephan

Commit: 8238d61741a6d2f85f2c5eacf77c8b69524371200d7e908eec43c7cdb7462b39c2241fe73a734b26
Status: needs_workneeds_info

This test branch implements calling of sig_on/off from the Pynac interface, just for the ticket case. While C++ runtime errors are caught, the ticket case's SIGABRT is not (while it is caught in the first uploaded branch where the sig_on/off is called from Python).

How to proceed?


New commits:

0d7e908test

comment:22 Changed 5 years ago by Ralf Stephan

Exception (FLINT memory_manager). Unable to allocate memory.
------------------------------------------------------------------------
/home/ralf/sage/local/lib/python2.7/site-packages/cysignals/signals.so(+0x5d55)[0x7f7af2ecfd55]
/home/ralf/sage/local/lib/python2.7/site-packages/cysignals/signals.so(+0x5da5)[0x7f7af2ecfda5]
/home/ralf/sage/local/lib/python2.7/site-packages/cysignals/signals.so(+0x8c2a)[0x7f7af2ed2c2a]
/lib64/libpthread.so.0(+0x10b10)[0x7f7afaebcb10]
/lib64/libc.so.6(gsignal+0x37)[0x7f7afa4398d7]
/lib64/libc.so.6(abort+0x13a)[0x7f7afa43acaa]
/home/ralf/sage/local/lib/libflint.so.13(+0x59194)[0x7f7ae9ffe194]
/home/ralf/sage/local/lib/libflint.so.13(+0x5f468)[0x7f7aea004468]
/home/ralf/sage/local/lib/libflint.so.13(_fmpz_poly_gcd_heuristic+0x6aa)[0x7f7aea06a29a]

comment:23 Changed 5 years ago by Jeroen Demeyer

The reason for your problem is that Cython supports either checking for C++ exceptions (except +) or Python exceptions (except something_else), but not both at the same time.

But as I already said: no additional layer is needed, you can just add the sig_on/sig_off in the Cython file directly.

comment:24 in reply to:  11 ; Changed 5 years ago by Ralf Stephan

Replying to jdemeyer:

Replying to rws:

if you insist on adding a code layer

I'm not. I'm saying to put sig_on() at the correct layer (which is whatever layer is calling C library functions), not to add a new layer.

The layer that is calling C library functions is the C++ layer (either expression.pyx translated to C++ or Singular or Pynac, all of it happens). You just said to insert the insert sig_on() in the Cython file. You mean expression.pyx, that is, just like I did in the first branch, right?

comment:25 in reply to:  24 Changed 5 years ago by Jeroen Demeyer

Replying to rws:

The layer that is calling C library functions is the C++ layer

Sorry, when I said "C" I meant "C or C++". For cysignals, C or C++ is the same thing. It's just the separation Python <-> C/C++ that matters.

You mean expression.pyx, that is, just like I did in the first branch, right?

Yes, like in your first branch. However, your branch had many wrong usages like

sig_on()
Python code
Pynac call
Python code
sig_off()

That is certainly wrong. It should be

Python code
sig_on()
Pynac call
sig_off()
Python code

comment:26 Changed 5 years ago by Jeroen Demeyer

The code calling normal() should look like

cdef GEx result
sig_on()
try:
    result = self._gobj.normal(0, False, True)
finally:
    sig_off()
return new_Expression_from_GEx(self._parent, result)

Note that _gobj is a cdef attribute, so self._gobj involves no Python code. Therefore, it is safe to put inside the sig_on() block. On the other hand, new_Expression_from_GEx() is unsafe since it invokes Python code like cls.__new__(cls) for example.

comment:27 Changed 5 years ago by Ralf Stephan

Branch: u/rws/23925-testu/rws/23925-1

comment:28 Changed 5 years ago by Ralf Stephan

Commit: 0d7e908eec43c7cdb7462b39c2241fe73a734b26d3679562368021af54af8e3cff8673b3b0c0012e
Status: needs_infoneeds_review

New commits:

d36795623925: Bulk fix of signal handling in symbolics

comment:29 Changed 5 years ago by Jeroen Demeyer

The evalf() code is more complicated. Maybe don't add the sig_on() there for now.

And it would be more efficient to replace

    def _rel_equal2(self, right):

by

    cdef bint _rel_equal2(Expression self, Expression other) except -1:

This way, you don't need the cdef Expression l, r variables.

comment:30 Changed 5 years ago by Jeroen Demeyer

Status: needs_reviewneeds_work

comment:31 Changed 5 years ago by git

Commit: d3679562368021af54af8e3cff8673b3b0c0012e732c03f66bbe3cffea7600a585f2bb7cab1ec533

Branch pushed to git repo; I updated commit sha1. New commits:

732c03f23925: address reviewer's comments

comment:32 Changed 5 years ago by Ralf Stephan

How come I get SCORE src/sage/symbolic/expression.pyx: 100.0% (232 of 232) from sage -coverage but the patchbot plugin fails? I'm now filling this file with useless characters just for the sake...

comment:33 Changed 5 years ago by git

Commit: 732c03f66bbe3cffea7600a585f2bb7cab1ec5333fba90f76e57a7bda6b45e3305283f70df135c15

Branch pushed to git repo; I updated commit sha1. New commits:

3fba90f23925: please the coverage plugin

comment:34 Changed 5 years ago by Ralf Stephan

Status: needs_workneeds_review

comment:35 in reply to:  32 Changed 5 years ago by Jeroen Demeyer

Replying to rws:

I'm now filling this file with useless characters just for the sake...

Please don't! Use your own judgement. You are smarter than the coverage plugin.

comment:36 Changed 5 years ago by Ralf Stephan

I'll leave it as is now; declaring something a helper function is not so bad. Patchbot is go. Please review.

comment:37 Changed 5 years ago by Jeroen Demeyer

Status: needs_reviewneeds_work

Merge conflict

comment:38 Changed 5 years ago by git

Commit: 3fba90f76e57a7bda6b45e3305283f70df135c15b4a620d01d122b254b119dd4fa0b9573c598f8f0

Branch pushed to git repo; I updated commit sha1. New commits:

b4a620dMerge branch 'develop' into t/23925/23925-1

comment:39 Changed 5 years ago by Ralf Stephan

Status: needs_workneeds_review

comment:40 Changed 5 years ago by Jeroen Demeyer

Reviewers: Jeroen Demeyer

Good for me if this passes patchbot testing.

comment:41 Changed 5 years ago by Ralf Stephan

Status: needs_reviewpositive_review

Passes patchbots, thanks.

comment:42 Changed 5 years ago by Volker Braun

Branch: u/rws/23925-1b4a620d01d122b254b119dd4fa0b9573c598f8f0
Resolution: fixed
Status: positive_reviewclosed
Note: See TracTickets for help on using tickets.