Opened 5 years ago
Closed 5 years ago
#23352 closed defect (fixed)
Fix random matrix_gfpn_dense
Reported by: | Simon King | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.1 |
Component: | packages: optional | Keywords: | meataxe random, days88, IMA coding sprints |
Cc: | Merged in: | ||
Authors: | Simon King | Reviewers: | Jeroen Demeyer, Travis Scrimshaw |
Report Upstream: | N/A | Work issues: | |
Branch: | 39431d7 (Commits, GitHub, GitLab) | Commit: | 39431d7e98a19de1c3b4f0af97440922eed10f44 |
Dependencies: | Stopgaps: |
Description (last modified by )
I just noticed that, when meataxe is installed and its Sage wrapper is used for matrices over GF(p^n)
with p
odd and p^n<255
, then at least in some cases the random matrices aren't sufficiently random:
sage: random_matrix(GF(9,'x'), 5) [ x 2*x + 1 x + 2 x + 2 0] [ x + 1 x x + 2 x + 1 0] [ x + 1 2 x + 1 2 0] [ x 2*x x x + 1 0] [2*x + 1 0 2*x + 1 2*x + 2 0] sage: random_matrix(GF(9,'x'), 5) [ 2 x + 1 x + 1 2*x + 2 0] [ x 2*x + 2 x + 2 2*x 0] [ x + 2 2*x 0 2*x + 2 0] [ 2*x 2 2 1 0] [ x + 2 2 2*x 1 0] sage: random_matrix(GF(9,'x'), 5) [ 2 2*x + 1 x 2*x + 1 0] [2*x + 2 0 2*x + 2 x 0] [ 2 x 2*x + 1 x + 2 0] [ 0 2*x + 1 x x 0] [ 1 1 2 x + 1 0] sage: random_matrix(GF(9,'x'), 5) [ x x + 1 0 0 0] [ 1 x + 1 0 2*x + 1 0] [ 2 0 x x + 2 0] [ 2 0 x + 1 x + 2 0] [ x x + 1 2*x 1 0]
So, the last column always is zero, which shouldn't be the case.
Change History (35)
comment:1 Changed 5 years ago by
Branch: | → u/SimonKing/fix_random_matrix_gfpn_dense |
---|
comment:2 follow-up: 20 Changed 5 years ago by
Authors: | → Simon King |
---|---|
Commit: | → 3297b211937dd687bd8b0fbf267ac7b52e171022 |
Status: | new → needs_review |
comment:3 Changed 5 years ago by
Dependencies: | → #21437 |
---|---|
Status: | needs_review → needs_work |
comment:4 Changed 5 years ago by
Commit: | 3297b211937dd687bd8b0fbf267ac7b52e171022 → 6e332268227e5f8abedce6259bb32f8765108df0 |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:
291f4dc | Some minor tweaks for performance
|
03ce5c0 | Fix bad INPUT::/OUTPUT:: blocks
|
5ed2269 | Fix one doctest
|
d4dc794 | Merge branch 'develop' into t/21437/fix-mtx-matrices
|
413e4a9 | Remove deprecated Cython syntax, use check_malloc
|
d943f93 | Remove rawMatrix function
|
c929fcc | Remove unused MTX* environment variables
|
b70bc5f | Change some str into bytes
|
04e7986 | Improve use of sig_on/off/check
|
6e33226 | Properly fill the last few columns of random meataxe matrices
|
comment:5 Changed 5 years ago by
Status: | needs_work → needs_review |
---|
Don't be scared. All but the last commit belong to #21437. Yes, I still believe that git commits *belong* to a ticket...
Anyway, I think it is logical to put a ticket that fixes random meataxe matrices on top of a ticket that fixes other aspects of meataxe matrices.
comment:6 follow-up: 11 Changed 5 years ago by
Description: | modified (diff) |
---|---|
Summary: | Fix random matrix_gfpn_dense → Some bug fixes for matrix_gfpn_dense |
Jeroen suggested on #21437 to distribute different aspects of the work in a new way. I think this ticket fits best for the topic "bug fixes". So, I'll rename the ticket a couple of commits. The tests should pass with the branch that I'll force-push in a minute.
comment:7 Changed 5 years ago by
Commit: | 6e332268227e5f8abedce6259bb32f8765108df0 → 225afe1afa8dfa4ab700d3d91d3dca956e2b6ce0 |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
b1aef56 | Add MeatAxe patch that ensures correct value of FfCurrentRowSizeIo
|
a44a58a | Fix reading mtx-matrix from non-existent file
|
91ac93f | Make pickling machine independent
|
225afe1 | Properly fill the last few columns of random meataxe matrices
|
comment:9 Changed 5 years ago by
Commit: | 225afe1afa8dfa4ab700d3d91d3dca956e2b6ce0 → bd1f3f13c7a5a5a2af51dd2b19f976515b0f3fd9 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
bd1f3f1 | Fix ticket number in deprecation warning
|
comment:10 Changed 5 years ago by
Dependencies: | #21437 |
---|
comment:11 follow-up: 12 Changed 5 years ago by
comment:12 Changed 5 years ago by
Replying to jdemeyer:
Replying to SimonKing:
Jeroen suggested on #21437 to distribute different aspects of the work in a new way.
That's not quite what I said. I proposed to split up #21437. Adding more stuff to existing tickets (like this one) is actually the opposite of what I intented.
Do you prefer that I rebase again and split this ticket into three: One for the new MeatAxe patch (that avoids frequent calls to a C function and thus gives a speedup), one for the pickling and one for random matrices?
comment:14 follow-up: 16 Changed 5 years ago by
Replying to jdemeyer:
Yes, that would be a good idea.
The first one is #23410.
The patches to be distributed are, as much as I see, independent. Would it be better to let the new tickets build upon each other (i.e., the git history of one ticket is included in the git history of the other tickets), or should they stay separate (i.e., each ticket provides a patch, and then they are merged only when needed)?
comment:15 Changed 5 years ago by
Description: | modified (diff) |
---|---|
Status: | needs_review → needs_work |
Summary: | Some bug fixes for matrix_gfpn_dense → Fix random matrix_gfpn_dense |
comment:16 Changed 5 years ago by
Replying to SimonKing:
The patches to be distributed are, as much as I see, independent. Would it be better to let the new tickets build upon each other (i.e., the git history of one ticket is included in the git history of the other tickets), or should they stay separate (i.e., each ticket provides a patch, and then they are merged only when needed)?
The ticket that is about sanitizing sig_on/off/check and about proper Cython code style has to depend on the other tickets. But I suggest to keep the other tickets separate as much as possible, and will soon push branches accordingly.
comment:17 Changed 5 years ago by
Commit: | bd1f3f13c7a5a5a2af51dd2b19f976515b0f3fd9 → bbd6a61761e4c53bd90cef5d9685648dc9012879 |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
bbd6a61 | Properly fill last column of random matrix_gfpn_dense
|
comment:18 Changed 5 years ago by
Status: | needs_work → needs_review |
---|
comment:19 Changed 5 years ago by
In the splitup process, I have already updated #23410, #23411 and #23352, making them small and independent.
About the other parts, I plan this:
- We have #21437 for proper initialisation. I plan to try to make this stand-alone as well, but only if it cleanly merges with the afore-mentioned tickets. If it doesn't merge, I'll put it on top of the other tickets.
- We have #23400 for code cleanup. The cleanup will also concern parts of code that are touched by the other tickets. So, it will use them as dependency.
- We have #23399 for some additions (new methods). I very much doubt that it can stay separate, but I'll try. Very likely #23399 will be on top of all other tickets.
Jeroen, do you support my plan?
comment:20 follow-up: 21 Changed 5 years ago by
Replying to SimonKing:
The matrix entries are densely packed in meataxe. Hence, in some situations, the last byte of a row in memory is only partly filled with data. And in fact the HIGHEST bits of the last byte have to be filled.
The problem was: In the old code, the LOWEST bits of the last byte have been filled.
If I'm reading the documentation at http://www.math.rwth-aachen.de/~MTX/doc24/group__ff.html correctly, it should be the LOWEST bytes:
Packing of field elements is used for field orders less than 17. Let q be the field order and m the largest natural number with qm≤256. Then, m elements k0,...kn-1 are packed into one byte as k0+k1q+k2q2+...
comment:21 Changed 5 years ago by
Replying to jdemeyer:
If I'm reading the documentation at http://www.math.rwth-aachen.de/~MTX/doc24/group__ff.html correctly, it should be the LOWEST bytes:
Packing of field elements is used for field orders less than 17. Let q be the field order and m the largest natural number with qm≤256. Then, m elements k0,...kn-1 are packed into one byte as k0+k1q+k2q2+...
Yes, that's what I had in mind, too, and is why I wrote the old code in the way I did. Anyway, I am now using meataxe's FfInsert
, which should do the right thing.
Let's test: If I would set the last byte inconsistent with meataxe's conventions, then multiplying a matrix with the transposed matrix (these are all using meataxe functions) would give a wrong result.
sage: MS = MatrixSpace(GF(9,'x'),1,5) sage: def sanity_check(M): ....: assert (M*M.transpose())[0,0] == add(e*e for e in M.list()) ....: sage: while(1): ....: sanity_check(MS.random_element()) ....:
works without error. Note that in the setting above, the new code is called. Thus, I believe my code is correct and indeed the meataxe documentation provides a wrong statement.
comment:22 Changed 5 years ago by
While fixing the randomize
method, I'd like to speed it up, by cdefining the randint function.
comment:23 Changed 5 years ago by
Commit: | bbd6a61761e4c53bd90cef5d9685648dc9012879 → 71abd443f2690b8fa20dfc3d177dec03155a7264 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
71abd44 | Speedup meataxe matrix randomisation
|
comment:24 Changed 5 years ago by
Before the latest commit I got
sage: MS = MatrixSpace(GF(9,'x'),5,17) sage: %timeit MS.random_element() The slowest run took 96.98 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 5.37 µs per loop
With the latest commit, I get
sage: MS = MatrixSpace(GF(9,'x'),5,17) sage: %timeit MS.random_element() The slowest run took 142.72 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 3.4 µs per loop
I think 35% gain is worth it. Also, since I'm touching the code anyway, I change to using range() in Cython.
comment:25 Changed 5 years ago by
PS: Since I am changing the code anyway, I guess I could also change the use of sig_on/off in that part of the code...
comment:26 Changed 5 years ago by
Commit: | 71abd443f2690b8fa20dfc3d177dec03155a7264 → e68140e57f5eb45d7482bd50533ce2d58f99fad6 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
e68140e | Replace sig_on/off by sig_check in meataxe matrix randomisation
|
comment:27 follow-up: 28 Changed 5 years ago by
After replacing sig_on/off by sig_check, I get
sage: MS = MatrixSpace(GF(9,'x'),5,17) sage: %timeit MS.random_element() The slowest run took 146.35 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 3.05 µs per loop
That's further 10% on top of the previous 35%.
That's slightly surprising to me. Is sig_on/off really resulting in a massive slow-down? Then why is it used? After all, another disadvantage is that it doesn't play well with Python code.
comment:28 Changed 5 years ago by
Replying to SimonKing:
Is sig_on/off really resulting in a massive slow-down?
It shouldn't result in a massive slowdown. However, if you are using it in a very tight loop, the time taken by sig_on()
is noticable.
comment:29 Changed 5 years ago by
I should also mention that sig_on()
is essentially just a call to sigsetjmp()
. This is a function implemented by libc
from the OS. So the time taken by sig_on()
could depend significantly on the system.
comment:31 follow-up: 33 Changed 5 years ago by
Status: | needs_review → needs_work |
---|
The branch does not apply on latest beta.
comment:32 Changed 5 years ago by
Commit: | e68140e57f5eb45d7482bd50533ce2d58f99fad6 → 39431d7e98a19de1c3b4f0af97440922eed10f44 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
39431d7 | Merge branch 'develop' into t/23352/fix_random_matrix_gfpn_dense
|
comment:33 Changed 5 years ago by
Status: | needs_work → needs_review |
---|
comment:34 Changed 5 years ago by
Keywords: | days88 IMA coding sprints added |
---|---|
Milestone: | sage-8.0 → sage-8.1 |
Reviewers: | → Jeroen Demeyer, Travis Scrimshaw |
Status: | needs_review → positive_review |
Let it be so.
comment:35 Changed 5 years ago by
Branch: | u/SimonKing/fix_random_matrix_gfpn_dense → 39431d7e98a19de1c3b4f0af97440922eed10f44 |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
The matrix entries are densely packed in meataxe. Hence, in some situations, the last byte of a row in memory is only partly filled with data. And in fact the HIGHEST bits of the last byte have to be filled.
The problem was: In the old code, the LOWEST bits of the last byte have been filled.
I fixed it by explicitly assigning random numbers to the last few entries (instead of trying to fill a whole block of entries).
New commits:
Properly fill the last few columns of random meataxe matrices