#5098 closed enhancement (fixed)
Pollard rho algorithm for generic discrete logarithm
Reported by: | ylchapuy | Owned by: | tbd |
---|---|---|---|
Priority: | minor | Milestone: | sage-4.3 |
Component: | algebra | Keywords: | |
Cc: | cremona | Merged in: | sage-4.3.rc1 |
Authors: | Yann Laigle-Chapuy | Reviewers: | Martin Raum |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
First attempt to provide an algorithm using less memory than the "baby step giant step" algorithm for generic discrete logarithm.
This algorithm uses only (small) constant memory.
Proposed patch attached, with doctests.
Attachments (4)
Change History (41)
comment:1 Changed 13 years ago by
comment:2 follow-up: ↓ 3 Changed 13 years ago by
I don't really know what makes it slow. I suspect the use of "hash" is costly but I see no other way to partition generically a group in approximately equal size chunks.
comment:3 in reply to: ↑ 2 ; follow-up: ↓ 4 Changed 13 years ago by
Replying to ylchapuy:
I don't really know what makes it slow. I suspect the use of "hash" is costly but I see no other way to partition generically a group in approximately equal size chunks.
That is probably right. It might be worth putting in a comment (or a docstring sentence aimed at future developers) saying that the efficiency of the functions depends on having a fast hash.
comment:4 in reply to: ↑ 3 Changed 13 years ago by
Replying to cremona:
Replying to ylchapuy:
I don't really know what makes it slow. I suspect the use of "hash" is costly but I see no other way to partition generically a group in approximately equal size chunks.
That is probably right. It might be worth putting in a comment (or a docstring sentence aimed at future developers) saying that the efficiency of the functions depends on having a fast hash.
patch modified to adress the above comment. The hash function is now a parameter of the pollard_rho function to give future developers an easy way to customize it.
comment:5 Changed 13 years ago by
- Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm
I just wrote a review for this but trac refused to save it since another comment has been made in between. How can it do this!!!! Luckily I was able to use the "back" button to revover what I had written.
Review:
- Patch applied fine to 3.3.alpha2. Doctests in groups/generic pass.
- In the docstring for pollard_rho() I would move the explanation for the partition_size parameter to a NOTE later, not in the description of the paramater; and that description should say that the default is 16.
- There should be doctests in discrete_log() showing the use of the algorithm parameter.
I also think that it would be helpful to add to this ticket (and perhaps also in doctests) examples showing how the new algorithm is sometimes preferable!
None of these is at all serious, and think this additoin is basically very good.
comment:6 Changed 13 years ago by
- Summary changed from [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm
a patch is on the way. I also removed to debug printings I forgot and added my authorship.
Concerning comparison, the benefit of Pollard rho algorithm comes only with quite big examples and I don't think it's a good idea for doctests. Here's one example.
---------------------------------------------------------------------- | Sage Version 3.2.3, Release Date: 2009-01-05 | | Type notebook() for the GUI, and license() for information. | ---------------------------------------------------------------------- Loading Sage library. Current Mercurial branch is: yann sage: get_memory_usage() 113.9765625 sage: F.<a>=GF(2^118) sage: g=F.multiplicative_generator() sage: u=g^15726718062461913153073925660344578 sage: time discrete_log(u,g,algorithm="rho") CPU times: user 56.06 s, sys: 0.20 s, total: 56.26 s Wall time: 56.39 s 15726718062461913153073925660344578 sage: get_memory_usage() 116.03515625 sage: time discrete_log(u,g,algorithm="bsgs") CPU times: user 46.74 s, sys: 0.54 s, total: 47.29 s Wall time: 47.47 s 15726718062461913153073925660344578 sage: get_memory_usage() 392.89453125
actually, Pollard rho is not so slow :)
comment:7 Changed 13 years ago by
argh, sorry, trac-5098-review.2.patch are the same trac-5098-review.patch, only one of them needs to be applied
comment:8 Changed 13 years ago by
if someone reviews this patch, he might have a look at #5112 implementing Pollard lambda algorithm
comment:9 Changed 13 years ago by
- Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with review, a little more work] Pollard rho algorithm for generic discrete logarithm
I am having trouble applying the new patch. I have a 3.3.alpha2 build, so #5088 is already merged. I apply trac-5098.patch ok. Then I apply trac-5098-review.patch but get several hunks failing.
This might just be me being stupid, so it would be good if someone else tried.
While I am here though, you did not quite understand my third point in the original review: I did not mean that a new doctest should illustrate the greater efficiency (or not) of the new algorithm, but it should at least illustrate how to use the new parameter. e.g. take one of the existing examples and compute it bothe ways and check that they are equal:
sage: F.<a>=GF(2^63) sage: g=F.gen() sage: u=g**123456789 sage: discrete_log(u,g,algorithm='bsgs') == discrete_log(u,g,algorithm='rho') True
comment:10 Changed 13 years ago by
- Summary changed from [with patch, with review, a little more work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm
brand new standalone patch, based on a fresh alpha2 build. I hope everything is ok this time :)
Changed 13 years ago by
comment:11 Changed 13 years ago by
- Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm
Thanks -- I am quite happy now, and give this a positive review! Now I will go and look at the lambda patch too.
comment:12 Changed 13 years ago by
- Summary changed from [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review, needs doctest fix] Pollard rho algorithm for generic discrete logarithm
The latest patch causes the following doctest failure: From
File "/scratch/mabshoff/sage-3.3.alpha4/devel/sage/sage/rings/integer_mod.pyx", line 444: sage: sage.groups.generic.discrete_log(a,b) Expected: Traceback (most recent call last): ... ZeroDivisionError: Inverse does not exist.
we get to
ArithmeticError: multiplicative order of 4 not defined since it is not a unit modulo 100
William doesn't like either one, but he is about to comment himself.
Cheers,
Michael
comment:13 Changed 13 years ago by
William: Hmm. Better would be a statement that the discrete log doesn't exist. Or that it might but the algorithm implemented can't tell. I don't like either error. me: Well, I will comment on the ticket and then you can comment there on the record. William: The log does exist. so it would be much better to get a NotImplementedError when things go wrong.
Cheers,
Michael
comment:14 Changed 13 years ago by
Hi guys,
this ticket and #5112 is being held up by the above doctesting issues. The patches should go into 3.3 and rc0 will drop in roughly 48 hours, so there is time to sort this out. We could do two things:
- change the doctest to "fix" the problem for now and open a followup ticket to deal with the problem William pointed out. Since the situation doesn't seem to get worst that seems acceptable
- fix the problem William pointed out :)
Thoughts?
Cheers,
Michael
comment:15 Changed 13 years ago by
I don't have time to look into this and hope that ylchapuy will. I don't actually think it is the Pollard rho code at fault since by default generic discrete log still uses bsgs, so it is more likely to be the previous (and already merged) improvement to the discrete log which ylchapuy added a week or two ago.
If I finish all the "real" work I must do before tomorrow I'll come back to this but that is not very likely...
comment:16 Changed 13 years ago by
I would suggest to fix the doctest for the moment, and open a ticket for an enhancement. For the moment, we only allow discrete logarithms computations in groups with known order; I think it could be possible quite easily to make pollard rho algorithm to work for unknown order too but it needs some work. I definitely have no time today, and no clean install of recent sage to patch the doctest myself ontime...
Changed 13 years ago by
comment:17 Changed 13 years ago by
added a small patch to fix doctest, and return a NotImplementedError? explaining the problem which is that we can't get the order of the base (in the example it's not a unit).
comment:18 Changed 13 years ago by
- Summary changed from [with patch, with positive review, needs doctest fix] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm
comment:19 Changed 13 years ago by
- Milestone changed from sage-3.4.1 to sage-3.3
- Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm
the last patch only change the behaviour in case the order of the base can't be computed. If all the patches apply cleanly I think it doesn't need another review (correct me if I'm wrong).
comment:20 Changed 13 years ago by
- Summary changed from [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review, doctest fix needs review] Pollard rho algorithm for generic discrete logarithm
comment:21 Changed 13 years ago by
I was intending to review this (again), but there are a lot of patches and I do not know which ones to apply. All?
Regarding the last doctest issue, the function being tested states in its docstring that the arguments should be units, so surely it would be more sensible to test that and return an error if not, rather than go ahead and call any of the generic discrete log functions (which are placed in sage/GROUPS/generic.py for a reason: they require a group and were written that way, so inverses would exist). Just because 4 happens to be a power of 2 mod 100 does not mean that I should be forced to apologise for not returning an answer in that case. I know of no reason anyone might ever have for implementing discrete logs for non-units in a non-integral domain.
comment:22 follow-up: ↓ 23 Changed 13 years ago by
First: The patches to apply are the last two only; is there a way to remove the other ones?
Secondly: I do agree, discrete logarithm is intended only for groups. The changes in my last patch might still be useful. I test if we can compute the order of the base, because the algorithms we use for now need them. It's probably not a good idea though to treat in the same way a non unit, and an element of a group where order is not implemented...
What should I do to make those patches the right way ?
comment:23 in reply to: ↑ 22 Changed 13 years ago by
Replying to ylchapuy:
First: The patches to apply are the last two only; is there a way to remove the other ones?
Right now only trac admins can delete patches. I will delete all but the two last patches shortly.
Some times it is interesting for credit and historic purposes to keep old patches around, but in this case it should be no problem to delete them.
Cheers,
Michael
comment:24 Changed 13 years ago by
- Summary changed from [with patch, with positive review, doctest fix needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review, doctest fix needs work] Pollard rho algorithm for generic discrete logarithm
I have some issues with trac-5098-doctest.patch:
- There should be a doctest that illustrates every single exception that can be raised. For example this patch adds
raise NotImplementedError, "The current implementation of Pollard Rho algorithm needs to have (a multiple of) the order of the base"
on line 543, but there is no doctest that illustrates this.
- Line 758, The %s"%base business is bad.
raise NotImplementedError, "we currently need to know the order of the base %s"%(base) }}} I used to write a lot of exceptions like this, since for Magma we did that. But in Magma, they aren't exceptions that can be caught -- any single exception stops the program. In Sage, exceptions can be caught and occur as part of the normal flow of execution of a program. Once in 2006 there was a simple calculation David Harvey was doing, and he was very confused because it was taking several minutes to do something trivial. It turned out this was because in the course of the computation an exception was raised and caught (e.g., because of some coercion model stuff), and the exception was written like the one before -- it was a string that printed the element involved in the coercion, and the str(...) conversion of the element took *minutes*. That was what dominated the calculation. As a result I realized that I had completely messed up in using that pattern for implementing exceptions in Sage, and had to go through the whole codebase and rewrite all such exceptions. Just to hammer this home, consider {{{ sage: a = Mod(16, 100); b = Mod(-4,10^10^7) sage: try: w = sage.groups.generic.discrete_log(a,b) except: pass [hang forever!] }}} and if it ever did come back, imagine what that exception would look like!
comment:25 Changed 13 years ago by
Here's some irc chatter about the "%s"%base exception business. There are perhaps thousands (?) of instances of this in Sage now.
23:46 < wstein> They can be "silent killers" :-) 23:46 < mabs> wstein: can an opinion on the broken combinat pickles? 23:47 < mabs> can -> got 23:48 < wstein> wstein@sage:/space/wstein/sage-3.3.rc0$ ./sage -grep "raise" "%s" |wc -l 23:48 < wstein> 6750 23:48 < mabs> *OUCH* 23:48 < wstein> It's not always bad. 23:48 < wstein> wstein@sage:/space/wstein/sage-3.3.rc0$ ./sage -grep "raise" "%s" |wc -l 23:48 < wstein> Oops. 23:49 < wstein> sage: search_src("raise","%s") # also shows them. 23:49 < wstein> It's only the ones that have elements that could easily be large in the exception that 23:49 < wstein> are a danger. 23:49 < wstein> But a quick glance suggests there are probably thousands of these still. 23:50 < mabs> Yes, I thought the comment was good since I never knew about that problem. 23:50 < wstein> Especially in the combinat and coding theory code... 23:50 < mabs> :) 23:50 < wstein> One can just do search_src("raise","%s") 23:50 < wstein> look to see which exceptions are funny, and quickly craft examples where 23:50 < wstein> things seem to hang for stupid reasons instead of raising an exception. 23:51 < wstein> Where hang = create a base-10 string rep for something in memory, then try to display a bazillion characters to screen.
comment:26 Changed 13 years ago by
A comment by cwitty about my suggestion that every exception should be doctested:
00:07 < cwitty> wstein: re your comment on #5098: I've added exceptions to Sage that I have no idea how to doctest. 00:07 < wstein> cwitty -- for that patch I think they would all be easy. 00:07 < wstein> I had that in mind. 00:07 < cwitty> I sort of suspect that hitting them may require working in a number field of degree several billion, but I'm not positive. 00:07 < wstein> cwitty -- at least you can give a reason for not doctesting the ones you don't test. 00:08 < wstein> I don't mean for that to be some absolute rule. 00:08 < wstein> Just a general guideline. 00:08 < wstein> But you have a good point. 00:08 < wstein> I'll paste this to the ticket. 00:09 < wstein> The last thing I wrote in sage was something involving gap.py and the "This should never happen" exception. 00:09 < wstein> That can't be doctested :-) 00:09 < wstein> So you're right.
comment:27 Changed 13 years ago by
- Milestone changed from sage-3.3 to sage-3.4.1
I am cleaning up the 3.3 milestone. If a patch with positive review is put up to this ticket on time it might make it into 3.3.
Cheers,
Michael
comment:28 Changed 13 years ago by
- Summary changed from [with patch, with positive review, doctest fix needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs work] Pollard rho algorithm for generic discrete logarithm
Any movement on this?
Cheers,
Michael
comment:29 Changed 13 years ago by
- Cc cremona added
Let's try another way. The last patch I uploaded is to be applied instead of the previous ones. It only provide a Pollard rho algorithm for discrete log, without plugin it in the existing discrete log. We can address the other problems about non unit elements for example in another ticket.
comment:30 Changed 13 years ago by
- Summary changed from [with patch, needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm
This looks ok to me. It applies fine to 4.1.alpha3 and tests pass. It might be better to call the function discrete_log_rho instead of pollard_rho_log as that would make it easier to find via tab completion on discrete_*.
There should also be examples (doctests) to illustrate the use of the parameters partition_size, memory_size, hash_function, check_prime. For example, one where using a non-default value gives better (i.e. faster) results. If it is not practical to put those into doctests (for example, if the improvement is only noticeable with very large examples), tehre should still be small examples to illustrate these parameters, with a comment about them being too small to notice, just to show how they are used.
And the technical parameters partition_size, memory_size should be listed in the INPUT block rather than in a separate note.
I have left this as "needs work" but those things are very small and a positive review is not far away!
comment:31 Changed 13 years ago by
- Description modified (diff)
- Summary changed from [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm
comment:32 Changed 13 years ago by
I renamed the function to "discrete_log_rho", removed the unnecessary parameters (partition_size and memory_size don't change that much the behavior), added an example of user set hash function, and forgot to add an exemple for check_prime... I will fix this in 2 minutes!
comment:33 Changed 13 years ago by
I finally choose to remove the parameter check_prime because the complexity of Pollard Rho algorithm is a lot bigger than primality testing.
comment:34 Changed 13 years ago by
- Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to Pollard rho algorithm for generic discrete logarithm
comment:35 Changed 13 years ago by
- Report Upstream set to N/A
- Reviewers set to Martin Raum
- Status changed from needs_review to positive_review
This gets a positive review from me with the small changes in the review patch. All tests pass. For transparency I just added one tiny comment on the fall back and completed the type of a raised error.
Changed 13 years ago by
comment:36 Changed 13 years ago by
- Merged in set to sage-4.3.rc1
- Resolution set to fixed
- Status changed from positive_review to closed
comment:37 Changed 13 years ago by
- Milestone changed from sage-4.3.1 to sage-4.3
Do you know what makes this slower?
I'm not so familiar with this version of Pollard, but this looks well-written. I'll test it once my build of 3.3.alpha2 has finished.