Opened 13 years ago

Closed 13 years ago

Last modified 13 years ago

#5098 closed enhancement (fixed)

Pollard rho algorithm for generic discrete logarithm

Reported by: ylchapuy Owned by: tbd
Priority: minor Milestone: sage-4.3
Component: algebra Keywords:
Cc: cremona Merged in: sage-4.3.rc1
Authors: Yann Laigle-Chapuy Reviewers: Martin Raum
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by ylchapuy)

First attempt to provide an algorithm using less memory than the "baby step giant step" algorithm for generic discrete logarithm.

This algorithm uses only (small) constant memory.

Proposed patch attached, with doctests.

Attachments (4)

trac-5098-alpha2based.patch (8.8 KB) - added by ylchapuy 13 years ago.
trac-5098-doctest.patch (3.4 KB) - added by ylchapuy 13 years ago.
trac_5098_standalone.patch (5.9 KB) - added by ylchapuy 13 years ago.
based on sage 4.1.1
trac-5098-pollard_review.patch (5.7 KB) - added by mraum 13 years ago.

Download all attachments as: .zip

Change History (41)

comment:1 Changed 13 years ago by cremona

Do you know what makes this slower?

I'm not so familiar with this version of Pollard, but this looks well-written. I'll test it once my build of 3.3.alpha2 has finished.

comment:2 follow-up: Changed 13 years ago by ylchapuy

I don't really know what makes it slow. I suspect the use of "hash" is costly but I see no other way to partition generically a group in approximately equal size chunks.

comment:3 in reply to: ↑ 2 ; follow-up: Changed 13 years ago by cremona

Replying to ylchapuy:

I don't really know what makes it slow. I suspect the use of "hash" is costly but I see no other way to partition generically a group in approximately equal size chunks.

That is probably right. It might be worth putting in a comment (or a docstring sentence aimed at future developers) saying that the efficiency of the functions depends on having a fast hash.

comment:4 in reply to: ↑ 3 Changed 13 years ago by ylchapuy

Replying to cremona:

Replying to ylchapuy:

I don't really know what makes it slow. I suspect the use of "hash" is costly but I see no other way to partition generically a group in approximately equal size chunks.

That is probably right. It might be worth putting in a comment (or a docstring sentence aimed at future developers) saying that the efficiency of the functions depends on having a fast hash.

patch modified to adress the above comment. The hash function is now a parameter of the pollard_rho function to give future developers an easy way to customize it.

comment:5 Changed 13 years ago by cremona

  • Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm

I just wrote a review for this but trac refused to save it since another comment has been made in between. How can it do this!!!! Luckily I was able to use the "back" button to revover what I had written.

Review:

  • Patch applied fine to 3.3.alpha2. Doctests in groups/generic pass.
  • In the docstring for pollard_rho() I would move the explanation for the partition_size parameter to a NOTE later, not in the description of the paramater; and that description should say that the default is 16.
  • There should be doctests in discrete_log() showing the use of the algorithm parameter.

I also think that it would be helpful to add to this ticket (and perhaps also in doctests) examples showing how the new algorithm is sometimes preferable!

None of these is at all serious, and think this additoin is basically very good.

comment:6 Changed 13 years ago by ylchapuy

  • Summary changed from [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm

a patch is on the way. I also removed to debug printings I forgot and added my authorship.

Concerning comparison, the benefit of Pollard rho algorithm comes only with quite big examples and I don't think it's a good idea for doctests. Here's one example.

----------------------------------------------------------------------
| Sage Version 3.2.3, Release Date: 2009-01-05                       |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
Loading Sage library. Current Mercurial branch is: yann
sage: get_memory_usage()
113.9765625
sage: F.<a>=GF(2^118)
sage: g=F.multiplicative_generator()
sage: u=g^15726718062461913153073925660344578
sage: time discrete_log(u,g,algorithm="rho")
CPU times: user 56.06 s, sys: 0.20 s, total: 56.26 s
Wall time: 56.39 s
15726718062461913153073925660344578
sage: get_memory_usage()
116.03515625
sage: time discrete_log(u,g,algorithm="bsgs")
CPU times: user 46.74 s, sys: 0.54 s, total: 47.29 s
Wall time: 47.47 s
15726718062461913153073925660344578
sage: get_memory_usage()
392.89453125

actually, Pollard rho is not so slow :)

comment:7 Changed 13 years ago by ylchapuy

argh, sorry, trac-5098-review.2.patch are the same trac-5098-review.patch, only one of them needs to be applied

comment:8 Changed 13 years ago by ylchapuy

if someone reviews this patch, he might have a look at #5112 implementing Pollard lambda algorithm

comment:9 Changed 13 years ago by cremona

  • Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with review, a little more work] Pollard rho algorithm for generic discrete logarithm

I am having trouble applying the new patch. I have a 3.3.alpha2 build, so #5088 is already merged. I apply trac-5098.patch ok. Then I apply trac-5098-review.patch but get several hunks failing.

This might just be me being stupid, so it would be good if someone else tried.

While I am here though, you did not quite understand my third point in the original review: I did not mean that a new doctest should illustrate the greater efficiency (or not) of the new algorithm, but it should at least illustrate how to use the new parameter. e.g. take one of the existing examples and compute it bothe ways and check that they are equal:

        sage: F.<a>=GF(2^63)
        sage: g=F.gen()
        sage: u=g**123456789
        sage: discrete_log(u,g,algorithm='bsgs') == discrete_log(u,g,algorithm='rho')
        True

comment:10 Changed 13 years ago by ylchapuy

  • Summary changed from [with patch, with review, a little more work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm

brand new standalone patch, based on a fresh alpha2 build. I hope everything is ok this time :)

Changed 13 years ago by ylchapuy

comment:11 Changed 13 years ago by cremona

  • Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm

Thanks -- I am quite happy now, and give this a positive review! Now I will go and look at the lambda patch too.

comment:12 Changed 13 years ago by mabshoff

  • Summary changed from [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review, needs doctest fix] Pollard rho algorithm for generic discrete logarithm

The latest patch causes the following doctest failure: From

File "/scratch/mabshoff/sage-3.3.alpha4/devel/sage/sage/rings/integer_mod.pyx", line 444:
   sage: sage.groups.generic.discrete_log(a,b)
Expected:
   Traceback (most recent call last):
   ...
   ZeroDivisionError: Inverse does not exist.

we get to

ArithmeticError: multiplicative order of 4 not defined since it is not a unit modulo 100

William doesn't like either one, but he is about to comment himself.

Cheers,

Michael

comment:13 Changed 13 years ago by mabshoff

 William:  Hmm.
Better would be a statement that the discrete log doesn't exist.
Or that it might but the algorithm implemented can't tell.
I don't like either error.
 me:  Well, I will comment on the ticket and then you can comment there on the record.
 William:  The log does exist.
so it would be much better to get a NotImplementedError when things go wrong.

Cheers,

Michael

comment:14 Changed 13 years ago by mabshoff

Hi guys,

this ticket and #5112 is being held up by the above doctesting issues. The patches should go into 3.3 and rc0 will drop in roughly 48 hours, so there is time to sort this out. We could do two things:

  • change the doctest to "fix" the problem for now and open a followup ticket to deal with the problem William pointed out. Since the situation doesn't seem to get worst that seems acceptable
  • fix the problem William pointed out :)

Thoughts?

Cheers,

Michael

comment:15 Changed 13 years ago by cremona

I don't have time to look into this and hope that ylchapuy will. I don't actually think it is the Pollard rho code at fault since by default generic discrete log still uses bsgs, so it is more likely to be the previous (and already merged) improvement to the discrete log which ylchapuy added a week or two ago.

If I finish all the "real" work I must do before tomorrow I'll come back to this but that is not very likely...

comment:16 Changed 13 years ago by ylchapuy

I would suggest to fix the doctest for the moment, and open a ticket for an enhancement. For the moment, we only allow discrete logarithms computations in groups with known order; I think it could be possible quite easily to make pollard rho algorithm to work for unknown order too but it needs some work. I definitely have no time today, and no clean install of recent sage to patch the doctest myself ontime...

Changed 13 years ago by ylchapuy

comment:17 Changed 13 years ago by ylchapuy

added a small patch to fix doctest, and return a NotImplementedError? explaining the problem which is that we can't get the order of the base (in the example it's not a unit).

comment:18 Changed 13 years ago by ylchapuy

  • Summary changed from [with patch, with positive review, needs doctest fix] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm

comment:19 Changed 13 years ago by ylchapuy

  • Milestone changed from sage-3.4.1 to sage-3.3
  • Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm

the last patch only change the behaviour in case the order of the base can't be computed. If all the patches apply cleanly I think it doesn't need another review (correct me if I'm wrong).

comment:20 Changed 13 years ago by ylchapuy

  • Summary changed from [with patch, with positive review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review, doctest fix needs review] Pollard rho algorithm for generic discrete logarithm

comment:21 Changed 13 years ago by cremona

I was intending to review this (again), but there are a lot of patches and I do not know which ones to apply. All?

Regarding the last doctest issue, the function being tested states in its docstring that the arguments should be units, so surely it would be more sensible to test that and return an error if not, rather than go ahead and call any of the generic discrete log functions (which are placed in sage/GROUPS/generic.py for a reason: they require a group and were written that way, so inverses would exist). Just because 4 happens to be a power of 2 mod 100 does not mean that I should be forced to apologise for not returning an answer in that case. I know of no reason anyone might ever have for implementing discrete logs for non-units in a non-integral domain.

comment:22 follow-up: Changed 13 years ago by ylchapuy

First: The patches to apply are the last two only; is there a way to remove the other ones?

Secondly: I do agree, discrete logarithm is intended only for groups. The changes in my last patch might still be useful. I test if we can compute the order of the base, because the algorithms we use for now need them. It's probably not a good idea though to treat in the same way a non unit, and an element of a group where order is not implemented...

What should I do to make those patches the right way ?

comment:23 in reply to: ↑ 22 Changed 13 years ago by mabshoff

Replying to ylchapuy:

First: The patches to apply are the last two only; is there a way to remove the other ones?

Right now only trac admins can delete patches. I will delete all but the two last patches shortly.

Some times it is interesting for credit and historic purposes to keep old patches around, but in this case it should be no problem to delete them.

Cheers,

Michael

comment:24 Changed 13 years ago by was

  • Summary changed from [with patch, with positive review, doctest fix needs review] Pollard rho algorithm for generic discrete logarithm to [with patch, with positive review, doctest fix needs work] Pollard rho algorithm for generic discrete logarithm

I have some issues with trac-5098-doctest.patch:

  1. There should be a doctest that illustrates every single exception that can be raised. For example this patch adds
    raise NotImplementedError, "The current implementation of Pollard Rho algorithm needs to have (a multiple of) the order of the base" 
    

on line 543, but there is no doctest that illustrates this.

  1. Line 758, The %s"%base business is bad.
    raise NotImplementedError, "we currently need to know the order of the base %s"%(base) }}}
    I used to write a lot of exceptions like this, since for Magma we did that.  But in Magma, they aren't exceptions that can be caught -- any single exception stops the program.  In Sage, exceptions can be caught and occur as part of the normal flow of execution of a program.  Once in 2006 there was a simple calculation David Harvey was doing, and he was very confused because it was taking several minutes to do something trivial.  It turned out this was because in the course of the computation an exception was raised and caught (e.g., because of some coercion model stuff), and the exception was written like the one before -- it was a string that printed the element involved in the coercion, and the str(...) conversion of the element took *minutes*.   That was what dominated the calculation.  As a result I realized that I had completely messed up in using that pattern for implementing exceptions in Sage, and had to go through the whole codebase and rewrite all such exceptions. 
    
    Just to hammer this home, consider 
    {{{
    sage: a = Mod(16, 100); b = Mod(-4,10^10^7)
    sage: try: w = sage.groups.generic.discrete_log(a,b)
          except: pass
    [hang forever!]
    }}}
    and if it ever did come back, imagine what that exception would look like!
    
    

comment:25 Changed 13 years ago by was

Here's some irc chatter about the "%s"%base exception business. There are perhaps thousands (?) of instances of this in Sage now.

23:46 < wstein> They can be "silent killers" :-)
23:46 < mabs> wstein: can an opinion on the broken combinat pickles?
23:47 < mabs> can -> got
23:48 < wstein> wstein@sage:/space/wstein/sage-3.3.rc0$ ./sage -grep "raise" "%s" |wc -l
23:48 < wstein> 6750
23:48 < mabs> *OUCH*
23:48 < wstein> It's not always bad.
23:48 < wstein> wstein@sage:/space/wstein/sage-3.3.rc0$ ./sage -grep "raise" "%s" |wc -l
23:48 < wstein> Oops.
23:49 < wstein> sage: search_src("raise","%s") # also shows them.
23:49 < wstein> It's only the ones that have elements that could easily be large in the exception that
23:49 < wstein> are a danger.
23:49 < wstein> But a quick glance suggests there are probably thousands of these still.
23:50 < mabs> Yes, I thought the comment was good since I never knew about that problem.
23:50 < wstein> Especially in the combinat and coding theory code...
23:50 < mabs> :)
23:50 < wstein> One can just do search_src("raise","%s")
23:50 < wstein> look to see which exceptions are funny, and quickly craft examples where
23:50 < wstein> things seem to hang for stupid reasons instead of raising an exception.
23:51 < wstein> Where hang = create a base-10 string rep for something in memory, then try to display a bazillion characters to screen.

comment:26 Changed 13 years ago by was

A comment by cwitty about my suggestion that every exception should be doctested:

00:07 < cwitty> wstein: re your comment on #5098: I've added exceptions to Sage that I have no idea how 
                to doctest.
00:07 < wstein> cwitty -- for that patch I think they would all be easy.
00:07 < wstein> I had that in mind.
00:07 < cwitty> I sort of suspect that hitting them may require working in a number field of degree 
                several billion, but I'm not positive.
00:07 < wstein> cwitty -- at least you can give a reason for not doctesting the ones you don't test.
00:08 < wstein> I don't mean for that to be some absolute rule.
00:08 < wstein> Just a general guideline.
00:08 < wstein> But you have a good point.
00:08 < wstein> I'll paste this to the ticket.
00:09 < wstein> The last thing I wrote in sage was something involving gap.py and the "This should 
                never happen" exception.
00:09 < wstein> That can't be doctested :-)
00:09 < wstein> So you're right.

comment:27 Changed 13 years ago by mabshoff

  • Milestone changed from sage-3.3 to sage-3.4.1

I am cleaning up the 3.3 milestone. If a patch with positive review is put up to this ticket on time it might make it into 3.3.

Cheers,

Michael

comment:28 Changed 13 years ago by mabshoff

  • Summary changed from [with patch, with positive review, doctest fix needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs work] Pollard rho algorithm for generic discrete logarithm

Any movement on this?

Cheers,

Michael

comment:29 Changed 13 years ago by ylchapuy

  • Cc cremona added

Let's try another way. The last patch I uploaded is to be applied instead of the previous ones. It only provide a Pollard rho algorithm for discrete log, without plugin it in the existing discrete log. We can address the other problems about non unit elements for example in another ticket.

comment:30 Changed 13 years ago by cremona

  • Summary changed from [with patch, needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm

This looks ok to me. It applies fine to 4.1.alpha3 and tests pass. It might be better to call the function discrete_log_rho instead of pollard_rho_log as that would make it easier to find via tab completion on discrete_*.

There should also be examples (doctests) to illustrate the use of the parameters partition_size, memory_size, hash_function, check_prime. For example, one where using a non-default value gives better (i.e. faster) results. If it is not practical to put those into doctests (for example, if the improvement is only noticeable with very large examples), tehre should still be small examples to illustrate these parameters, with a comment about them being too small to notice, just to show how they are used.

And the technical parameters partition_size, memory_size should be listed in the INPUT block rather than in a separate note.

I have left this as "needs work" but those things are very small and a positive review is not far away!

comment:31 Changed 13 years ago by ylchapuy

  • Description modified (diff)
  • Summary changed from [with patch, with review, needs work] Pollard rho algorithm for generic discrete logarithm to [with patch, needs review] Pollard rho algorithm for generic discrete logarithm

comment:32 Changed 13 years ago by ylchapuy

I renamed the function to "discrete_log_rho", removed the unnecessary parameters (partition_size and memory_size don't change that much the behavior), added an example of user set hash function, and forgot to add an exemple for check_prime... I will fix this in 2 minutes!

Changed 13 years ago by ylchapuy

based on sage 4.1.1

comment:33 Changed 13 years ago by ylchapuy

I finally choose to remove the parameter check_prime because the complexity of Pollard Rho algorithm is a lot bigger than primality testing.

comment:34 Changed 13 years ago by AlexGhitza

  • Summary changed from [with patch, needs review] Pollard rho algorithm for generic discrete logarithm to Pollard rho algorithm for generic discrete logarithm

comment:35 Changed 13 years ago by mraum

  • Authors set to Yann Laigle-Chapuy
  • Report Upstream set to N/A
  • Reviewers set to Martin Raum
  • Status changed from needs_review to positive_review

This gets a positive review from me with the small changes in the review patch. All tests pass. For transparency I just added one tiny comment on the fall back and completed the type of a raised error.

Changed 13 years ago by mraum

comment:36 Changed 13 years ago by mhansen

  • Merged in set to sage-4.3.rc1
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:37 Changed 13 years ago by mhansen

  • Milestone changed from sage-4.3.1 to sage-4.3
Note: See TracTickets for help on using tickets.