Opened 6 years ago
Closed 6 years ago
#17385 closed enhancement (fixed)
Cleanup Graph.__init__ and DiGraph.__init__
Reported by: | vdelecroix | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-6.5 |
Component: | graph theory | Keywords: | |
Cc: | ncohen | Merged in: | |
Authors: | Vincent Delecroix | Reviewers: | David Coudert, Travis Scrimshaw, Nathann Cohen |
Report Upstream: | N/A | Work issues: | |
Branch: | 0e8d1c4 (Commits) | Commit: | 0e8d1c42c6fc12f59a25359059f1a5025ece20c3 |
Dependencies: | #17384 | Stopgaps: |
Description
While reading #17384, I thought that a global cleanup of Graph.__init__
and DiGraph.__init__
would be good. We implement:
1)
- data = dict([u, list(data[u])] for u in data) + data = {u: list(v) for u,v in data.iteritems()}
2) call to uniq
is bad since it calls a useless sorted
: each appearance of {{{len(uniq(X))}}}
is replaced with {{{len(set(X))}}}
.
3) dict.iteritems
is more readable and faster
sage: d = {randint(0,1000): [randint(0,1000) for _ in range(10)] for _ in range(100)} sage: timeit("for i in d: j = d[i]", number=2000) 2000 loops, best of 3: 9.75 µs per loop sage: timeit("for i,j in d.iteritems(): pass", number=2000) 2000 loops, best of 3: 7.3 µs per loop
4) avoid iterating through the elements of the adjacency matrix when possible
and more...
Change History (36)
comment:1 Changed 6 years ago by
- Branch set to u/vdelecroix/17385
- Status changed from new to needs_review
comment:2 Changed 6 years ago by
- Commit set to 3f8a98bc9caa9f64504cf2d85ce4ed39ea7ce12c
comment:3 Changed 6 years ago by
- Reviewers set to David Coudert
- Status changed from needs_review to positive_review
I was not expecting bool is False
to be faster than bool === False
. Amazing.
sage: bool = True sage: %timeit bool is False 10000000 loops, best of 3: 70.3 ns per loop sage: %timeit bool == False 10000000 loops, best of 3: 88.7 ns per loop sage: sage: bool = False sage: %timeit bool == False 10000000 loops, best of 3: 88.9 ns per loop sage: %timeit bool is False 10000000 loops, best of 3: 70.2 ns per loop
For me the patch is OK (installation, test, doc), so I give positive review.
comment:4 Changed 6 years ago by
Thanks for the review.
To answer your interrogation is
and ==
are very different, and is
will always be faster.
The command bool is False
is treated without asking anything to the object bool
. It is just the Python kernel that checks whether the memory addresses match or not. For ==
a method of the object bool
is called. Even if the comparison is implemented with something like return self is other
you do not avoid that function call.
But I am not sure that those 18 non seconds are of critical importance in the creation of a graph... it was more a matter of uniformization ;-)
Vincent
comment:5 follow-up: ↓ 6 Changed 6 years ago by
While that's an improvement, the fastest way is to do if x:
or if not x:
.
sage: x = True sage: %timeit if x is True: pass 10000000 loops, best of 3: 165 ns per loop sage: %timeit if x is False: pass 10000000 loops, best of 3: 148 ns per loop sage: %timeit if x: pass 10000000 loops, best of 3: 91.9 ns per loop sage: %timeit if not x: pass 10000000 loops, best of 3: 80.1 ns per loop sage: x = False sage: %timeit if x is True: pass 10000000 loops, best of 3: 148 ns per loop sage: %timeit if x is False: pass 10000000 loops, best of 3: 165 ns per loop sage: %timeit if x: pass 10000000 loops, best of 3: 79.1 ns per loop sage: %timeit if not x: pass 10000000 loops, best of 3: 91.9 ns per loop
So I'd change things to use that. Plus you can simplify things like:
+ if ((loops is None or loops is False) and - if (not loops and
(and if statements like that are more common).
comment:6 in reply to: ↑ 5 Changed 6 years ago by
- Status changed from positive_review to needs_work
Replying to tscrim:
While that's an improvement, the fastest way is to do
if x:
orif not x:
.
You are right... I will do the changes (but your timings are not complete since you forgot the case x = None
). There were occurrences at other places in the code. The advantage I see is that, while you read it, you know that None
is one possible choice whereas
if not x and some_condition: if x is False: raise ValueError x = True
is a bit less verbose.
Vincent
comment:7 Changed 6 years ago by
True. When I was doing my timings I wasn't thinking of the None
simplifications, but FTR:
sage: x = None sage: %timeit if x is None: pass 10000000 loops, best of 3: 117 ns per loop sage: %timeit if x is not None: pass 10000000 loops, best of 3: 102 ns per loop sage: %timeit if x: pass 10000000 loops, best of 3: 88 ns per loop sage: %timeit if not x: pass 10000000 loops, best of 3: 101 ns per loop
I would say the loss of verbosity (according to FF, I'm not making that word up like I thought) is covered by python ducktyping. Anyways, I don't really care that much, so if you don't think it should be changed, then feel free to leave it. Thanks for cleaning this up.
comment:8 Changed 6 years ago by
I agree that we should try to let the code as readable as possible, especially for non critical code.
If we want to optimize further, there is apparently a small difference between x is not None
and not x is None
;-)
sage: x = None sage: %timeit if x is not None: pass 10000000 loops, best of 3: 57.8 ns per loop sage: %timeit if not x is None: pass 10000000 loops, best of 3: 57.2 ns per loop
comment:9 Changed 6 years ago by
Hi,
Actually it was good to have another look at the code since I introduced a bug for weighted digraph.... (see the test in the last commit that will appear in a second).
@dcoudert: actually is not
is an operator that should be as fast as is
. While not
involves some function call... it is strange that you get those timings.
Vincent
comment:10 Changed 6 years ago by
- Commit changed from 3f8a98bc9caa9f64504cf2d85ce4ed39ea7ce12c to 2a5495dc5013a99a6dd5387ef4b7fd4727250e9a
Branch pushed to git repo; I updated commit sha1. New commits:
2a5495d | trac #17384: more cleanup after reviewer's comment
|
comment:11 Changed 6 years ago by
- Status changed from needs_work to needs_review
I am a bit worried about the code duplication. It was a pain to me to look systematically at both graph.py
and digraph.py
to search for potential mistakes. And most of it are very similar if not identical... if you have a magic solution for that.
comment:12 Changed 6 years ago by
- Commit changed from 2a5495dc5013a99a6dd5387ef4b7fd4727250e9a to 0fa9c20b5cf44989b40726e5c740c82a564a5263
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
0fa9c20 | trac #17385: more cleanup after reviewer's comment
|
comment:13 Changed 6 years ago by
(wrong commit message, sorry)
comment:14 follow-up: ↓ 15 Changed 6 years ago by
I don't understand why you write
if not loops and any(u in neighb for u,neighb in data.iteritems()): if loops is False:
Obviously, if the first test is True, then we have loops==False
. So you don't need the second one.
The same holds for if not weighted and any(...): if weighted is False:
and if not multiedges and any(...): if multiedges is False:
and ...
comment:15 in reply to: ↑ 14 Changed 6 years ago by
Replying to dcoudert:
I don't understand why you write
if not loops and any(u in neighb for u,neighb in data.iteritems()): if loops is False:Obviously, if the first test is True, then we have
loops==False
. So you don't need the second one.
Nope: loops
can be None
(or anything that does evaluate to False
such as 0
, []
, ()
, etc). Have a look at 6.
comment:16 Changed 6 years ago by
Yoooooooo !!
"While reading #17384, I thought that a global cleanup of Graph.init and DiGraph?.init would be good. We implement:"
Aahahaha. And do you want to know why I did not rewrite it all ? Because I thought that nobody would be willing to review the patch afterwards :-P
Thanks for your branch, I will try to review it today ;-)
Nathann
comment:17 follow-up: ↓ 18 Changed 6 years ago by
Already two reviewers on #17385... for a "nobody" it is a lot! But if you have improvements to share, please do.
Vincent
comment:18 in reply to: ↑ 17 Changed 6 years ago by
Yo !
Already two reviewers on #17385... for a "nobody" it is a lot! But if you have improvements to share, please do.
True true, I do not complain :-P
As to improvements, it was not really the goal: I would really like to make those functions cleaner by splitting them into subfunction, and perhaphs unify a bit better Graph/Digraph? is possible. Just a rewrite, but for a cleaner code.
Nathann
comment:19 Changed 6 years ago by
By the way there is also this thing that has been waiting for a year or so ... #15706.
Nathann
comment:20 Changed 6 years ago by
Hellooooooooo !
Well, I just finished reading the patch and it looks good, thanks for that ! One remark:
- Error on a loop: in the doctest you added, the error messages refers to a dictionary that the user never built. Also, "no loop but a loop" is not very informative.... What about 'The graph was built with loops=False but a loop was found in its data' or something ?
Nathann
comment:21 Changed 6 years ago by
- Status changed from needs_review to needs_info
comment:22 Changed 6 years ago by
- Commit changed from 0fa9c20b5cf44989b40726e5c740c82a564a5263 to ca661347a2dcd8c5d7e4720cc048028365d22092
Branch pushed to git repo; I updated commit sha1. New commits:
ca66134 | trac #17385: better error message
|
comment:23 Changed 6 years ago by
- Reviewers changed from David Coudert to David Coudert, Nathann Cohen
- Status changed from needs_info to needs_review
comment:24 Changed 6 years ago by
- Reviewers changed from David Coudert, Nathann Cohen to David Coudert, Travis Scrimshaw, Nathann Cohen
comment:25 follow-up: ↓ 26 Changed 6 years ago by
- Status changed from needs_review to needs_work
sage -t --long /home/ncohen/.Sage/src/doc/en/thematic_tutorials/lie/affine_finite_crystals.rst ********************************************************************** File "/home/ncohen/.Sage/src/doc/en/thematic_tutorials/lie/affine_finite_crystals.rst", line 359, in doc.en.thematic_tutorials.lie.affine_finite_crystals Failed example: G.is_isomorphic(Gdual, edge_labels = True, certify = True) Expected: (True, {[[1]]: [[-2]], [[2]]: [[-1]], [[-2]]: [[1]], [[-1]]: [[2]], []: [[0]]}) Got: (True, {[]: [[0]], [[1]]: [[-2]], [[2]]: [[-1]], [[-2]]: [[1]], [[-1]]: [[2]]})
comment:26 in reply to: ↑ 25 ; follow-up: ↓ 27 Changed 6 years ago by
Replying to ncohen:
sage -t --long /home/ncohen/.Sage/src/doc/en/thematic_tutorials/lie/affine_finite_crystals.rst ********************************************************************** File "/home/ncohen/.Sage/src/doc/en/thematic_tutorials/lie/affine_finite_crystals.rst", line 359, in doc.en.thematic_tutorials.lie.affine_finite_crystals Failed example: G.is_isomorphic(Gdual, edge_labels = True, certify = True) Expected: (True, {[[1]]: [[-2]], [[2]]: [[-1]], [[-2]]: [[1]], [[-1]]: [[2]], []: [[0]]}) Got: (True, {[]: [[0]], [[1]]: [[-2]], [[2]]: [[-1]], [[-2]]: [[1]], [[-1]]: [[2]]})
This has something to do with this ticket? Strictly speaking, those dictionary are the same... the thing is that the keys have no well defined ordering. The only reasonable thing I see is to remove this certify = True
in the test. What do you think?
Vincent
comment:27 in reply to: ↑ 26 Changed 6 years ago by
This has something to do with this ticket?
It seems. Does not happen otherwise O_o
Strictly speaking, those dictionary are the same... the thing is that the keys have no well defined ordering. The only reasonable thing I see is to remove this
certify = True
in the test. What do you think?
It is possible to test it with the_certificate == { the actual dictionary }
but it seems overkill. +1 for a certify=False
.
Nathann
comment:28 Changed 6 years ago by
- Commit changed from ca661347a2dcd8c5d7e4720cc048028365d22092 to 472f7e0148e2a1748fd79a03a03cd70946f087d4
Branch pushed to git repo; I updated commit sha1. New commits:
472f7e0 | trac #17385: fix doctest in affine finite crystals
|
comment:29 Changed 6 years ago by
- Status changed from needs_work to needs_review
comment:30 Changed 6 years ago by
- Status changed from needs_review to positive_review
Looks good, thanks !
Nathann
P.S.: You should do a "git pull" on the develop branch
comment:31 Changed 6 years ago by
- Status changed from positive_review to needs_work
Conflicts, probably #15706
comment:32 Changed 6 years ago by
- Commit changed from 472f7e0148e2a1748fd79a03a03cd70946f087d4 to a4257137090f2c8459137ba83e81cde2d099a996
Branch pushed to git repo; I updated commit sha1. New commits:
a425713 | Merge Sage-6.5.beta5
|
comment:33 Changed 6 years ago by
- Commit changed from a4257137090f2c8459137ba83e81cde2d099a996 to 0e8d1c42c6fc12f59a25359059f1a5025ece20c3
Branch pushed to git repo; I updated commit sha1. New commits:
0e8d1c4 | trac #17385: uniq -> set (remains from #15706)
|
comment:34 follow-up: ↓ 35 Changed 6 years ago by
- Status changed from needs_work to needs_review
Merged. I guess a last check-up would be fine.
Vincent
comment:35 in reply to: ↑ 34 Changed 6 years ago by
- Status changed from needs_review to positive_review
Merged. I guess a last check-up would be fine.
I ran a "make ptestlong" just because my office's computer was feeling useless. No problem, good to go ;-)
Nathann
comment:36 Changed 6 years ago by
- Branch changed from u/vdelecroix/17385 to 0e8d1c42c6fc12f59a25359059f1a5025ece20c3
- Resolution set to fixed
- Status changed from positive_review to closed
Branch pushed to git repo; I updated commit sha1. New commits:
trac #17384: Slowness when calling Graph(a_dictionary)
trac #17385: replacement uniq -> set
trac #17385: various optimization