The Graph Of Ideas 2.0

20 Jul

First of all, thanks very much for the feedback you all gave me on my graph of ideas. I wasn’t quite aware of how many people are interested in this sort of stuff. I now have lots of great ideas for new projects which will keep me busy for a long while. I must say making the graphs is the easy part – it is obtaining the data which takes time. I’ve made a note of all of your suggestions and will try to create something out of them soon. If you haven’t already, you can submit an idea here. I read them all.

Housekeeping

There were a great number of comments about my last graph and so I’ll try to answer the main questions here. I think many of them were from people who hadn’t actually read the post at all but went straight to the graphic with whatever catch line someone shared along with it. It was reposted at Gizmodo, Spiegel.de, Business Insider and FlowingData.com, all of whom omitted the very important caveats I listed in the post. Please read the original post for the full discussion. These were a few of the common themes to the criticisms I saw floating around the interwebs:

“It is way too biased towards Western ideas.”
– Yes, see point one of the original blog post. I simply plotted what Wikipedia (dbpedia) gave me – of course it is biased, like any dataset, and this was stated.

“Where are all of the musicians and artists?”
– See original blog post. Artists don’t have the available information in the Wiki info-boxes (well except some, see bottom left, green part). I hope to make a musician/artist graph soon!

“The title is very misleading.”
– The original post had an asterisk on the word ‘every’ which was meant to highlight the fact the graph had caveats. I didn’t anticipate people leaving this out when they shared it with their friends. I changed it to be simply ‘The Graph Of Ideas’. I’ll be more careful in future.

Now that that is out of the way, I’d like to present my latest work. I’ve broken this post down into two sections: the network and the method. In order to fully understand the network, I suggest you also read the method. If you have better things to be doing with your life, quickly check out the plot below, glance at the caveats and then move on – I’ll catch up with you soon enough.

Network: Graph Of Ideas vs. Graph Of Ideas 2.0

The first graph connected people via a single connection. That is to say, if Socrates influenced Plato and Plato influenced Aristotle then the following connections were made:

Socrates –> Plato –> Aristotle

Easy, right?

However, as I briefly mentioned in the previous post, each individual in time represents the sum of their ancestors. This means that Socrates should technically be linked to Aristotle too! Whether we like to think about it or not, Socrates’ contribution to our body of understanding of the world is embodied in the way we speak and interact on a daily basis. Sure, there is some dilution, but Socrates’ philosophies are, for better or for worse, buried deep within you somewhere. This isn’t just true for Socrates either – it is true for everyone who has ever existed.

On September 5th, 1948, Jiddu Krishnamurti gave a public talk in Poona, India. In it he stated:

“You and I are not isolated; we are the result of the total process, the outcome of the whole human struggle, whether we live in India, Japan, or America. The sum total of humanity is you and me. Either we are conscious of that, or we are unconscious of it.”

Now click here.

Welcome back… so with all of this in mind I went ahead and made a little program which calculates these upstream connections which were missing in the first graph: hence the ‘2.0’.

Here is the resulting graph (~20% of the total ~4,200 nodes available):

The Graph Of Ideas 2.0 (connections not visible). Click here for dynamic zoom.

The most connected names cluster together.

Traditionally less prominent historical figures become larger.

The biggest/smallest names of the first Graph Of Ideas are now smaller/larger.

If you would like to see the full graph with the underlying connections click here. It is ~50MB so put the kettle on. BUT, if you would like to see it with an easy-to-use zoom (scroll) function click here (recommended). It is naturally quite messy and difficult to read, and I wanted it to be this way: it shows a more honest picture of how people are connected through history.

I must apologise for the overlapping names in some places. Due to the number of background nodes, label adjust (used to make non-overlapping names) seemed to crash every time I tried to run it. You can still make out almost all of the names, however. I’ll try to upload a better version soon.

Your first immediate reaction might be “riiiight”. Then after a careful examination you might see how dominated the graph is by people who died a long, long time ago. This is because the graph is biased toward the oldest generation of thinkers. Take for instance the biggest group – the Greeks. They influenced a great many people who in turn influenced a whole bunch more. Unlike the previous graph, these 3rd generation thinkers are now connected to the 1st generation thinkers thus amplifying their overall size and connectedness within the network. For example Socrates, Plato and Aristotle are now connected to every person Nietzsche was connected to in the previous graph.

I also suspect there are a great number of people you might not have heard of who have large nodes. For me, this is a quick way to find interesting people to read about on Wikipedia. Lastly, many people who were tiny in the previous graph are now quite large e.g. Confucius, Socrates etc.

I compiled a list of the most connected people.

Name               Connections  Nationality  Born (B.C.)
Thales             3390         Greek        624
Pythagoras         3386         Greek        570
Zeno of Elea       3378         Greek        490
Socrates           3376         Greek        469
Parmenides         3368         Greek        5th cent.
Protagoras         3352         Greek        490
Plato              3351         Greek        423
Melissus of Samos  3332         Greek        5th cent.
Leucippus          3329         Greek        5th cent.
Zeno of Citium     3306         Greek        334
Pyrrho             3306         Greek        360
Stilpo             3300         Greek        360
Posidonius         3288         Greek        135
Panaetius          3286         Greek        185
Lucretius          3275         Roman        99

Bertrand Russell in his History of Western Philosophy (1945) wrote “Western Philosophy begins with Thales”. As we can see, his claim is backed up by this graph. Thales is the most connected individual with 3390 connections. This doesn’t mean he is the most influential or humanity’s biggest asset – it just means that if the data were complete (which it isn’t), then his ideas would have influenced (in whatever arbitrary way you define it) the greatest number of people.

The margin separating the top 5 is also quite small. This is presumably because there are only one or two degrees of separation connecting them all. Interestingly there is only one person not of Greek origin in the entire list above (Lucretius). Again, this is largely due to the incompleteness of the dataset – these gentlemen also have antecedents who either a) have not been entered into Wikipedia or b) have been lost in history. As one Redditor wrote: “the group that came up with fire should be in the middle and bigger than everything combined”.

This type of graph just shows how much our perceptions of ideas can change depending on how we present information. This is my take home message for today.

Those in a rush — scroll to the caveats!

The Method:

You might at first think this is quite a trivial problem to solve but it did require some careful thought and even more careful programming. I mean in plain English, you just want to connect person A to whoever they have a forward connection with elsewhere on the network – how hard can it be?! Let me explain. Put your scuba suit on… now.

Here we have a basic (less pretty) graph of influences between a few people:

A basic influence network.

For example, here we have A influencing B and B influencing E. The crucial point is that A is not recorded as influencing E because there is no direct connection between the two (this was the case in the previous graph). The problem in trying to reconnect A with E is that you have to find a way to traverse the connections in the correct direction and ensure you don’t end up in an infinite loop (A influences B, B influences E and E influences A!). Basically I want to convert a graph like this:

A,B
A,C
B,E
B,D
C,F
C,G
D,B
E,A
F,J
H,E
J,I

To a graph like this:

A,B
A,C
A,E
A,D
A,F
A,G
etc.
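For anyone who wants to try the conversion on the small example above, here is a minimal Python sketch. It is not the Matlab script used for the graph: it swaps in a plain breadth-first search over the edge list, and the names EDGES and downstream are made up for illustration. The "seen" set is what stops the A, B, E, A loop from running forever.

```python
from collections import defaultdict, deque

# Direct influence pairs from the example above: (influencer, influenced).
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "E"), ("B", "D"), ("C", "F"), ("C", "G"),
    ("D", "B"), ("E", "A"), ("F", "J"), ("H", "E"), ("J", "I"),
]


def downstream(edges):
    """Map each influencer to everyone reachable by following influence forwards."""
    direct = defaultdict(set)
    for source, target in edges:
        direct[source].add(target)

    reach = {}
    for person in sorted(direct):
        seen = set()
        queue = deque(direct[person])
        while queue:
            current = queue.popleft()
            if current in seen:
                continue  # already visited, so cycles terminate
            seen.add(current)
            queue.extend(direct.get(current, ()))
        reach[person] = seen
    return reach


reach = downstream(EDGES)
for person, others in reach.items():
    for other in sorted(others):
        print(f"{person},{other}")
```

Because E influences A in the example, A ends up in its own list, which is the same kind of self-reference discussed further down.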

If you have the time, spend a few minutes trying to think of a way to make a list of unique names and every person they are connected to through someone else. Over my lunch break today, a friend of mine and I came up with a way to do this quite quickly. Here is the page we scratched on during lunch:

Planning. Chickens were here.

As you can see, we tossed around nested do and while loops but they were just too complicated; there are a whole host of ugly problems you encounter if you try to solve this that way. The solution: matrices. All this means is that my connection map will look something like this:

#,A,B,C
A,0,1,0
B,1,0,1
C,1,0,1

The number 1 represents a connection between the row and the corresponding column, e.g. A is connected to B, B is connected to A and C, and C is connected to A (the 1 in C’s own column is a self-reference). I wrote a script in Matlab which calculates just this and generates a list of new connections. Essentially this works by looping through the rows and checking if there is a connection (=1). If there is a connection, it then finds the row belonging to the person they are connected to. Once that row is found, it is added in its entirety to the original person’s row. If you’re a bit of a coding oracle please let me know if there are faster ways of achieving the same result. So for our matrix above, C influenced A but A influenced B so C should also influence B, right? This algorithm turns the above matrix into this (just for the looping component on the 3rd row):

#,A,B,C
A,0,1,0
B,1,0,1
C,1,1,1

Specifically, rows C and A are added together and the element-wise product of the two is subtracted, with the result stored back in row C. This ensures there is always either a ‘1’ or a ‘0’ in all cells. The algorithm loops over every row in the matrix and carries out this procedure.
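The row-merging idea can be sketched in a few lines of plain Python. This is my reading of the procedure described above rather than a port of the Matlab script: the function name close_rows is hypothetical, and I have assumed the loop is repeated until no row changes, so that connections several steps away are also picked up.

```python
def close_rows(matrix):
    """Fold each connected person's row into the current row until nothing changes."""
    m = [list(row) for row in matrix]
    changed = True
    while changed:
        changed = False
        for i in range(len(m)):
            # Columns holding a 1 are the people that person i is connected to.
            connected = [k for k, cell in enumerate(m[i]) if cell == 1]
            for j in connected:
                # Add the two rows, then subtract their element-wise product,
                # so every cell stays 0 or 1.
                merged = [a + b - a * b for a, b in zip(m[i], m[j])]
                if merged != m[i]:
                    m[i] = merged
                    changed = True
    return m


example = [
    [0, 1, 0],  # A
    [1, 0, 1],  # B
    [1, 0, 1],  # C
]
closed = close_rows(example)
for name, row in zip("ABC", closed):
    print(name, row)
```

Run to completion on the three-person example, every cell ends up as 1, because A, B and C all sit on one loop; the matrix shown above is only the intermediate step for the third row.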

The original list had 14,560 connections so it took a reasonable while to do all of the permutations on my laptop (10 minutes). This new list has 4,239 nodes with ~830,000 connections. Last time I checked, 830,000 > 14,560 so there is a lot more connectivity going on in this graph than the previous one. My code can also contain self-references. This is because two people may be contemporaries of one another and influence one another. If I influence my brother and he influences me, do I not have a slight influence on myself through the actions of my brother? I thought I would leave these in just to see where they would turn up. No harm done here.

Once you make the matrix there are a whole heap of interesting things you can do. For example, who has the most connections in the network? Well, you just sum each row of the matrix and sort the totals in descending order (shown at the start). The last part of my script does this for you. I understand many of you won’t have Matlab and so I apologise in advance for this. I might try to do it in Python next time. In the meantime, you could try a free trial version or use Octave, which is a free alternative to Matlab.
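For those without Matlab, the row-sum ranking is easy to sketch in Python. The function name rank_by_connections is made up for illustration, and the matrix is just the three-person example from earlier, not the Wikipedia data.

```python
def rank_by_connections(names, matrix):
    """Return (name, connection count) pairs, most connected first."""
    totals = [(name, sum(row)) for name, row in zip(names, matrix)]
    # sorted() is stable, so people with equal totals keep their original order.
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


names = ["A", "B", "C"]
matrix = [
    [0, 1, 0],
    [1, 0, 1],
    [1, 0, 1],
]
ranking = rank_by_connections(names, matrix)
for name, total in ranking:
    print(name, total)
```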

For a small three-person network like our example above, once you have the matrix, all you need to do is create a .dl file containing something like the following:

dl n=3
format = fullmatrix
labels:
Person A, Person B, Person C
data:
0 1 0
0 0 1
1 0 1

This is the information which helped me. Once I obtained this matrix for the Wikipedia network, Gephi was able to import it. Thank you to whoever made this extension – it is genius.
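If you would rather not type the file out by hand, here is a small Python sketch that builds the same text. It only mirrors the layout of the example in this post; the function name to_dl is hypothetical, and I have not checked what else the Gephi importer accepts or requires.

```python
def to_dl(labels, matrix):
    """Build the text of a fullmatrix .dl file laid out like the example in this post."""
    lines = [
        f"dl n={len(labels)}",
        "format = fullmatrix",
        "labels:",
        ", ".join(labels),
        "data:",
    ]
    # One line per person, cells separated by single spaces.
    lines += [" ".join(str(cell) for cell in row) for row in matrix]
    return "\n".join(lines) + "\n"


labels = ["Person A", "Person B", "Person C"]
matrix = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
dl_text = to_dl(labels, matrix)
print(dl_text, end="")
```

Saving dl_text to a file with a .dl extension gives you something in the same shape as the file above.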

Sorry to waffle on a bit but last time I had a number of people requesting more detail on how I go about making the graphs. Finally, to save you going back through the text, here are links to the data I used to create my map. I’ve compressed some of them but at most, they will expand to about 100MB.

All the data:

  1. You’ll need Gephi (free) and Matlab (or Octave).
  2. Original list of people and their influences from dbpedia.
  3. The Matlab script which generated the linking matrix.
  4. The list of names used in the network.
  5. The csv matrix of 1s and 0s only.
  6. The final .dl file required to import into Gephi.

Caveats

  1. Many important people have been left out of the network. I am limited by the information provided by dbpedia. I mean I had to cut 80% of the network I had available just to make the plot I showed here!
  2. The communities are coloured by the Modularity module in Gephi – I do not personally colour anything.
  3. The graph is biased towards Western ideologies. The graph is biased towards Western ideologies. Yes 2x.

That’s all for now. Let me know in the comments section if you have any questions.

20 Responses to “The Graph Of Ideas 2.0”

  1. Finn0123 July 20, 2012 at 7:33 am #

    If I may make a suggestion, a software alternative with little to no work on your part may be Octave (found here: http://www.gnu.org/software/octave/) which is a freeware version of MatLab.

    Naturally, as I have not tried this I don’t know if it will work, but it’s probably worth a shot to those who don’t have MatLab.

    • Griff July 20, 2012 at 12:42 pm #

      Ah yes of course – I completely forgot about Octave. Thanks for the reminder Finn. I’ve put that in as an option.

      • Pascal Wallisch July 20, 2012 at 2:46 pm #

        Have you considered graphing the neuroscience community?
        http://neurotree.org/neurotree/

      • Griff July 20, 2012 at 2:49 pm #

        Ah looks great but family trees aren’t so good for this sort of thing. More inter-related networks are much more insightful I think. I’ll have a look around though! Thanks.

  2. Pascal Wallisch July 20, 2012 at 2:54 pm #

    The neuro community is highly inbred, so it might work. Also, most people like collaborators, so there are horizontal connections galore.

    • Griff July 20, 2012 at 2:56 pm #

      Ha! I’ll take a look. More interesting are citation networks which reveal some pretty cool stuff – the hardest part is obtaining the data though.

      • slowXtal August 2, 2012 at 9:24 pm #

        useful (too small maybe ?) sources for citation datasets (and some other) :

        http://arnetminer.org/citation
        http://snap.stanford.edu/data/#citnets

      • Griff August 3, 2012 at 1:51 am #

        Thanks for the suggestion! Yes, citation networks within academia are on the list of things to examine. It is just a case of getting the data in the right format!

  3. Camilo Baez August 17, 2012 at 9:12 am #

    Hi Mr. Graph Maker, I found that both graphs of ideas are a giant “almost” infinite* web of lines around names that doesn’t add much value for the viewer. What might be very valuable is to be able to look at each name separately so we can clearly appreciate which authors/thinkers influenced that particular author/thinker and, conversely, which authors/thinkers he has influenced. Though I think for this purpose the version 1.0 is better suited.
    Best of lucks!
    Camilo

    • Griff August 17, 2012 at 8:02 pm #

      Hi, thanks for your comment. Yes, I agree being able to integrate the maps with Gephi online would be ideal. You could then traverse the graph yourself and look at people of personal interest. In future perhaps. I’ll see what I can do.

      • Visar March 5, 2013 at 11:54 am #

        Griff, Do you think you can help me with a complex network I am trying to bring together. I will even pay you. please contact me at visarsenal@gmail.com

  4. Max Levental July 7, 2013 at 5:25 am #

    Hey, your download links are dead? Is there any way you could repost the Gephi file? I’d like to play with it myself. Also the Gephi file for version 1.0, where people are just syllogistically connected, instead of directly. Thanks

    • Max Levental July 7, 2013 at 5:28 am #

      Scratch that I found the files in the first post.

Trackbacks/Pingbacks

  1. The Graph Of Ideas « Griff's Graphs - July 20, 2012

    [...] I have also created a graph which includes upstream connections between thinkers. Check it out here. [...]

  2. The Web of Knowledge - July 26, 2012

    [...] Also, check out Griff’s work showing the upstream connections. [...]

  3. The Graphs Of Football, Basketball, Ice Hockey, Baseball and Soccer « Griff's Graphs - July 26, 2012

    [...] I decided to approach this one using matrices. Please see my post on Wikipedia personalities to see how the datasets are preprocessed. I did however have to design a [...]

  4. “Originality is nothing but judicious plagiarism”*… « (Roughly) Daily - August 4, 2012

    [...] Readers seemed to enjoy Simon Raper’s diagrammatic history of philosophy (see “Who’s Hume“), so may also appreciate Brendan Griffen‘s even more ambitious visual essay– a depiction of the connections between every important thinker, ever: “The Graph of Ideas.” [...]

  5. The influence of ideas | 1000heads: The Word of Mouth People - August 6, 2012

    [...] western culture and the fuzziness of the very concept of influence – and has even created a second version which attempts to trace the ‘upstream’ journey of influence to an even deeper degree. [...]

  6. Wikimedia Research Newsletter, August 2012 — Wikimedia blog - August 30, 2012

    [...] Brendan Griffen: The Graph Of Ideas 2.0. Griff’s Graphs, July 20, [...]

  7. Visualização sobre o universo das ideias | Daniela Kutschat - September 1, 2012

    [...] is dedicated to refining the work, and has already published a new version, The Graph Of Ideas 2.0. The designer opened a space for discussion on his blog, in which he acknowledges a certain level of [...]
