space and games

August 17, 2006

The Summer of AI

Filed under: General — Peter de Blanc @ 4:12 am

I spent the last six weeks participating in the Summer of AI at the Singularity Institute for Artificial Intelligence, along with Nick Hay, Marcello Herreshoff, and Eliezer Yudkowsky. They’re all really nice guys, and Nick and Marcello were great roommates.

This summer left me with lots of ideas for problems to solve and things to write about (and books to buy, when I can afford it). Marcello and I designed a voting method, which I’ll write something about soon.

Michael Anissimov visited us one day, and took some pictures.

August 6, 2006

You Are Less Charismatic Than Your Friends

Filed under: Graphs — Peter de Blanc @ 10:33 pm

Note: I’m not sure if the explanations below are very clear. Feedback would be appreciated.

Let’s define one’s “charisma” as the number of friends one has. If Fred has 6 friends, then his charisma is 6; if Susan has 81 friends, then her charisma is 81.

Now if we select a person at random (let’s call ver Bijecto the Bland), and then randomly select one of vis friends (let’s call ver Functor the Magnificent), then we should expect at least as high a charisma for Functor as for Bijecto, simply as a result of the selection process.

Intuitively, one may reason as follows: Bijecto was selected at random, so we should expect ver to possess an average charisma. Someone with very many friends is more likely to be friends with any particular person, so the higher your charisma, the more likely you are to have Bijecto as a friend. Thus, if we select one of Bijecto’s friends at random (say, by looking at vis cell phone), then we are quite likely to select a charismatic person. We end up selecting Functor, and so we might guess that Functor is charismatic.

Confused? Let’s work an example.

[Graph 1: seven vertices, with edges {1,2}, {1,3}, {2,4}, {3,4}, {4,5}, {4,6}, {5,6}, {5,7}, {6,7}]

Here’s a graph containing seven vertices. Let’s think of the vertices as representing people, and the edges as representing friendships. Thus, two vertices are “friends” if there is an edge connecting them.

Now let’s consider these two processes.

  • Process 1: select a random vertex, and look at its charisma
  • Process 2: select a random vertex, then select one of its friends at random, and look at that friend’s charisma

The theory is that the expected value of process 2 will be at least as high as that of process 1. Let’s see if it works out.

Process 1:

We can use a table to figure out the expected charisma of a random vertex. The first column lists each vertex. The second column indicates the probability that this vertex will be selected. Since each vertex is equally likely to be selected, this is 1/7 in all cases. The third column lists the charisma of each vertex.

The last column, labeled “Contribution,” lists how much each vertex contributes to the expected value of the process. This value is simply the product of the two previous columns. Thus, summing the “contribution” column will yield the expected value.

Vertex		Chance of Selecting	Charisma	Contribution
1		1/7			2		2/7
2		1/7			2		2/7
3		1/7			2		2/7
4		1/7			4		4/7
5		1/7			3		3/7
6		1/7			3		3/7
7		1/7			2		2/7	    
Total							18/7
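
If you’d rather let a computer do the bookkeeping, here’s a minimal Python sketch that reproduces the total above from the charisma column (the variable names are just for illustration):

    from fractions import Fraction

    # Charisma (number of friends) of each vertex in Graph 1, taken from the table above.
    charisma = {1: 2, 2: 2, 3: 2, 4: 4, 5: 3, 6: 3, 7: 2}

    # Process 1: every vertex is selected with probability 1/7, so the expected
    # charisma is just the probability-weighted sum of the charisma column.
    ev_p1 = sum(Fraction(1, len(charisma)) * c for c in charisma.values())
    print(ev_p1, float(ev_p1))  # prints: 18/7 2.571...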

So the expected value is 18/7. This works out to about 2.57. Now let’s see about…

Process 2:

This is a bit trickier. First we are selecting a random vertex (the “initial vertex”, vi), and then we are selecting one of its friends (the “final vertex”, vf). We are interested in the charisma of the final vertex. So we need to figure out the probability of each vertex being selected as the final vertex.

In other words, we wish to assess the probabilities p(vf = 1) … p(vf = 7). To do this we will need the product and sum rules of probability theory. Given two propositions A and B, the product rule states that p(A ∧ B) = p(A) * p(B|A), where p(A ∧ B) indicates the probability of A and B both being true (called the conjunction of A and B), p(A) is the probability of A, and p(B|A) is the probability of B given A (or B conditional on A).

Thus, for instance, p(vi = x ∧ vf = y) = p(vi = x) * p(vf = y | vi = x). Using this rule, we can construct a table indicating the probability of each pair of initial and final vertices.

First, we need to determine the probability of each possible value for vi. These are the values p(vi = 1) … p(vi = 7). This is easy; they’re all 1/7.

Initial Vertex		Chance of Selecting
1			1/7
2			1/7
3			1/7
4			1/7
5			1/7
6			1/7
7			1/7

Now we need to determine the conditional probabilities: those of the form p(vf = y | vi = x). These will be 0 for any vertex that is not a friend of vi, and all friends of vi are equally likely to be selected as vf.

For instance, vertex 1 has two friends: 2 and 3. If vertex 1 is selected as vi, then vertices 2 and 3 each have a probability of 1/2 of being selected as vf. Here’s the complete table, along with the graph, for reference.

[Graph 1, shown again for reference]
					vf
	vi	1	2	3	4	5	6	7  
	1  	0	1/2	1/2	0	0	0	0
	2	1/2	0	0	1/2	0	0	0
	3	1/2	0	0	1/2	0	0	0
	4	0	1/4	1/4	0	1/4	1/4	0
	5	0	0	0	1/3	0	1/3	1/3
	6	0	0	0	1/3	1/3	0	1/3
	7	0	0	0	0	1/2	1/2	0

It will now be easy to determine the probability of each conjunction, simply by multiplying the conditional probabilities by 1/7. Then, because vi = 1 … vi = 7 are mutually exclusive and exhaustive propositions, we can add up the values in each column to obtain the probability of each vertex being selected as vf.

				vf
vi	1	2	3	4	5	6	7   
1  	0	1/14	1/14	0	0	0	0
2	1/14	0	0	1/14	0	0	0
3	1/14	0	0	1/14	0	0	0
4	0	1/28	1/28	0	1/28	1/28	0
5	0	0	0	1/21	0	1/21	1/21
6	0	0	0	1/21	1/21	0	1/21
7	0	0	0	0	1/14	1/14	0   
Total:	1/7	3/28	3/28	5/21	13/84	13/84	2/21
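
As a sanity check, here’s a short Python sketch that rebuilds this table from the adjacency lists of Graph 1 (read off the conditional-probability table above), multiplies each row by 1/7, and sums the columns:

    from fractions import Fraction

    # Adjacency lists for Graph 1, read off the conditional-probability table above.
    friends = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5, 6],
               5: [4, 6, 7], 6: [4, 5, 7], 7: [5, 6]}

    n = len(friends)
    p_final = {v: Fraction(0) for v in friends}

    for vi, nbrs in friends.items():
        for vf in nbrs:
            # p(vi = x ∧ vf = y) = p(vi = x) * p(vf = y | vi = x) = (1/7) * (1/#friends(x))
            p_final[vf] += Fraction(1, n) * Fraction(1, len(nbrs))

    for v in sorted(p_final):
        print(v, p_final[v])         # 1/7, 3/28, 3/28, 5/21, 13/84, 13/84, 2/21
    print(sum(p_final.values()))     # 1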

Now we have a probability distribution for vf. Process 2 returns the charisma of vf, so we can compute the expected value of this process with one more table:

Vertex		Chance of Selecting	Charisma	Contribution
1		1/7			2		2/7
2		3/28			2		3/14
3		3/28			2		3/14
4		5/21			4		20/21
5		13/84			3		13/28
6		13/84			3		13/28
7		2/21			2		4/21	    
Total							39/14

So the expected value here is 39/14, which is about 2.79. The value for the first process was 2.57. This agrees with our hypothesis.
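
For an empirical check, here’s a small Monte Carlo sketch, assuming the same adjacency lists as in the tables above; with enough samples the two averages come out near 2.57 and 2.79. Exact fractions aren’t needed here, since we only want to see which average is bigger.

    import random

    # Adjacency lists for Graph 1 (same as in the tables above).
    friends = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5, 6],
               5: [4, 6, 7], 6: [4, 5, 7], 7: [5, 6]}
    vertices = list(friends)

    def process_1():
        """Pick a random vertex and return its charisma."""
        v = random.choice(vertices)
        return len(friends[v])

    def process_2():
        """Pick a random vertex, then a random friend of it, and return the friend's charisma."""
        v = random.choice(vertices)
        f = random.choice(friends[v])
        return len(friends[f])

    N = 100_000
    print(sum(process_1() for _ in range(N)) / N)  # about 2.57 (= 18/7)
    print(sum(process_2() for _ in range(N)) / N)  # about 2.79 (= 39/14)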

If I’m still not making sense, maybe another example will help:

[Flower graph: one central vertex joined to each of eight outer vertices]

Process 1:

We select a random node and look at its charisma. We have a probability of 1/9 of selecting the central vertex, in which case the charisma will be 8. We have a probability of 8/9 of selecting one of the outer vertices, which will have a charisma of 1. Thus the expected charisma is (1/9)*8 + (8/9)*1 = 16/9.

Process 2:

We have a probability of 1/9 of selecting the central vertex as vi, in which case vf will be one of the outer vertices, which will have a charisma of 1. We have a probability of 8/9 of selecting one of the outer vertices as vi, in which case vf will be the central vertex, with charisma 8. Thus the expected charisma is (1/9)*1 + (8/9)*8 = 65/9, which is much greater than 16/9, the expected value we obtained from process 1.

So with process 1, we’ll usually pick one of the outer vertices, which has a low charisma. With process 2, we’ll usually start at one of the outer vertices, and then move to the central vertex, which has a high charisma. This is why the expected value of process 2 is larger.
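
Here’s the same computation as a tiny Python sketch, for a flower with any number of outer vertices (star_evs is just an illustrative name, not anything from the post):

    from fractions import Fraction

    def star_evs(n_outer):
        """Exact EVs of processes 1 and 2 on a 'flower': one center joined to n_outer outer vertices."""
        n = n_outer + 1                     # total number of vertices
        # Process 1: the center (charisma n_outer) with probability 1/n, an outer vertex (charisma 1) otherwise.
        ev1 = Fraction(1, n) * n_outer + Fraction(n_outer, n) * 1
        # Process 2: starting at the center we always land on an outer vertex (charisma 1);
        # starting at an outer vertex we always land on the center (charisma n_outer).
        ev2 = Fraction(1, n) * 1 + Fraction(n_outer, n) * n_outer
        return ev1, ev2

    ev1, ev2 = star_evs(8)
    print(ev1, ev2)   # 16/9 65/9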

Maybe you’ve noticed that the expected value of process 1 is always twice the number of edges divided by the number of vertices: Graph 1 has 9 edges and 7 vertices, giving 18/7, and the flower graph has 8 edges and 9 vertices, giving 16/9. This will soon turn out to be important for…

The Proof:

Let G = (V, E) be a finite graph such that each vertex lies on at least one edge. If x is a vertex, then let friends(x) indicate the set of vertices which share an edge with x. Note that it is necessary to require each vertex to have at least one friend in order to define process 2.

I will be using the pound sign (#) to indicate the cardinality of a set. Thus #V is the number of vertices in the graph, #E is the number of edges, and #friends(x) is the charisma of vertex x.

Let’s look at process 1. The probability of selecting any given vertex is 1/#V, so the expected value of the process is:

EV(P1) = Σ_{x∈V} (1/#V) * #friends(x)

We can factor 1/#V out of the sum, obtaining:

EV(P1) = (1/#V) Σ_{x∈V} #friends(x)

Since each edge is counted twice (e.g. if x and y are friends, the friendship adds 1 to the charisma of both x and y), the sum is simply twice the number of edges, so

EV(P1) = 2 * #E / #V

Which can be written as

EV(P1) = (1/#V) Σ_{{x,y}∈E} 2

That is, each edge contributes 2/#V to the expected value.
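
Here’s a quick check of that edge-counting step on Graph 1, as a minimal sketch:

    # Adjacency lists for Graph 1 (as in the tables above).
    friends = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5, 6],
               5: [4, 6, 7], 6: [4, 5, 7], 7: [5, 6]}

    # Each edge {x, y} appears in two adjacency lists; store it once as a frozenset.
    edges = {frozenset((x, y)) for x, nbrs in friends.items() for y in nbrs}

    degree_sum = sum(len(nbrs) for nbrs in friends.values())
    print(degree_sum, 2 * len(edges))   # 18 18 -- the sum of charismas is twice the number of edges
    print(degree_sum / len(friends))    # 2.571..., i.e. EV(P1) = 2 * #E / #V = 18/7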

Now let’s look at process 2. If our initial vertex is x, then our final vertex will be some y ∈ friends(x), and its charisma will be #friends(y). Once we select x as our initial vertex, each friend of x has a probability of 1/#friends(x) of being selected as the final vertex. So the expected value of process 2 given that x is the initial vertex is:

EV(P2 | vi = x) = Σ_{y∈friends(x)} #friends(y) / #friends(x)

Each vertex has a probability of 1/#V of being selected as vi, so the expected value becomes:

EV(P2) = (1/#V) Σ_{x∈V} Σ_{y∈friends(x)} #friends(y) / #friends(x)

Now we want to rewrite this as a sum over edges instead of a sum over vertices. Let’s consider the edge e = {x, y}. Process 2 involves crossing an edge from the initial to the final vertex. What is the probability that the edge crossed will be e, and if we do cross e, what value should we expect the process to return?

There are two ways this can happen. First, we could start on vertex x and cross the edge to vertex y. The probability of this is 1 / (#V * #friends(x)). The resulting charisma would be #friends(y), so this possibility contributes (1/#V) * (#friends(y) / #friends(x)) to the EV of the process.

Alternatively, we could start on vertex y and cross the edge to vertex x. This would contribute (1/#V) * (#friends(x) / #friends(y)) to the EV of the process. Summing over all edges, we obtain:

EV(P2) = (1/#V) Σ_{{x,y}∈E} [#friends(x) / #friends(y) + #friends(y) / #friends(x)]

Compare this to:

EV(P1) = (1/#V) Σ_{{x,y}∈E} 2
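
Before tackling the inequality, here’s a short sketch that evaluates both the vertex form and the edge form of EV(P2) on Graph 1 and confirms that they agree with the 39/14 from the tables (the helper d is just illustrative):

    from fractions import Fraction

    friends = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5, 6],
               5: [4, 6, 7], 6: [4, 5, 7], 7: [5, 6]}
    n = len(friends)
    edges = {frozenset((x, y)) for x, nbrs in friends.items() for y in nbrs}

    def d(v):
        """Charisma, i.e. #friends(v)."""
        return Fraction(len(friends[v]))

    # Vertex form: (1/#V) * sum over x of sum over y in friends(x) of #friends(y)/#friends(x).
    ev_vertex_form = sum(Fraction(1, n) * d(y) / d(x) for x in friends for y in friends[x])

    # Edge form: each edge {x, y} contributes (1/#V) * (d(x)/d(y) + d(y)/d(x)).
    ev_edge_form = sum(Fraction(1, n) * (d(x) / d(y) + d(y) / d(x))
                       for x, y in (tuple(e) for e in edges))

    print(ev_vertex_form, ev_edge_form)   # 39/14 39/14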

Now we wish to show that EV(P1) ≤ EV(P2). That is, we want to show that:

(1/#V) Σ_{{x,y}∈E} 2 ≤ (1/#V) Σ_{{x,y}∈E} [#friends(x) / #friends(y) + #friends(y) / #friends(x)]

Multiplying both sides by #V, we get:

Σ_{{x,y}∈E} 2 ≤ Σ_{{x,y}∈E} [#friends(x) / #friends(y) + #friends(y) / #friends(x)]

Since both sums range over the same set of edges, it is sufficient to show that each term on the left side is less than or equal to the corresponding term on the right side. That is, to show that

2 ≤ #friends(x) / #friends(y) + #friends(y) / #friends(x)

for all x and y. For conciseness, let a = #friends(x) and let b = #friends(y). Then we need to show that

a/b + b/a ≥ 2

Multiplying both sides by ab (which is positive, since every vertex has at least one friend), we obtain:

a² + b² ≥ 2ab

a² + b² − 2ab ≥ 0

(a − b)² ≥ 0

…which we know to be true, because the square of a real number is never negative. Since each of these steps is reversible, the inequality we wanted follows, and we’re done.

Now What?

Now we know that you are less charismatic than your friends, at least if you’re an average guy like me. More precisely, we know that the expected charisma of a randomly-selected person is ≤ that of a randomly-selected friend of a random person.
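
If you want to test this on graphs of your own, here’s a hedged Python sketch that generates small random graphs and compares the two expected values exactly; the function names here are just for illustration. You might also try tweaking it to hunt for graphs where the two values come out exactly equal.

    import itertools
    import random
    from fractions import Fraction

    def expected_values(friends):
        """Exact expected values of process 1 and process 2 for adjacency lists `friends`."""
        n = len(friends)
        ev1 = sum(Fraction(len(nbrs), n) for nbrs in friends.values())
        ev2 = sum(Fraction(1, n) * Fraction(len(friends[y]), len(friends[x]))
                  for x in friends for y in friends[x])
        return ev1, ev2

    def random_graph(n, p):
        """Each of the n*(n-1)/2 possible friendships is present with probability p."""
        friends = {v: [] for v in range(n)}
        for x, y in itertools.combinations(range(n), 2):
            if random.random() < p:
                friends[x].append(y)
                friends[y].append(x)
        return friends

    for _ in range(1000):
        g = random_graph(8, 0.4)
        if any(not nbrs for nbrs in g.values()):
            continue        # process 2 is undefined if someone has no friends
        ev1, ev2 = expected_values(g)
        assert ev1 <= ev2   # the inequality we just proved
    print("no counterexamples found")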

Did you have fun? Because here are a couple of puzzles for you to think about:

  • Under what conditions will the expected values of the two processes be exactly equal?
  • Can you think of any other phenomena similar to this one?

Okay, thanks for tuning in! See you next time.
