There are no positive integers x,m,y,n,z,r satisfying the equation
x^{m} + y^{n} = z^{r} where m,n,r > 2 and x,y,z are co-prime (that is, gcd(x,y) = gcd(y,z) = gcd(x,z) = 1).
There is a $75,000 prize for the first proof or disproof of the conjecture. The conjecture is obviously related to Fermat's Last Theorem, which was proved true by Andrew Wiles in 1994. A wide array of sophisticated mathematical techniques could be used in the attempt to prove the conjecture true (and the majority of mathematicians competent to judge seem to believe that it likely is true).
Less sophisticated mathematics and computer programming can be used to try to prove the conjecture false by finding a counterexample. This page documents my progress in this direction.
There are two things that make this non-trivial. First, we quickly get beyond the range of 32- or 64-bit integers, so any program will need a way of dealing with arbitrary-precision integers. Second, we need to search a very large space: there are six variables, and the only constraint on them is that the equation must hold.
Suppose we wanted to search all bases (x,y,z) up to 100,000 and all powers (m,n,r) up to 10. Then the first problem is that we will be manipulating integers up to 100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 (that is, 10^{50}). And we will be dealing with 1,000,000,000,000,000,000 (that is, 10^{18}) potential combinations of integers.
The "large integer" problem requires either a language that supports such integers directly, such as Lisp or Python, or a package that adds the functionality, such as the NTL package for C++.
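In Python, for instance, arbitrary-precision integers are built in, so the magnitudes above need no special handling. A quick sanity check of the numbers from the previous paragraph (Python 3 syntax):

```python
# Python integers have arbitrary precision, so the largest value we
# would store needs no special library support.
n = 100000 ** 10           # the largest z**r in the proposed search
assert n == 10 ** 50       # 100,000^10 really is 10^50
assert len(str(n)) == 51   # a 51-digit number, far beyond 64-bit range

# Number of (x, y, z, m, n, r) combinations for bases up to 100,000,
# treating each of the three powers as having about 10 choices:
combinations = 100000 ** 3 * 10 ** 3
assert combinations == 10 ** 18
```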
The "many combinations" problem requires a clever search algorithm. As a start, we can cut the space in half (without losing any solutions) by only considering combinations with y less than or equal to x. This helps a little, getting us down to 0.5 * 10^{18} combinations. The next step is to eliminate the need to enumerate z, r at all. The idea is that we can precompute all z^{r} values and store them in a hash table. Then when we form an x^{m} + y^{n} sum, we just look it up in the table, rather than enumerating all z, r pairs. This gets us down to 0.5 * 10^{12} combinations, at the cost of having to store 10^{6} large integers. Both the storage and the projected computation time are now within the range of what I can expect to handle on my home PC.
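The precompute-and-lookup idea can be sketched in a few lines of Python 3 (tiny illustrative limits and variable names of my own choosing; the co-primality test is deliberately omitted here so that the lookup finds some verifiable non-co-prime identities):

```python
max_base, max_power = 20, 6

# Precompute table[z**r] = r; one dictionary lookup then replaces the
# whole inner enumeration of (z, r) pairs.
table = {z ** r: r
         for z in range(1, max_base + 1)
         for r in range(3, max_power + 1)}

hits = []
for x in range(1, max_base + 1):
    for y in range(1, x + 1):               # y <= x halves the space
        for m in range(3, max_power + 1):
            for n in range(3, max_power + 1):
                s = x ** m + y ** n
                if s in table:              # is the sum a perfect power?
                    hits.append((x, m, y, n, table[s]))
```

Every hit found at these limits shares a common factor (for example 2^3 + 2^3 = 2^4, and 6^3 + 3^3 = 3^5), so none is a counterexample; the real search additionally requires gcd(x, y) = 1.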
Now it's time to implement the program. I chose Python as the implementation language because it happened to be installed already on the machine I had. (I also had Lisp installed, but it was an evaluation copy limited in the amount of memory it could use, so I couldn't apply it to this problem.) Python is the slowest of these choices in execution speed, but it let me program a solution in only a few hours.
The Python program is very straightforward: enumerate all x,y,m,n combinations, and for each one check whether the sum x^{m} + y^{n} is in the table. Initial testing on small search spaces was rewarding: it took only 3.3 minutes to verify that there are no solutions with x,y,m,n less than 100. I was then ready to scale up, but first I made some minor improvements to the algorithm: instead of looking up the sum in the table with table.get(x**m + y**n), I moved each part of the calculation out of as many loops as possible, and pre-computed as much as possible. In other words, I changed from the original:
```python
def beal(max_base, max_power):
    bases, powers, table = initial_data(max_base, max_power)
    for x in bases:
        for y in bases:
            if y > x or gcd(x,y) > 1: continue
            for m in powers:
                for n in powers:
                    sum = x**m + y**n
                    r = table.get(sum)
                    if r: report(x, m, y, n, nth_root(sum, r), r)
```
to the version with the repeated exponentiations hoisted out of the inner loops:

```python
def beal(max_base, max_power):
    bases, powers, table, pow = initial_data(max_base, max_power)
    for x in bases:
        powx = pow[x]
        for y in bases:
            if y > x or gcd(x,y) > 1: continue
            powy = pow[y]
            for m in powers:
                xm = powx[m]
                for n in powers:
                    sum = xm + powy[n]
                    r = table.get(sum)
                    if r: report(x, m, y, n, nth_root(sum, r), r)
```
```
beal(   100,  100) took   3.3 mins  (   9K elements in table)
beal(   100, 1000) took  19.3 hours (  92K elements in table)
beal(  1000,  100) took   6.2 hours (  95K elements in table)
beal( 10000,   30) took    52 hours ( 278K elements in table)
beal( 10000,  100) took   933 hours ( 974K elements in table)
beal( 50000,   10) took   109 hours ( 399K elements in table)
beal(100000,   10) took   445 hours ( 798K elements in table)
beal(250000,    7) took  1323 hours (1249K elements in table)
```

To put it another way: in the following table, a cell containing a run time indicates a completed search that found no solution involving b^{p} for any base and power up to the given values; a "-" marks a combination subsumed by a completed larger search; a "?" means we don't know yet. Run times are given in minutes (m), hours (h) or days (d).
| | p=7 | p=10 | p=30 | p=100 | p=1000 |
|---|---|---|---|---|---|
| b=100 | - | - | - | 3m | 19h |
| b=1,000 | - | - | - | 6h | ? |
| b=10,000 | - | - | 2d | 39d | ? |
| b=100,000 | - | 18d | ? | ? | ? |
| b=250,000 | 55d | ? | ? | ? | ? |
How do we know the program is correct? We don't know for sure, but removing the "or gcd(x,y) > 1" test prints some non-co-prime solutions that can be verified by hand, which suggests that the rest of the machinery is working.
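One such hand-checkable non-co-prime identity, for anyone who wants to repeat the check (plain arithmetic, nothing from the program itself):

```python
from math import gcd

# 3^3 + 6^3 = 27 + 216 = 243 = 3^5: a valid power identity that the
# search reports once the co-primality filter is removed...
assert 3 ** 3 + 6 ** 3 == 3 ** 5
# ...but gcd(3, 6) = 3, so it does not threaten the conjecture.
assert gcd(3, 6) == 3
```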
```python
"""Print counterexamples to Beal's conjecture (http://www.bealconjecture.com).
That is, find positive integers x,m,y,n,z,r such that:
    x^m + y^n = z^r  and  m,n,r > 2  and  x,y,z co-prime (pairwise no common factor).

ALGORITHM: Initialize the variables table, pow, bases, powers such that:
    pow[z][r] = z**r
    table.get(sum) = r if there is a z such that z**r = sum.
    bases = [1, 2, ... max_base]
    powers = [3, 4, ... max_power]
Then enumerate x,y,m,n, and do table.get(pow[x][m]+pow[y][n]).
If we get something back, report it as a result.
We consider all values of x,y,z in bases, and all values of m,n,r in powers.
"""

def beal(max_base, max_power):
    bases, powers, table, pow = initial_data(max_base, max_power)
    for x in bases:
        powx = pow[x]
        if x % 1000 == 0: print 'x =', x
        for y in bases:
            if y > x or gcd(x,y) > 1: continue
            powy = pow[y]
            for m in powers:
                xm = powx[m]
                for n in powers:
                    sum = xm + powy[n]
                    r = table.get(sum)
                    if r: report(x, m, y, n, nth_root(sum, r), r)

def initial_data(max_base, max_power):
    bases = range(1, max_base+1)
    powers = range(3, max_power+1)
    table = {}
    pow = [None] * (max_base+1)
    for z in bases:
        pow[z] = [None] * (max_power+1)
        for r in powers:
            zr = long(z) ** r
            pow[z][r] = zr
            table[zr] = r
    print 'initialized %d table elements' % len(table)
    return bases, powers, table, pow

def report(x, m, y, n, z, r):
    x, y, z = map(long, (x, y, z))
    assert min(x, y, z) > 0 and min(m, n, r) > 2
    assert x ** m + y ** n == z ** r
    print '%d ^ %d + %d ^ %d = %d ^ %d = %s' % \
          (x, m, y, n, z, r, z**r)
    if gcd(x,y) == gcd(x,z) == gcd(y, z) == 1:
        raise 'a ruckus: SOLUTION!!'

def gcd(x, y):
    while x:
        x, y = y % x, x
    return y

def nth_root(base, n):
    return long(round(base ** (1.0/n)))

import time

def timer(b, p):
    start = time.clock()
    beal(b, p)
    secs = time.clock() - start
    return {'secs': secs, 'mins': secs/60, 'hrs': secs/60/60}
```
I think the obvious next step is to re-implement the algorithm in C for speed (probably with a special-purpose hash table implementation tuned for the problem), and with arithmetic done modulo 2^{32} (or maybe 2^{64}). When we get a solution modulo 2^{32}, it may or may not be a real solution, but we could print it to a file and afterwards run a simple Python or Lisp program to determine if it is a real solution. This approach will probably run about 100 times faster, and require about 1/5th to 1/10th the storage, so it can take us out to maybe 1,000,000^{20}.
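The filter-then-verify idea can be sketched in Python 3 (function names are my own; the point is the shape of the algorithm, not the speed a C version would get):

```python
MOD = 2 ** 64

def modpow_table(max_base, max_power):
    """Map (z**r) % 2^64 -> (z, r): fixed-width keys instead of big
    integers. (A collision here simply overwrites an entry; a production
    version would keep a list of (z, r) pairs per residue.)"""
    table = {}
    for z in range(1, max_base + 1):
        for r in range(3, max_power + 1):
            table[pow(z, r, MOD)] = (z, r)
    return table

def search(max_base, max_power):
    table = modpow_table(max_base, max_power)
    for x in range(1, max_base + 1):
        for y in range(1, x + 1):
            for m in range(3, max_power + 1):
                for n in range(3, max_power + 1):
                    s = (pow(x, m, MOD) + pow(y, n, MOD)) % MOD
                    if s in table:
                        z, r = table[s]
                        # A match mod 2^64 may be a false positive:
                        # confirm with exact big-integer arithmetic.
                        if x ** m + y ** n == z ** r:
                            yield (x, m, y, n, z, r)
```

In C the modular arithmetic comes for free from unsigned 64-bit overflow; only the rare candidates that survive the filter need exact big-integer checking afterwards.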
Beyond that, we need a new approach. My first idea is to only do a slice of the z^{r} values at a time. This would require switching from an approach that limits the bases and powers to one that limits the minimum and maximum sum searched for. That is, we would call something like beal2(10L ** 20, 10L ** 50) to search for all solutions with sum between 10^{20} and 10^{50}. The program would build a table of all z^{r} values in that range, and then carefully enumerate x,m,y,n to stay within that range. One could then take this program and add a protocol like SETI@home where interested people could download the code, be assigned min and max values to search on, and a large network of people could together search for solutions.
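A sketch of what that slice-based interface might look like (the beal2 name comes from the paragraph above, but all the bounds logic here is my own assumption, untested at scale):

```python
def beal2(min_sum, max_sum, max_power=100):
    """Search for x**m + y**n = z**r with the sum confined to
    [min_sum, max_sum]; only z**r values in that slice are stored."""
    table = {}
    z = 2
    while z ** 3 <= max_sum:                  # smallest allowed power is 3
        for r in range(3, max_power + 1):
            zr = z ** r
            if zr > max_sum:
                break
            if zr >= min_sum:
                table[zr] = (z, r)
        z += 1
    # Enumerate x, m, y, n with the sum constrained to the slice.
    x = 1
    while x ** 3 <= max_sum:
        for m in range(3, max_power + 1):
            xm = x ** m
            if xm > max_sum:
                break
            y = 1
            while y <= x and xm + y ** 3 <= max_sum:
                for n in range(3, max_power + 1):
                    s = xm + y ** n
                    if s > max_sum:
                        break
                    if s >= min_sum and s in table:
                        z2, r = table[s]
                        yield (x, m, y, n, z2, r)
                y += 1
        x += 1
```

At small limits this finds the usual non-co-prime identities inside the slice, for example beal2(100, 1000) reports 6^3 + 3^3 = 3^5 and 4^3 + 4^3 = 2^7; a distributed version would hand each worker a disjoint [min_sum, max_sum] range.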
Ultimately, I think that if there are any counterexamples at all, they will probably be found through generating functions based on elliptic curves, or some other such approach. But that's too much math for me; I'm just trying to apply the minimal amount of computer programming to look for the "obvious" counterexamples.