Ever since the first games of chance were invented people have been searching for ways to beat the odds. Twenty-One (also known as Blackjack) and Roulette yielded in the 1960s and 1970s respectively: Twenty-One to card counting, Roulette to physics. However, there is another game which its players hope one day may also fall - or at least partially give way - and that is the lottery.
There are as many predictive systems and methods as there are lottery players. Dozens of books and computer programs have been written in an attempt to predict lottery numbers. The truth is, none of the methods they describe or use work with anything approaching the hoped-for reliability. The only people who profit from such schemes are the authors.
And so the holy grail of gambling remains undiscovered, or so it would seem. What I am going to describe here is a method which I believe comes closer than anything else to being able to forecast the results of any lottery draw. Rather than a system developed from scratch, it is the application of an existing mathematical method to lottery results: Recurrence Analysis (also referred to as recurrence quantification analysis). First, some history.
Prior to the twentieth century, the observable world had been studied using the physics of Isaac Newton and the geometry of Euclid. It appeared that the entire universe was nothing more than a giant machine, a cosmic clock which could be described with complete accuracy using the known mathematical laws. Scientists believed the whole of reality could be encapsulated in a mere handful of well-behaved theories and equations. The first real cracks in this rigid, linear edifice appeared in the late 19th century with the great French mathematician Henri Poincaré and his work on dynamical systems. In his 1908 book Science and Method, Poincaré considered what is now known as sensitivity to initial conditions.
The next important discovery came in 1905 with the publication of Einstein's Special Theory of Relativity, followed by General Relativity in 1915. Matter and energy were interchangeable, gravity was curved space, and time ran at different rates for observers moving at different velocities. The dawning of quantum mechanics (which Einstein also helped to bring about with his description of the photoelectric effect) spelt the end of the two-hundred-year dominance of Newtonian laws.
Even in pure mathematics, "pathological" functions - functions that are continuous but nowhere differentiable - had been known since the late 19th century. Such functions were considered bizarre but unimportant curiosities, and ignored for decades. Another unexplored area was the iteration of functions; prior to the development of the electronic computer, such a task was unremittingly laborious. In 1963 the American mathematician and meteorologist Edward Lorenz caught the first glimpse into the nature of chaos when running a simple computer simulation of the weather (a thermal convection model). His system was composed of three differential equations, and its attractor now bears his name. Successive runs of the model diverged from each other due to rounding errors - the first recorded observation of sensitivity to initial conditions in a computer model.
The next major attack on linearity came in 1975, when the Polish-born French mathematician Benoit Mandelbrot published Les objets fractals: forme, hasard et dimension, revised and republished in 1982 as the celebrated The Fractal Geometry of Nature. In 1981, the year before that revision appeared, Dutch mathematician Floris Takens published his Delay Embedding Theorem. The floodgates opened on nonlinear research. A whole range of natural phenomena for which linear laws had been inadequate could now be studied and described. Building on Takens' work, J.-P. Eckmann, S. O. Kamphorst and D. Ruelle published Recurrence plots of dynamical systems in 1987, followed by Charles Webber and Joseph Zbilut's Embeddings and delays as derived from quantification of recurrence plots in 1992.
Recurrence analysis is now one of the major tools in the study of noisy or chaotic dynamical systems. It is used on a wide variety of natural physical phenomena (and one man-made system: the stock market). Webber and Zbilut used recurrence analysis to study cardiac anomalies, which eventually led to an electronic means of predicting heart attacks up to fifteen minutes before they occurred.
Probability theory allows us to calculate the odds of something happening, and statistics tells us about what has happened, but neither can tell us what will happen. Lottery results constitute a chaotic dynamical system, and as such require a nonlinear method for their prediction. This is where recurrence analysis comes into its own.
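To make the idea of a recurrence plot concrete, the sketch below embeds a series using time delays and marks every pair of embedded points that lie within a chosen distance of each other. It is a minimal illustration in Python, not the code of any particular package; the function names (`delay_embed`, `recurrence_matrix`, `recurrence_rate`), the threshold `eps` and the toy series are all assumptions made for this example.

```python
import math

def delay_embed(series, dimension, delay):
    """Build delay vectors (x[t], x[t+delay], ..., x[t+(dimension-1)*delay])."""
    n_vectors = len(series) - (dimension - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series too short for this dimension and delay")
    return [tuple(series[t + k * delay] for k in range(dimension))
            for t in range(n_vectors)]

def recurrence_matrix(series, dimension, delay, eps):
    """R[i][j] is True when embedded points i and j lie within eps (Euclidean)."""
    vectors = delay_embed(series, dimension, delay)
    return [[math.dist(a, b) <= eps for b in vectors] for a in vectors]

def recurrence_rate(matrix):
    """Fraction of recurrent pairs, counting the trivial main diagonal."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)

# Made-up toy series for illustration only; these are not real draw results.
toy_series = [3, 11, 19, 27, 38, 44, 5, 12, 21, 30, 39, 47, 2, 10, 18, 29, 37, 45]
R = recurrence_matrix(toy_series, dimension=3, delay=1, eps=3.0)
print(f"{len(R)} embedded points, recurrence rate {recurrence_rate(R):.3f}")
```

Plotting the True entries of `R` as dots gives the recurrence plot; diagonal line structures in it are what recurrence quantification measures summarise.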
With the data loaded into Visual Recurrence Analysis (VRA), you will be presented with a phase space portrait (the values of dimension, delay, distances and colours can be manipulated to reveal structure using the buttons and sliders on the right):
Although at this point you can use VRA's 'Mutual Information' and 'False Nearest Neighbours' screens to determine delay and dimension, it is simpler and quicker to go directly to the prediction screen and experiment with the various parameters there (in fact, this trial-and-error approach has proved the quickest way to build up tables of values of dimension and delay required for lottery prediction.)
Click on 'Analysis', then 'General Nonlinear Analysis', then 'Time Series Prediction'. The following screen requires some explanation. First, set 'Training set' to 'Static'. The training data runs from the first point in the data up to the last. After this, set the prediction range. For a lottery draw of six numbers, set this for six numbers ahead of the last training data point.
In 'Model Options', set the 'Type' to 'Multi-step'. Next, choose the Predictor; Nearest Neighbour tends to be the most stable and reliable model for lottery numbers. After this, choose 'Distance' (this is a geometric measure between vectors); Euclidean measure has so far proved the best. Tick the 'Single neighbour from each orbit' box.
Set 'Dimension' and 'Delay'. VRA default values are 3 and 10. Lower the delay to 5 to start off with (a low value for dimension is desirable, because as the dimension increases the size of the data set required to fill the state space increases exponentially.) In 'Neighbourhood' click on '# of neighbours'. Finally, click on 'Start Predictions'. The red line shows the predicted values. These are read off from the vertical axis on the left (labelled 'Actual & Predicted'):
Reading off the graph, the predicted numbers are 8, 25, 34, 38, 39 and 46. The prediction graph can be saved (click on 'Save Chart') and loaded into a graphics editor such as MS Paint for magnification, as it can sometimes be difficult reading the exact numbers off the vertical axis. VRA also has a built-in zoom function: hold down the left mouse button and drag the pointer over the area of interest. To zoom out drag in the opposite direction (bottom right to top left.)
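For readers who want to see what such settings amount to in code, here is a minimal Python sketch of a multi-step nearest-neighbour forecast on delay vectors with Euclidean distance. It is a generic textbook method written under stated assumptions, not VRA's actual implementation: the function names `nn_predict_next` and `nn_forecast` are hypothetical, and the 'Single neighbour from each orbit' option is not reproduced.

```python
import math

def nn_predict_next(series, dimension, delay, n_neighbours):
    """Predict the next value as the mean successor of the nearest delay vectors."""
    span = (dimension - 1) * delay
    last = len(series) - 1
    if last <= span:
        raise ValueError("series too short for this dimension and delay")

    def vector(t):
        # Delay vector ending at time t: (x[t], x[t-delay], ...).
        return [series[t - k * delay] for k in range(dimension)]

    query = vector(last)
    # Candidates are all earlier vectors whose successor is already known.
    candidates = sorted((math.dist(vector(t), query), t) for t in range(span, last))
    nearest = candidates[:n_neighbours]
    return sum(series[t + 1] for _, t in nearest) / len(nearest)

def nn_forecast(series, steps, dimension=3, delay=5, n_neighbours=1):
    """Multi-step forecast: each prediction is appended and fed back in."""
    extended = list(series)
    for _ in range(steps):
        extended.append(nn_predict_next(extended, dimension, delay, n_neighbours))
    return extended[len(series):]

# Synthetic repeating series for illustration only; not real draw results.
toy_series = [1, 2, 3, 4, 5, 6] * 10
print(nn_forecast(toy_series, steps=6))
```

The forecast values are real numbers, so for lottery use they would need rounding to the nearest whole ball number, much as values are read off the vertical axis of the chart.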
Occasionally VRA generates an anomalous prediction, one in which a number is repeated, such as in the following prediction for a EuroMillions draw:
The values are 8, 8, 10, 47, 49, with 'lucky stars' 1 and 7. The value 8 has appeared twice! The prediction is therefore one number short of the required seven numbers (five main balls plus two lucky stars.) In such a case I simply assume that VRA had intended to generate two consecutive numbers, and so the second 8 is taken to be 9.
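The repair rule just described can be written down in a few lines. This is a sketch of that rule only, with a hypothetical function name (`repair_duplicates`); it assumes the main numbers are already in ascending order, it bumps further values if the correction itself causes a clash (an assumption beyond the text), and it does not check the upper limit of the game.

```python
def repair_duplicates(numbers):
    """Replace a repeated value with the next consecutive number.

    Assumes `numbers` is in ascending order. Any value that does not
    exceed the one before it is raised to the previous value plus one.
    """
    repaired = []
    for n in numbers:
        if repaired and n <= repaired[-1]:
            n = repaired[-1] + 1
        repaired.append(n)
    return repaired

main_balls = repair_duplicates([8, 8, 10, 47, 49])
print(main_balls)
```

The lucky stars are drawn from a separate pool, so they would be repaired separately from the main numbers.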
So far VRA can accurately predict up to three numbers on a fairly regular basis, though three numbers only wins you £10 in the UK main draw. I use results after they have been placed into ascending numerical order. This introduces an extra degree of order, thereby improving the accuracy of VRA's predictions. By going back over previous lottery results one draw at a time and trying different values of dimension and delay, a table of values can be built up to see which are the commonest (this process is called 'postdicting', as opposed to predicting). See the Thunderball analysis page for an example.
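The postdicting procedure is done by hand in VRA, but its logic can be sketched as a small search over dimension and delay. The Python below is an illustration under assumptions: the names `forecast` and `postdict` are hypothetical, the predictor is a plain single-nearest-neighbour model rather than VRA's, each draw is flattened into one long series, and the data are a synthetic repeating pattern, not real results.

```python
import math
from itertools import product

def _vector(data, t, dimension, delay):
    """Delay vector ending at time t."""
    return [data[t - k * delay] for k in range(dimension)]

def forecast(series, steps, dimension, delay):
    """Single-nearest-neighbour multi-step forecast (generic sketch)."""
    data = list(series)
    span = (dimension - 1) * delay
    for _ in range(steps):
        last = len(data) - 1
        query = _vector(data, last, dimension, delay)
        best = min(range(span, last),
                   key=lambda t: math.dist(_vector(data, t, dimension, delay), query))
        data.append(data[best + 1])
    return data[len(series):]

def postdict(draws, dimensions, delays, n_tests):
    """Total matched numbers over the last n_tests draws for each (dimension, delay)."""
    scores = {}
    for dimension, delay in product(dimensions, delays):
        hits = 0
        for i in range(len(draws) - n_tests, len(draws)):
            # Train only on draws that came before draw i.
            history = [n for draw in draws[:i] for n in draw]
            predicted = {round(v) for v in forecast(history, len(draws[i]), dimension, delay)}
            hits += len(predicted & set(draws[i]))
        scores[(dimension, delay)] = hits
    return scores

# Synthetic alternating draws for illustration only; not real results.
toy_draws = [[2, 9, 17, 23, 31, 40], [5, 12, 20, 28, 36, 44]] * 6
table = postdict(toy_draws, dimensions=[2, 3], delays=[1, 2], n_tests=3)
for (dimension, delay), hits in sorted(table.items()):
    print(f"dimension={dimension} delay={delay} hits={hits}")
```

On real results the scores would differ from pair to pair, and the pairs with the most hits are the ones that would go into the table of values.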
Can I justify my use of recurrence analysis? Yes. I consider the various books and programs for lottery prediction to be worthless. Once in a while the systems they present might get lucky and deliver an accurate prediction, but that's the point: it won't be a genuine prediction, only a lucky guess. True prediction should be more reliable than that.
Some people use the concepts of "hot" and "cold" numbers, both of which are flawed (balls do not wait their turn to be chosen.) Over time, the distribution of drawn numbers will level out (it might take hundreds or even thousands of draws to do so, but eventually it will, unless the draw machine is biased in some way.) Too many systems and methods simply look at the numbers, rather than the underlying process: the draw machine itself. The numbers are there only to keep track of the balls; you could have any set of symbols printed on them. What matters is how the balls behave, not what is on them. The motions of the balls are ultimately determined by the laws of physics, and constrained by the design of the draw machine. With the exception of quantum-level phenomena, no physical system is completely unpredictable. Recurrence analysis effectively provides a statistical interpretation of the underlying physics.
Over time, the physical dimensions and features of the draw machine impose a degree of repetition on the way the balls move. I call this signature "characteristic behaviour", something which all machines have. Barring component failure, over time a given machine will exhibit the same behaviour (excluding scientific curiosities like the Lorenz waterwheel, whose behaviour is chaotic.) Different designs of lottery machine will have subtly different characteristic behaviours. These are exhibited by patterns in the balls, which recurrence analysis can detect.
One objection to this approach is that a set of machines rather than a single machine is used to draw the numbers. However, I believe mechanical differences between machines of the same design are so slight that they can safely be ignored. Besides, it is not possible to know before each draw which machine is being used, so the assumption has to be made that it makes too little difference to be worth considering.
Can characteristic behaviour be quantified? Yes, and an entire branch of engineering is devoted to measuring the operating characteristics of machines. If you graph the behaviour of components over time such as velocity, frequency, and power, patterns emerge (the balls in a lottery machine can even be considered as another set of components). One equation I have developed for lottery machines uses six parameters:
Dimensions, d : the size of a lottery machine chamber, given by d = length x breadth x height
Shape, s : a measure of the shape of a lottery machine chamber; 1 for a sphere, 2 for a cylinder, 3 for a cube.
Rotor arms, a : the number of arms on the rotor (a = 0 if the machine uses a blower to mix balls).
Rotor speed, r : the maximum speed at which the rotor turns (again, r = 0 if the machine uses a blower).
Ball columns, b : the number of columns into which balls are arranged prior to being released.
Mixing time, t : the time which balls are mixed for before the first one is drawn.
The formula for characteristic behaviour is: Cb = d + s + a + r + b + t
This formula should yield a unique value for any given lottery machine design. After a recurrence model has been found for a given lottery machine, its Cb value can then be calculated. For a new design of lottery machine, its Cb value can be calculated first, then compared to those of pre-existing machines. It should then be possible to determine which recurrence parameters might work best for the new machine, avoiding the task of trying dozens of values of dimension and delay.
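As a worked illustration of the formula, the Python sketch below computes Cb from the six parameters and finds the known machine with the closest value. The class and function names (`DrawMachine`, `closest_known_machine`) and every number in the example are hypothetical; the text does not specify units, so the example simply assumes one consistent set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DrawMachine:
    length: float        # chamber length (units assumed, not given in the text)
    breadth: float       # chamber breadth
    height: float        # chamber height
    shape: int           # s: 1 sphere, 2 cylinder, 3 cube
    rotor_arms: int      # a: 0 if the machine uses a blower
    rotor_speed: float   # r: 0 if the machine uses a blower
    ball_columns: int    # b
    mixing_time: float   # t

    def characteristic_behaviour(self):
        """Cb = d + s + a + r + b + t, with d = length x breadth x height."""
        d = self.length * self.breadth * self.height
        return (d + self.shape + self.rotor_arms + self.rotor_speed
                + self.ball_columns + self.mixing_time)

def closest_known_machine(cb, known):
    """Return the name in `known` (name -> Cb) whose Cb is nearest to `cb`."""
    return min(known, key=lambda name: abs(known[name] - cb))

# Hypothetical blower machine with made-up parameters.
new_machine = DrawMachine(1.0, 1.0, 1.0, shape=3, rotor_arms=0, rotor_speed=0.0,
                          ball_columns=6, mixing_time=30.0)
cb = new_machine.characteristic_behaviour()
print(cb, closest_known_machine(cb, {"machine A": 35.0, "machine B": 80.0}))
```

The recurrence parameters already found for the closest known machine would then serve as the starting point for the new one.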
Another way to choose numbers would be to use a computer simulation of a lottery machine. A first approximation would be to model only the motions of the balls. Each ball would have its own subroutine, to handle its x,y,z coordinates and velocity. The physics engine would call each ball routine in turn and look for collisions, updating each routine accordingly. I suggest a time step of 1/100 of a second. The program would look something like this:
1 Initialise variables: time, positions, velocities, ball count
2 Set balls in motion; start timer
3 Call each ball subroutine in turn; check for collisions; miss out drawn balls
4 Increment timer by 1/100 second
5 Draw a ball after a specified time; decrement ball count
6 Repeat from 3 until all required balls have been drawn
To keep the model simple, two assumptions have to be made: that the balls are point masses, and that collisions between them are elastic (the kinetic energy of the system is the same before and after a collision; none is converted into other forms).
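The six steps above can be turned into a short Python sketch. It is a first approximation under several assumptions that go beyond the text: the chamber is a cube, all balls have equal mass, each ball is given a small collision radius (true point masses would never meet), there is no gravity or blower, and the ball nearest the floor is the one drawn. The names `collide` and `simulate_draw` and all constants are hypothetical.

```python
import random

DT = 0.01       # time step of 1/100 second, as suggested in the text
SIZE = 1.0      # side of the cubic chamber (assumed, arbitrary units)
RADIUS = 0.03   # collision radius (assumed)

def collide(p1, v1, p2, v2):
    """Elastic collision of equal masses: swap velocity along the line of centres."""
    d = [p2[i] - p1[i] for i in range(3)]
    dist2 = sum(c * c for c in d)
    if dist2 == 0.0 or dist2 > (2 * RADIUS) ** 2:
        return
    closing = sum((v1[i] - v2[i]) * d[i] for i in range(3))
    if closing <= 0:          # already separating
        return
    for i in range(3):
        change = closing * d[i] / dist2
        v1[i] -= change
        v2[i] += change

def simulate_draw(n_balls=49, n_drawn=6, mix_time=1.0, draw_interval=0.2, seed=0):
    rng = random.Random(seed)
    # Steps 1 and 2: initialise positions and velocities, start the timer.
    pos = {b: [rng.uniform(RADIUS, SIZE - RADIUS) for _ in range(3)]
           for b in range(1, n_balls + 1)}
    vel = {b: [rng.uniform(-1.0, 1.0) for _ in range(3)] for b in pos}
    drawn, steps, next_draw = [], 0, mix_time
    while len(drawn) < n_drawn:              # step 6: repeat until done
        # Step 3: move every ball still in play and bounce it off the walls.
        for b in pos:
            for axis in range(3):
                pos[b][axis] += vel[b][axis] * DT
                if not RADIUS <= pos[b][axis] <= SIZE - RADIUS:
                    vel[b][axis] = -vel[b][axis]
                    pos[b][axis] = min(max(pos[b][axis], RADIUS), SIZE - RADIUS)
        balls = list(pos)
        for i, a in enumerate(balls):
            for b in balls[i + 1:]:
                collide(pos[a], vel[a], pos[b], vel[b])
        steps += 1                           # step 4: advance the timer
        if steps * DT >= next_draw:          # step 5: draw a ball
            chosen = min(pos, key=lambda b: pos[b][2])
            drawn.append(chosen)
            del pos[chosen], vel[chosen]     # drawn balls are missed out from now on
            next_draw += draw_interval
    return drawn

if __name__ == "__main__":
    print(simulate_draw())
```

Because the random seed fixes the starting positions and velocities, the same seed always gives the same draw, which makes the sensitivity of the result to small changes in the initial conditions easy to explore.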
Although recurrence analysis provides a statistical view of the underlying physics of the lottery machine, do the lottery numbers have an attractor? From my work so far the answer appears to be yes, and it is most likely a strange attractor (in fact, it is the lottery draw machine that generates the attractor; the form of the attractor is ultimately determined by the characteristics of the machine).
Several other questions remain regarding lottery prediction, some of which I am starting to find answers to:
Would analysing the results by column have any effect on accuracy? Single columns appear to give the worst prediction rate. Splitting results into groups of three columns gives more accurate results than six or seven columns taken together (see the Thunderball analysis page). There are several ways to arrange lottery data, and some do work better than others (note that VRA itself loads in data as a single string of numbers regardless of the arrangement).
Do different lottery games have different values of dimension and delay? The answer is probably yes; different designs of lottery machine will have different characteristic behaviours, and therefore different dimension and delay values.
What is the theoretical maximum number of accurate predictions? There is nothing to say all six numbers cannot be predicted (though it won't happen very often). The frequency of regularly accurate predictions will decrease the more numbers you try to predict in one go, so it's better to try for three or four numbers than all six.
Do the values of dimension and delay for a given lottery change over time? If the same draw method is used regularly, then ultimately dimension and delay should remain the same over time.
Do lottery results exhibit stationarity? (That is, do the values of statistical measures like mean, variance, and standard deviation remain the same over time?) The question of stationarity is one of the most important. Put simply, a system exhibiting nonstationarity is more unpredictable over the long term than one with stationarity (though it may be predictable over short periods). If the lottery results are non-stationary, the values of dimension and delay will gradually change over time (and perhaps even the type of Predictor and Distance measure). Again, see the Thunderball analysis page for some good empirical evidence of stationarity.
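A simple first check for stationarity is to cut the series into consecutive blocks and compare the mean and variance of each block. The Python sketch below does this; it is an informal check rather than a formal statistical test, the function name `block_statistics` and the block size are assumptions, and the two series are synthetic.

```python
import statistics

def block_statistics(series, block_size):
    """Return (mean, population variance) for each consecutive full block."""
    blocks = [series[i:i + block_size]
              for i in range(0, len(series) - block_size + 1, block_size)]
    return [(statistics.mean(b), statistics.pvariance(b)) for b in blocks]

# Synthetic series for illustration only; not real draw results.
steady = [1, 2, 3, 4, 5, 6] * 4      # same statistics in every block
drifting = list(range(24))           # mean rises from block to block

for name, series in (("steady", steady), ("drifting", drifting)):
    means = [m for m, _ in block_statistics(series, 6)]
    print(name, "block means:", means, "spread:", max(means) - min(means))
```

Block means and variances that stay close to each other are consistent with stationarity; a steady drift in either suggests the values of dimension and delay may need revisiting over time. The standard deviation is the square root of the variance, so it adds no separate information here.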
I don't claim that using recurrence analysis will guarantee a win, but I do claim it will improve your chances. I first downloaded VRA in late August 2001, after surfing the web for interesting mathematics software. After reading the documentation I realised here was a way to do what no one else had done, and predict the unpredictable (though it has taken five years to do it, most of which was spent waiting to collect enough data for VRA to make reliable predictions with). My current computer is a Packard Bell 3.06 GHz Pentium 4 with 1 GB RAM, Radeon Sapphire X1650 Pro graphics card (512 MB), 40 GB and 120 GB hard drives, running Windows XP.
You can contact me at: email@example.com
I present here the sources I use in my work: books, links to sites about chaos and nonlinear science, recurrence analysis, mathematics, software, and lotteries:
An Introduction to Computational Physics by Tao Pang, Cambridge University Press, ISBN 0-521-48592-4
The Eudaemonic Pie by Thomas Bass, Authors Guild Backinprint, ISBN 0-595-14236-2 (not a technical work, but an enjoyable read)
Nonlinear Time Series Analysis by Holger Kantz and Thomas Schreiber, Cambridge University Press, ISBN 0-521-65387-8
Nonlinear Time Series Analysis by Cees Diks, World Scientific, ISBN 981-02-3505-4
Details of the book Chaos and Time-Series Analysis by Dr. Julien Sprott
Chaos, fractal, and nonlinear research
Chaos at the University of Maryland
Chaos: An Interdisciplinary Journal of Nonlinear Science
The Chaos Hypertext Book
Czech research paper on deterministic chaos (in English)
Deterministic Chaos Group
Glossary of terms used in time series analysis of cardiovascular data
Fractals and Chaos: links and images
Glossary of time series analysis terms
Interpreting recurrence plots
Lexicon of Chaos Theory terms
Math Archives list of nonlinear dynamics sites
The Nonlinearity and Complexity Home Page
The Nonlinear Science FAQ
Recurrence plots and cross recurrence plots
Spanky Fractals: articles, images and software
Wikipedia entry about attractors
Wikipedia entry about recurrence quantification analysis
Wikipedia entry about Takens' theorem
Archive of various maths and physics research papers
Automatic Calculus and Algebra: solve problems online
Citebase: search engine for research papers and publications
Dynamic Directory: various maths resources, including software
Formulae for machine reliability
Math Archives: huge list of mathematics resources
Names for large numbers
Scirus: search engine for research papers and publications
University of Southampton work on lottery numbers
Wolfram math world
Analysis and Visualization of Nonlinear Time Sequences
Better Nonlinear Models from Noisy Data: Attractors with Maximum Likelihood
Chaos or Noise - Difficulties of Distinction
Delay Embeddings for Forced Systems: Stochastic Forcing
Detecting Dynamical Nonstationarity in Time Series Data
E101 Chaos course paper 4-1
E101 Chaos course paper 5-6
Global Reconstruction from Nonstationary Data
Improved Correlation Dimension Calculation
Influence of Observational Noise on Recurrence Quantification Analysis
Interdisciplinary Application of Nonlinear Time Series Methods
Inverting chaos: Extracting system parameters from experimental data
Models for Time Series
Nonlinear Science FAQ
Predictability: A Way to Characterize Complexity
Reconstruction of Dynamical and Geometrical Properties of Chaotic Attractors
Recovering Smooth Dynamics from Time Series with the Aid of Recurrence Plots
Recurrence Analysis for Detecting Nonstationarity and Chaos
Recurrence Plot Statistics and the Effect of Embedding
Recurrence Plots and Unstable Periodic Orbits
Recurrence Plots in Nonlinear Time Series Analysis: Free Software
Royal Statistical Society 2004 report on the UK lottery
Space Time-Index Plots for Probing Dynamical Nonstationarity
Statistics for Continuity and Differentiability
Visual Recurrence Analysis as an Alternative Framework for Time Series Characterisation
Visualization and Detection of Coupling in Time Series by Order Recurrence Plots
Wavelet Reconstruction of Nonlinear Dynamics
Dataplore (version 2.2-2), comprehensive data analysis package
Download page for Analysis and Visualization of Time Sequences (AVTS)
Download page for Chaoscope (version 0.2.1), generates strange attractors
Download 2.2 Mb zip file of Eugene Kononov's Visual Recurrence Analysis (version 5.01, May 2007)
Dr. Charles Webber's Recurrence Quantification Analysis (version 9.1)
Dr. Julien Sprott's software site
Fractan (version 4.3), a program to carry out fractal analysis of time series data
Fractint (version 20.0), the best fractal program available
List of time series analysis software
Lottery wheeling program
Mathematical/computational software archive
Time Series Analysis, TISEAN (version 2.1)
Wolfram Research, the home of Mathematica