Advanced NFL Stats

Jan 31, 2009

Super Bowl XLIII Prediction

Win probabilities for Super Bowl XLIII are listed below. More info after the jump.

P_win	Super Bowl XLIII	P_win
0.69	PIT at ARI	0.31

The probabilities are based on an efficiency win model explained here and here with some modifications. The model considers offensive and defensive efficiency stats including running, passing, sacks, turnover rates, and penalty rates. Team stats are adjusted for previous opponent strength.

The probabilities based on full regular season statistics would be 0.74 to 0.26 in favor of Pittsburgh. The efficiency stats of each opponent are listed in the table below.

I was going to do a write-up of each facet of the match-up, but I think the table says a thousand words. The one thing I will point out, however, is Pittsburgh's pass defense. It gives up only 4.3 yards per drop back, and almost 4% of all passes are intercepted. Since the 2002 season, the next best pass defense was the '02 Buccaneers, who gave up 4.5 yds per attempt. Baltimore's '03 defense was third, giving up 4.8 yds per attempt. Pittsburgh's mark is three standard deviations better than the mean. In comparison, Arizona's passing offense is 1.4 standard deviations better than the mean. And that's why the Steelers are a strong favorite.

TEAM	OPASS	ORUN	OINT %	OFUM %	DPASS	DRUN	DINT %	PENRATE
ARI	7.1	3.5	2.4	2.8	6.5	4.0	2.5	0.39
PIT	5.9	3.7	3.0	2.6	4.3	3.3	3.8	0.41
NFL Avg	6.1	4.2	2.8	2.3	6.2	4.2	2.8	0.36

Weekly Roundup 1/29/09

The guys at Cold Hard Football Facts think they've found evidence vindicating the 'frequent running causes winning' fallacy. But I don't think they did at all. Yes, teams that run more often in the Super Bowl (and all other games) also tend to be the teams that win the game. But we all know that it's the lead that allows all the rushing. Take the 2000 Ravens-Giants Super Bowl. The Ravens ran 33 times compared to 15 for the Giants. But 40% of Baltimore's runs were in the 4th quarter after they already had a 24-7 lead. Oddly, the article addresses the correlation-causation fallacy, but then just says "it's up for debate."

Football Outsiders looks at all the silly prop bets available for the Super Bowl. My favorite is the one about Matt Millen picking the winner in the pre-game show. I have to think that there is so much negativity surrounding that guy that there is an arbitrage opportunity there. (By the way, I noticed FO has been banned from Google. They must have been caught gaming the search algorithms.)

Speaking of silly betting, here is PFR's Super Bowl squares post. And if you're playing SB squares, you'll probably want to keep an eye on the win probability site. The probability of a current drive ending in a TD or FG is available in real-time.

PFR also responds to an article from the Community site asking whether all 10-point leads are equal. Basically, the question is does a 30-20 lead have the same win probability as a 13-3 lead? The answer is no, they're not exactly the same. The 13-3 lead is slightly safer. But it really depends on home field advantage and relative team strength more than whether it's a 30-20 or 13-3 type lead. Pretty interesting, and this kind of stuff has direct applications for the win probability engine.

I wonder who the Derek Jeters of football are?

The Patriots are already 6 to 1 favorites to win the Super Bowl next year.

The Numbers Guy looks at why QBs almost always get named the Super Bowl MVP. A bunch of football stat-heads, including myself, toss in their 2 cents. The author wanted to know if there was a statistical way to isolate the contribution of a single player in a game. My idea was actually "n-player cooperative stability equilibria," but thankfully it was translated into "a market-based approach."

An article from Slate.com about the overtime rules likes the "field position auction" idea. Phil Birnbaum agrees. I think ideas like those are clever and effective solutions, but the NFL is unlikely to make changes that veer too far from tradition.

The problem with overtime isn't really the coin flip, it's the incredible range and accuracy of modern kickers. The entire sport has been slowly warped into fieldgoal-ball. Overtime is just where the problem becomes most obvious. I'd suggest solving the issue by 1) narrowing the goal posts, and 2) moving the kickoff line back to the 40 for overtime.

Jan 30, 2009

Shameless Plug for Win Probability Site

Amaze your friends. Wow your family. Dazzle your co-workers. Confuse your brother-in-law.

Don't forget to fire up the Super Bowl XLIII in-game win probabilities this Sunday. Them: "Oh man, they'll never come back from that lead!" You: "Actually, they have a 15% chance of coming back" Them: "Shut up, you smart ass. Why do you have to suck the fun out of everything?"

Actually, that could be a pretty good catch line for my site. "Advanced NFL Stats: Sucking all the fun out of football since 2007!"

And as a bonus, for every click on the win probability site Sunday, a portion of the proceeds will go to the Get a Steeler Fan His G.E.D. Fund.

Jan 29, 2009

Super Bowl Winner Stats

Last year I looked at how often teams won playoff games based on their season-long performance in various categories. I basically looked at how predictive various stats are in forecasting playoff wins. For example, how often does the team with the better passing efficiency win? I learned some interesting things such as how important run defense appears to become in the playoffs.

This time around I looked at just Super Bowls. How often does the team with the better season-long performance in each category win the big game? The sample size is very small, but the Super Bowl is a unique game in many ways, so we might learn something.

I only looked at Super Bowls since Super Bowl XV, the 1980 game between the Raiders and Eagles. In 1978, the passing rules changed and significantly altered the sport, but the league did not adjust for another couple years. Yes, this shrinks the sample to just 28 games, but I’m not going for statistical conclusions, just an initial look to see if anything stands out. No fancy regressions this time, just straightforward percentages.

The results are in the table below. You can read it as saying “the team with the better [efficiency stat] won the Super Bowl [x%] of the time.”

Stat	Win %
O Pass	50
O Run	57
O Int	61
D Pass	61
D Run	46
D Int	54

I’m a little surprised that the team with the better offensive passing efficiency only won 50% of the time. I’d think that would be a fairly solid advantage. Defensive passing looks like it might be the more important category.

Also, defensive run efficiency doesn’t appear to hold the same importance that it has in the playoffs lately. The better run-stopping team only won 46% of the time.

But again, we can’t really draw any conclusions. There are only 28 games in the sample, so a single game swings the percentage by about 3.6 points.
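For a sense of scale, here is a quick back-of-the-envelope check in Python. The 28-game count and the 61% figure come from the table above; the binomial standard error is a textbook approximation added here for illustration and is not part of the original analysis.

```python
# Each Super Bowl in the sample is one of 28 games (count from the text).
games = 28

# Flipping a single result moves a category's win percentage by 1/28.
swing_pct = 100 / games

# Rough binomial standard error around an observed 61% rate (illustrative only).
p = 0.61
std_err_pct = 100 * (p * (1 - p) / games) ** 0.5

print(f"one game moves the rate by {swing_pct:.1f} points")
print(f"standard error near 61% is roughly {std_err_pct:.1f} points")
```

With an uncertainty that wide, a 61% category and a 50% category are hard to tell apart.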

In case you’re curious, the Steelers have the advantage in offensive running and all the defensive categories. The Cardinals have the better offensive passing efficiency and the lower interception rate.

Jan 27, 2009

How the Model Works--A Detailed Example Part 2

This is a continuation of an article that details exactly how my predictions and rankings are derived. You can read part 1 here. To recap, I'm using the Super Bowl match-up between the Steelers and Cardinals as an example. So far, we've used a logistic regression model based on team efficiency stats to estimate the probability each team will win.

We haven't accounted for strength of schedule yet. For example, the Steelers may have the NFL's best run defense, yielding only 3.3 yds per rush. But is that because they're good or because their opponents happened to have poor running games?

To adjust for opponent strength, we'll first need to calculate each team’s generic win probability (GWP), or the probability of winning a game against a notional league-average opponent at a neutral site. This would give us a good estimate of a team’s expected winning percentage based on their stats.

Since we already know each team’s logit components, all we need to know is the NFL-average logit. If we take the average efficiency stats and apply the model coefficients we get Logit (Avg) = -2.52.

Therefore, for the Cardinals, a game against a notional average opponent would look like:

Logit = Logit (ARI) – Logit (Avg)
= 0.07

The odds ratio is e^0.07 = 1.07. Arizona’s GWP is 0.52, just barely above average. If we do the same thing for Pittsburgh, we get a GWP of 0.73. And it’s easy enough to do for all 32 teams. In fact, that’s what we need to do for our next step in the process, which is to adjust for average opponent strength.
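As a rough sketch of that GWP arithmetic (not the site's actual code), the conversion from logits to probabilities can be written in a few lines of Python. The logit values are the rounded figures quoted in this article; the function and variable names are made up for the example.

```python
import math

def logit_to_prob(logit):
    """Convert a log-odds value to a probability via odds / (1 + odds)."""
    odds = math.exp(logit)
    return odds / (1 + odds)

# Rounded values quoted in the article.
LOGIT_AVG = -2.52
team_logits = {"ARI": -2.45, "PIT": -1.51}

# GWP: win probability against a notional league-average opponent at a neutral site.
gwp = {team: logit_to_prob(value - LOGIT_AVG) for team, value in team_logits.items()}

for team, prob in gwp.items():
    print(f"{team}: GWP = {prob:.2f}")
```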

The GWPs I calculated for Arizona and Pittsburgh were based on raw efficiency stats, unadjusted for opponent strength. That’s ok if we assume they had roughly the same strength of schedule. But often teams don’t, especially in the earlier weeks of the season.

To adjust for opponent strength, I could adjust each team’s efficiency stat according to the average opponents’ corresponding stat. In other words, I could adjust the Cardinals’ passing efficiency according to their opponents’ average defensive efficiency. I’d have to do that for all the stats in the model, which would be insanely complex. But I have a simpler method that produces the same results.

For each team, I average its to-date opponents’ GWP to measure strength of schedule. This season Arizona’s average opponent GWP was 0.51—essentially average. I can compute the average logit of Arizona’s opponents by reversing the process I’ve used so far.

The odds ratio for the Cardinals’ average opponent is 0.51/(1-0.51) = 1.04. The natural log of the odds ratio, or logit, is log(1.04) = 0.04. I can add that adjustment into the logit equation we used to get their original GWP.

Logit = Logit(ARI) – Logit(Avg) + 0.04
= 0.11

This makes the odds ratio e^0.11 = 1.12. Their GWP now becomes 0.53. If you think about it intuitively, this makes sense. Their unadjusted GWP was 0.52. They (apparently) had a slightly tougher schedule than average. So their true, underlying team strength should be slightly higher than we originally estimated.
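The same adjustment can be sketched in a few lines of Python. This is only an illustration using the rounded figures quoted above (an unadjusted logit of 0.07 and an average opponent GWP of 0.51); the helper names are made up for the example.

```python
import math

def prob_to_logit(p):
    """Log of the odds ratio for a probability p."""
    return math.log(p / (1 - p))

def logit_to_prob(x):
    """Inverse of prob_to_logit."""
    return 1 / (1 + math.exp(-x))

unadjusted_logit = 0.07  # Logit(ARI) - Logit(Avg), rounded
opp_gwp = 0.51           # Arizona's average opponent GWP, rounded

# Turn the opponents' average GWP back into a logit and add it in.
opp_logit = prob_to_logit(opp_gwp)
adjusted_gwp = logit_to_prob(unadjusted_logit + opp_logit)

print(f"opponent logit = {opp_logit:.3f}, adjusted GWP = {adjusted_gwp:.2f}")
```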

I said ‘apparently’ because now that we’ve adjusted each team’s GWP, that makes each team’s average opponent GWP different. So we have to repeat the process of averaging each team’s opponent GWP and redoing the logistic adjustment. I iterate this (usually 4 or 5 times) until the adjusted GWPs converge. In other words, they stop changing because each successive adjustment gets smaller as it zeroes in on the true value.
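Here is one way the repeat-until-stable loop might look in Python. It is a sketch under stated assumptions, not the site's actual code: the four teams, their raw logits, and the round-robin schedule are all invented for illustration, and the tolerance and the cap on passes are arbitrary choices of mine.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def adjust_for_opponents(raw_logit, schedule, tol=1e-6, max_passes=50):
    """Repeatedly re-adjust each team's GWP for its opponents' current GWPs.

    raw_logit: team -> (team logit minus league-average logit)
    schedule:  team -> list of opponents faced so far
    Returns (adjusted GWPs, number of passes used).
    """
    gwp = {team: inv_logit(x) for team, x in raw_logit.items()}
    for passes in range(1, max_passes + 1):
        updated = {}
        for team, x in raw_logit.items():
            opponents = schedule[team]
            opp_avg = sum(gwp[o] for o in opponents) / len(opponents)
            updated[team] = inv_logit(x + logit(opp_avg))
        biggest_change = max(abs(updated[t] - gwp[t]) for t in gwp)
        gwp = updated
        if biggest_change < tol:
            break  # successive adjustments have become negligible
    return gwp, passes

# Hypothetical four-team round-robin league (made-up numbers).
raw = {"A": 0.8, "B": 0.2, "C": -0.2, "D": -0.8}
sched = {team: [o for o in raw if o != team] for team in raw}

adjusted, passes_used = adjust_for_opponents(raw, sched)
for team, prob in adjusted.items():
    print(f"{team}: {prob:.3f}")
print("passes:", passes_used)
```

In this balanced toy league the loop settles within a dozen or so passes at a tight tolerance. With a looser tolerance, closer to two-decimal precision, it stops after only a few. The cap on passes is there as a safety net for lopsided schedules.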

Ultimately, Arizona’s opponent GWP is 0.50 and Pittsburgh’s is 0.53. After a full season of 16 games, strength of schedule tends to even out. But earlier in the season one team might have faced a schedule averaging 0.65 while another may have faced one averaging 0.35.

My hunch is that it’s this opponent adjustment technique that gives this model its accuracy. It’s easy enough to look at a team’s record or stats to intuitively assess how good it is, but it’s far more difficult to get a good grasp of how inflated or deflated its reputation may be due to the aggregate strength or weakness of its opponents.

Now that we’ve determined opponent adjustments, we can apply them to the game probability calculations. The full logit now becomes:

Logit = const + home field + (Team A logit + Team A Opp logit) –
(Team B logit + Team B Opp logit)

Pittsburgh’s opponent logit is log(0.53/(1-0.53)) = 0.10 and Arizona’s is log(0.50/(1-0.50)) = 0.01. The game logit including opponent adjustments is now:

Logit = -0.36 + 0.72/2 + (-2.45 + 0.01) - (-1.51 + 0.10)
= -1.02

The odds ratio is therefore e^-1.02, which makes the probability of Arizona winning 0.26. This estimate, based on opponent adjustments, is slightly lower than what we got for the unadjusted estimate. This makes sense because Arizona’s strength of schedule was basically average, and Pittsburgh’s was slightly tougher than average.
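A minimal Python sketch of the full game calculation follows. It uses the rounded team logits and opponent GWPs quoted in this article, so intermediate values can differ from the text in the second decimal place. The function name and the `site` argument are my own inventions for the example.

```python
import math

CONST = -0.36
HOME_FIELD = 0.72

def game_win_prob(logit_a, opp_gwp_a, logit_b, opp_gwp_b, site="neutral"):
    """Probability that Team A beats Team B, including opponent adjustments.

    site: "home" if Team A hosts, "away" if Team B hosts, "neutral" otherwise.
    """
    home_adj = {"home": HOME_FIELD, "neutral": HOME_FIELD / 2, "away": 0.0}[site]
    opp_a = math.log(opp_gwp_a / (1 - opp_gwp_a))
    opp_b = math.log(opp_gwp_b / (1 - opp_gwp_b))
    game_logit = CONST + home_adj + (logit_a + opp_a) - (logit_b + opp_b)
    return 1 / (1 + math.exp(-game_logit))

# Team A = Arizona, Team B = Pittsburgh, neutral site.
p_ari = game_win_prob(-2.45, 0.50, -1.51, 0.53)
print(f"ARI {p_ari:.2f}, PIT {1 - p_ari:.2f}")
```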

So there you have it, a complete estimate of Super Bowl XLIII probabilities and a step-by-step method of how I do it.

There are all kinds of variations to play around with. You can choose which weeks of stats to use, to overweight, or to ignore. You can calculate a team’s offensive GWP by holding its own defensive stats average in the calculations, and only adjusting for opponent defensive stats. The resulting OGWP tells us how a team would do on just the strength of its offense alone. It’s the generic win probability assuming the team had a league-average defense. DGWP is vice versa.

One variation I employ is to counter early-season overconfidence by adding a number of dummy weeks of league-average data to each team's stats. This regresses each team's stats to the league mean, which reduces the tendency for team stats to be extreme due to small sample size. For example, it takes about 6 weeks for a team's offensive run efficiency to stabilize near its ultimate season-long average. So at week 3, I'll add 3 games worth of purely average performance into each team's running efficiency stat. No team will sustain either 7.5 yds per rush or 2.2 yds per rush.
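The dummy-week idea can be sketched as a simple weighted average. This is an illustration only: it assumes every game carries equal weight (the real stat is per play, so actual weights would depend on play counts), and it borrows the 4.2 league-average yards per rush from the table in the Super Bowl post. The function name is hypothetical.

```python
def regress_to_mean(team_rate, games_played, league_rate, dummy_games):
    """Blend a team's observed rate with some games of league-average play."""
    total_games = games_played + dummy_games
    return (team_rate * games_played + league_rate * dummy_games) / total_games

LEAGUE_RUN_AVG = 4.2  # NFL average yds per rush from the table

# Week 3: three real games plus three games of purely average performance.
hot_start = regress_to_mean(7.5, 3, LEAGUE_RUN_AVG, 3)
cold_start = regress_to_mean(2.2, 3, LEAGUE_RUN_AVG, 3)

print(f"7.5 yds per rush is treated as {hot_start:.2f}")
print(f"2.2 yds per rush is treated as {cold_start:.2f}")
```

Both extremes get pulled halfway back toward the league average, which is the point of the adjustment.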

This entire process might seem ridiculously convoluted, but it’s actually pretty simple. You get the coefficients from the regression. You next calculate each team’s logit with simple arithmetic. Game probabilities and “GWP” are just a logarithm away. Opponent adjustments require a little more effort, but in the end, you just add them into the logit equation.

Voila--a completely objective, highly accurate NFL game prediction and team ranking system.

Jan 25, 2009

Weekly Roundup

I was blown away last week when within hours of posting the win probability calculator, reader Zach wrote up an analysis of when to go for a 2-point conversion. Very cool.

Jim Schwartz is the new head coach of the Lions. Besides being a fellow native of Baltimore, I like him because he's known to have a solid grasp of statistics. Like Bill Belichick, Schwartz has an economics degree. The New York Times has a good write up on him from last fall.

The new issue of the Journal of Quantitative Analysis in Sports is out. There's an article on ranking teams and predicting games, including in the NFL. I've only skimmed it. There are a couple of other articles that look interesting too. There is an article on determining the evenness of sports competitions in rugby, essentially doing the same thing--ranking and forecasting. There is also an article on using neural networks to predict NBA games. (I've experimented with neural network software. I can't say I completely understand it, but I was able to get close to the same prediction accuracy from my usual regression model.)

Sometimes the articles in JQAS are crackpot nonsense. So be warned--just because something has a fancy academic title, comes wrapped in a pretty .pdf, and is loaded with references, doesn't guarantee it has any value. These particular articles don't immediately jump out as kooky, thankfully.

Math and stats pay. Check out the top 3 jobs. Funny, I don't see Navy carrier pilot on the list. When I used to fly, I often wondered how much you'd have to pay someone to do that in an open and competitive market. Take away the "serving your country" aspect, and how much money would someone with those skills make? Throw in the danger and the fact that they have to live at sea for extended periods, and you might have to pay them like these guys.

The PFR blog has the usual installments of best-ever, worst-ever trivia. This time, it's best-ever Super Bowl losers (part 2). I'd like to see worst-ever Super Bowl winners too. [Edit: Here it is.] What kills me is that the two biggest championship upsets in American sports history feature an upstart second-fiddle team from New York beating an overwhelming favorite from Baltimore. The Mets upset the O's in '69, and the winter before, the Jets shocked Baltimore in Super Bowl III. I wasn't even born yet, and it still hurts. One thing forgotten about the Super Bowl back then is that it was more of an actual bowl game--a post-season exhibition. Baltimore had already won the NFL Championship. Back then, as I understand it, the Super Bowl was a cross between a meaningless Pro Bowl-type game and the modern championship as we now know it. Not totally meaningless, but not yet considered the championship either. The Jets certainly changed that.

PFR also has a new Super Bowl history page.

Smart Football teaches us about zone blitzes.

Dave Berri has his final rankings of the year, plus he looks at the Lions.

Over at the community site, Denis O'Regan compares scoring frequency in soccer and football using Poisson distributions. Also, Oberon Faelord (real name?) reminds us that not all 10-point leads are the same.

Since the Steelers beat my Ravens last Sunday to reach the Super Bowl, I'm allowed one outburst of sour grapes. When I was in the Navy, I noticed every part of the country seemed to have a sizable stable of Steeler fans. I remember going to watch a Steelers-Browns playoff game at a sports bar in Pensacola and couldn't believe how many fans of each team were there. And here in Northern Virginia, they're everywhere. Now I understand why. I think a lot of it is just bandwagon types from the 70s, but the economic dispersion of the rust-belt is also obviously part of the reason.

Jan 23, 2009

How the Model Works--A Detailed Example Part 1

One of the most common requests I get is to write up a complete sample game probability calculation. In this article, I'll explain how the model works and do a full detailed example using the upcoming Super Bowl between the Steelers and Cardinals.

When I originally constructed this model, the goal wasn’t to predict game outcomes but to identify how important the various phases of the game were compared to the others. In order to do that, I had to choose stats that were independent of the others, or at least as independent as possible.

There were several options, such as points scored and allowed, total yards, or first downs. But if I’m trying to measure the true strength of a team’s offensive passing game, passing touchdowns may not tell us much. A team may have a great defense that gives them good field position on most drives, or it might have a spectacular running back that can carry the offense into the red zone frequently. So points or touchdowns won’t work.

The other obvious option is total yards. But losing teams can accumulate lots of total passing yards late in a game’s “trash time.” Or a team can generate lots of pass yards simply because they pass more often. That really doesn’t tell us how good a team is at passing. Total rushing yards presents a similar problem. A team with a great passing game can build a huge lead through three quarters, and then run out the clock in the 4th quarter accumulating a lot of rushing yards.

First downs made or allowed tell us a lot about how good an offense or defense is, but they don’t tell us anything about the relative contributions of the running and passing game of a team.

So, the best choice is going to be efficiency stats. Net yards per pass attempt and yards per rush tell us how good a team truly is in those facets of the game. They are also largely independent of one another: not completely, but about as independent as possible.

Turnovers are also obviously critical. But total turnovers can be misleading just like total yards. Teams that pass infrequently may have few interceptions, but it may only be because they simply have fewer opportunities. So I also use interceptions per attempt, and fumbles per play.

So the model starts with team efficiency stats. But I don’t use all of them. For example, I throw out defensive fumble rate because although it helps explain past wins or losses, it doesn’t predict future games. A team’s defensive fumble rate is wildly inconsistent throughout a season, which suggests it’s very random or mostly due to an opponent’s ability to protect the ball. Forced fumbles and defensive interceptions show the same tendency. In the end, the model is based on:

Offensive net passing yds per att
Offensive rushing yds per att
Offensive interceptions per att
Offensive fumbles per play
Defensive net passing yds per att
Defensive rushing yds per att
Team penalty yds per play
Home field advantage

The model is a regression model, specifically a multivariate non-linear (logistic) regression. I know that sounds very technical, but the general idea behind regression is pretty intuitive. If you plotted a graph of a group of students’ SAT scores vs. their GPA, you’d see a rough diagonal line.

We can draw a line that estimates the relationship between SAT scores and GPA, and that line can be mathematically described with a slope and intercept. Here, we could say GPA = 1.5 + 2 * (test score).

Regression is what puts that line where it is. It draws a line that minimizes the error between the estimated GPA and the actual GPA of each case.

We can do the same thing with net passing efficiency and season wins. We can estimate season wins as Wins = -6.5 + 2.4*(off pass eff). Take the Cardinals this year. Their 7.1 net passing yds per attempt produces an estimate of 10.7 wins. They actually won 9, so it’s not a perfect system. We need to add more information, and that’s what multivariate regression can do.
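To make the line-fitting idea concrete, here is a small least-squares sketch in plain Python. The five data points are invented purely for illustration (they are not real team seasons), and the function name is hypothetical. The closed-form slope and intercept are the standard formulas for simple linear regression.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up seasons: net passing yds per attempt and wins.
pass_eff = [5.0, 5.5, 6.0, 6.5, 7.0]
wins = [5, 8, 7, 10, 10]

intercept, slope = fit_line(pass_eff, wins)
print(f"Wins = {intercept:.1f} + {slope:.1f} * (off pass eff)")

# Estimated wins for a team at 7.1 yds per attempt, using the toy fit.
print(f"estimate at 7.1: {intercept + slope * 7.1:.1f}")
```

The fitted line does not pass through every point. It is simply the one that makes the squared misses as small as possible in total.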

Multivariate regression works the same way but is based on more than one predictor variable. Using both offensive and defensive pass efficiency as predictors, we get:

Wins = 9.6 + 2.3*(off pass eff) – 2.6*(def pass eff)

For the Cardinals, whose defensive pass efficiency was 6.5 yds per att in 2008, we get an estimate of 9.4 wins.

Adding the rest of the efficiency stats to the regression, we can improve the estimates even further. Unfortunately, linear regression, like we just used, can sometimes give us bad results. A team with the best stats imaginable would still only win 16 games in a season, but a linear regression might tell us they should win 21. Additionally, linear regression can estimate things like the total season wins, but it can’t estimate the chances of one team beating another. That’s where non-linear regression comes in.

Non-linear regression, like the logistic regression I use, is best used for dichotomous outcomes such as win or lose. A logistic regression model can estimate the probabilities of one outcome or the other based on input variables. It does this by using a logarithmic transformation, which is a fancy way to say taking the log of everything before doing all the computations. After computing the model and its output just as you would with linear regression, you “undo” the logarithm by taking the natural exponent of the result. Technically, logistic regression produces the “log of the odds ratio.” The odds ratio is the familiar “3 to 1” odds used at the race track, which can be translated into a probability of 0.75 (to 0.25).
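The relationship between odds, log-odds, and probability is easy to check with a short Python sketch (the function names are mine). It also shows why this form suits win-or-lose outcomes: however extreme the log-odds, the result stays between 0 and 1.

```python
import math

def odds_to_prob(odds):
    """'3 to 1' odds correspond to a probability of 3 / (1 + 3)."""
    return odds / (1 + odds)

def log_odds_to_prob(log_odds):
    """Undo the logarithm with the natural exponent, then convert odds."""
    return odds_to_prob(math.exp(log_odds))

three_to_one = odds_to_prob(3.0)
print(f"3 to 1 odds: probability {three_to_one:.2f}")

# Any log-odds value maps to a probability strictly between 0 and 1.
probs = [log_odds_to_prob(x) for x in (-5, -1, 0, 1, 5)]
print([round(p, 3) for p in probs])
```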

Logistic regression would be useful if, instead of predicting GPA, you wanted to predict a student’s probability of graduation. Graduation is a yes-or-no dichotomous outcome, and winning an NFL game is no different. We can use the efficiency stats that we already know contribute to winning to estimate the chances one team beats another.

As an example, let’s compute the probability each opponent will win the upcoming Super Bowl based on offensive rushing efficiency alone. Based on the regular season game outcomes from 2002-2007, the regression output tells us that the intercept is zero and the coefficient of rushing efficiency is 0.25. The model can be written:

Log(odds ratio) = 0 + 0.25*(ARI off run eff) – 0.25*(PIT off run eff)
= 0.25*(3.46) – 0.25*(3.67)
= -0.052

The odds ratio would be e^-0.052 = 0.95. In other words, based on offensive running alone, the odds Arizona wins would be 0.95 to 1. In probability terms, this is 0.49, giving Pittsburgh the slightest edge. Another way of saying this is, holding all other factors equal, Pittsburgh’s advantage in rushing efficiency gives them just a 51% chance of winning.

[Note: You can translate odds ratios into probabilities by using prob = odds/(1+odds).]
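The rushing-only example works out as follows in Python. This is a small sketch using the coefficient and the two rushing averages quoted above; the variable names are made up.

```python
import math

RUN_COEF = 0.25                # coefficient of offensive rushing efficiency
ari_run, pit_run = 3.46, 3.67  # yds per rush

# Log of the odds ratio that Arizona wins, based on rushing alone.
log_odds = RUN_COEF * ari_run - RUN_COEF * pit_run

odds = math.exp(log_odds)
p_ari = odds / (1 + odds)

print(f"log odds = {log_odds:.3f}, odds = {odds:.2f}")
print(f"ARI {p_ari:.2f}, PIT {1 - p_ari:.2f}")
```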

Now we can do the same thing, but with the full list of predictor variables. The independent “input” variables are the efficiency stats for each team, and the dependent variable is the dichotomous outcome of each game—either 1 for a win or 0 for a loss. My handy regression software tells us that the model coefficients come out as:

Coefficient	Value
Constant	-0.36
Home Field	0.72
O Pass	0.46
O Run	0.25
O Int	-19.4
O Fum	-19.4
D Pass	-0.62
D Run	-0.25
Pen Rate	-1.53

The “logit,” or the change in the log of the odds ratio, can be written as:

Logit = const + home field + Team A logit - Team B logit

Logit = -0.36 + 0.72 + 0.46*(team A off pass eff) + 0.25*(team A off run eff) +...
- 0.46*(team B off pass eff) – 0.25*(team B off run eff) - …

We have the constant, the home field advantage adjustment, and the sum of the products of each team’s coefficients and stats. The equation will eventually tell us Team A’s odds of winning, so we add its component logit and we subtract Team B’s. If Team A is the home team, we add the home field adjustment (0.72 * 1). If not, we can leave it out (0.72 * 0).

Now let’s look at Arizona and Pittsburgh in terms of their probability of winning Super Bowl XLIII. I’ll compute both teams’ logit component, combine them in the overall logit equation, then convert it to probabilities. To keep things simple, I’m going to only use team stats from the regular season for this example.

Arizona’s logit component would be:

Logit(ARI) = 0.46*7.1 + 0.25*3.5 – 19.4*0.024 – 19.4*0.028 – 0.62*6.5 – 0.25*4.0 – 1.53*0.39
= -2.45

Pittsburgh’s logit component would be:

Logit(PIT) = 0.46*6.0 + 0.25*3.7 – 19.4*0.030 – 19.4*0.026 – 0.62*4.3 – 0.25*3.3 – 1.53*0.41
= -1.51

Because the Super Bowl is at a neutral site, I’ll only add half of the home field adjustment when I combine the full equation.

Logit = -0.36 + 0.72/2 - 2.45 + 1.51
= -0.93

Therefore the odds ratio is e^-0.93 = 0.39. That makes the probability of Arizona beating Pittsburgh at a neutral site equal to 0.39/(1+0.39) = 0.28. Pittsburgh’s corresponding probability would be 0.72.

(Notice how the constant and the home field adjustment cancel out to zero for a neutral site.)
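For anyone who wants to replay the arithmetic, here is a Python sketch that rebuilds both team logits from the coefficient table and the stats used in the two equations above. It is an illustration, not the site's code, and the dictionary keys are invented. Because the printed stats and coefficients are rounded, the sketch lands within a few hundredths of the quoted logits and about a hundredth of the quoted probability rather than exactly on them.

```python
import math

CONST = -0.36
HOME_FIELD = 0.72
COEF = {
    "o_pass": 0.46, "o_run": 0.25, "o_int": -19.4, "o_fum": -19.4,
    "d_pass": -0.62, "d_run": -0.25, "pen": -1.53,
}

# Regular-season stats as used in the worked equations (rates as fractions).
ARI = {"o_pass": 7.1, "o_run": 3.5, "o_int": 0.024, "o_fum": 0.028,
       "d_pass": 6.5, "d_run": 4.0, "pen": 0.39}
PIT = {"o_pass": 6.0, "o_run": 3.7, "o_int": 0.030, "o_fum": 0.026,
       "d_pass": 4.3, "d_run": 3.3, "pen": 0.41}

def team_logit(stats):
    """Sum of coefficient * stat over every efficiency category."""
    return sum(COEF[name] * stats[name] for name in COEF)

# Neutral site: add only half of the home field adjustment.
game_logit = CONST + HOME_FIELD / 2 + team_logit(ARI) - team_logit(PIT)
odds = math.exp(game_logit)
p_ari = odds / (1 + odds)

print(f"Logit(ARI) = {team_logit(ARI):.2f}, Logit(PIT) = {team_logit(PIT):.2f}")
print(f"ARI {p_ari:.2f}, PIT {1 - p_ari:.2f}")
```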

In part 2 of this article, I'll explain how I factor in opponent adjustments and how I calculate a team's generic win probability (GWP)--the probability a team would win against a league-average opponent at a neutral site.