Jan 30, 2010

Expected Points (EP) and Expected Points Added (EPA) Explained

This post will explain the concepts of Expected Points and Expected Points Added. In future posts when I refer to these stats, I'll link here.

Football is a sport of strategy and decision making. But before we can compare the potential risks and rewards of various options, we need to be able to properly measure the value of possible outcomes.

The value of a football play has traditionally been measured in yards gained. Unfortunately, yards is a flawed measure because not all yards are equal. For example, a 4-yard gain on 3rd down and 3 is much more valuable than a 4-yard gain on 3rd and 8. Any measure of success must consider the down and distance situation.

Field position is also an important consideration. Yards gained near the goal line are tougher to come by and are more valuable than yards gained at midfield. Yards lost near one’s own goal line can be more costly as well.

We can measure the values of situations and, by extension, the outcomes of plays by establishing an equivalence in terms of points. To do this we can start by looking back through recent NFL history at the ‘next points scored’ for all plays. For example, if we look at all 1st and 10s from an offense’s own 20-yard line, the team on offense will score next slightly more often than its opponent. If we add up all the ‘next points’ scored for and against the offense’s team, whether on the current drive or subsequent drives, and average them over all such plays, we can estimate the net point advantage an offense can expect for any football situation. For a 1st and 10 at an offense’s own 20, it’s +0.4 net points, and at the opponent’s 20, it’s +4.0 net points. These net point values are called Expected Points (EP), and every down-distance-field position situation has a corresponding EP value.
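As a rough illustration of the ‘next points scored’ tally, here is a minimal Python sketch. The record layout, the function name, and the handful of sample plays are all hypothetical. They only show the averaging step and are not real NFL data or the actual implementation behind the figures above.

```python
from collections import defaultdict

# Hypothetical play records: (down, yards_to_go, yards_from_own_goal, next_points).
# next_points is the next score in the game from the offense's point of view:
# positive if the offense scores next, negative if the opponent does,
# and 0 if nobody scores again before the half ends.
plays = [
    (1, 10, 20, 7),
    (1, 10, 20, -3),
    (1, 10, 20, 0),
    (1, 10, 20, -7),
    (1, 10, 20, 7),
    (1, 10, 80, 7),
    (1, 10, 80, 3),
]

def expected_points(plays):
    """Average the signed next points for each down-distance-field position."""
    totals = defaultdict(lambda: [0.0, 0])  # situation -> [sum, count]
    for down, to_go, yard_line, next_points in plays:
        entry = totals[(down, to_go, yard_line)]
        entry[0] += next_points
        entry[1] += 1
    return {situation: s / n for situation, (s, n) in totals.items()}

ep = expected_points(plays)
print(ep[(1, 10, 20)])  # average for 1st and 10 at the offense's own 20
print(ep[(1, 10, 80)])  # average for 1st and 10 at the opponent's 20
```

With a real play-by-play database, each situation would have thousands of plays behind it, and the raw averages would likely need some smoothing across neighboring yard lines.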

Suppose the offense has a 1st and 10 at midfield. This situation is worth +2.0 EP. A 5-yard gain would set up a 2nd and 5 from the 45, which corresponds to a +2.1 EP. Therefore, that 5-yard gain in that particular situation represents a +0.1 gain in EP. This gain is called Expected Points Added (EPA). Likewise, a 5-yard loss on 1st down at midfield would create a 2nd and 15 from the offense’s own 45. That situation is worth +1.2 EP, representing a net difference of -0.8 EPA.
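The arithmetic in this example can be written as a small lookup and a subtraction. The sketch below uses only the three EP values quoted in the paragraph above; the dictionary layout and the `epa` helper are illustrative assumptions.

```python
# EP values quoted in the text, keyed by
# (down, yards to go, yards from the offense's own goal line).
EP = {
    (1, 10, 50): 2.0,   # 1st and 10 at midfield
    (2, 5, 55): 2.1,    # 2nd and 5 at the opponent's 45
    (2, 15, 45): 1.2,   # 2nd and 15 at the offense's own 45
}

def epa(before, after):
    """Expected Points Added: EP of the new situation minus EP of the old one."""
    return EP[after] - EP[before]

gain = epa((1, 10, 50), (2, 5, 55))
loss = epa((1, 10, 50), (2, 15, 45))
print(f"{gain:+.1f}")  # +0.1
print(f"{loss:+.1f}")  # -0.8
```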

We can value turnovers in the same way. Suppose that on 2nd and 5 at the opponent’s 45 there was a fumble recovered by the defense. The 2nd and 5 was worth +2.2 EP, but now the opponent has a 1st and 10 on their own 45, worth +2.1 EP to them. The result of the play represents -2.1 EP for the original offense, for a net change of -4.3 EP. On average, a fumble in that situation means a net expected loss of a little more than 4 points.
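The only wrinkle with a turnover is the sign flip: the new offense’s EP counts against the team that lost the ball. A minimal sketch using the numbers above (the function name is a hypothetical helper):

```python
def turnover_epa(ep_before, opponent_ep_after):
    """EPA for the team that lost the ball.

    The opponent's EP after the turnover is negated because it is
    measured from the opponent's point of view.
    """
    return -opponent_ep_after - ep_before

fumble = turnover_epa(ep_before=2.2, opponent_ep_after=2.1)
print(f"{fumble:+.1f}")  # -4.3
```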

To be of good use for most kinds of analysis, we need the measure of success to be linear. In this case, that means that +2 EP is exactly twice as good as +1 EP, that +4 EP is twice as good as +2 EP, and so on. We need linearity when we analyze decisions. For example, which would you rather have: a 100% chance of +3 EP, or a 60% chance of +6 EP with a 40% chance of 0 EP? To answer this question definitively, each net point of advantage must be equally valuable to a team.
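If every point really is equally valuable, the comparison reduces to a probability-weighted average. Here is a short sketch of that calculation; the `expected_value` helper is an illustrative assumption.

```python
def expected_value(outcomes):
    """outcomes: list of (probability, EP) pairs whose probabilities sum to 1."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * ep for p, ep in outcomes)

sure_thing = expected_value([(1.0, 3.0)])
gamble = expected_value([(0.6, 6.0), (0.4, 0.0)])
print(f"{sure_thing:.1f}")  # 3.0
print(f"{gamble:.1f}")      # 3.6
```

Under the linearity assumption the gamble is the better choice, 3.6 EP to 3.0 EP. Without linearity, the weighted average alone could not settle the question.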

But there’s a problem. We all know that being up by 1 point at the end of a game is just as good as being up by 50. Not all points are equally valuable. Teams well ahead will sacrifice point advantages in exchange for running time off the clock, which in the end helps them win.

To mitigate that problem, the baseline EP values for each down-distance-field position situation must be created based on real game situations when points are equally valuable and time is not yet a factor. The baseline EP values are therefore based only on game situations when the score was within 10 points and in the first and third quarters. This eliminates situations like ‘trash time,’ and other distortions.
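In code, this restriction is just a filter applied before any averaging. The sketch below assumes hypothetical field names (`quarter`, `score_diff`) and treats ‘within 10 points’ as a margin of 10 or fewer, which is my reading rather than something the text spells out.

```python
# Hypothetical play records; score_diff is the offense's score minus the defense's.
plays = [
    {"quarter": 1, "score_diff": 0},
    {"quarter": 2, "score_diff": 3},    # second quarter: excluded
    {"quarter": 3, "score_diff": -7},
    {"quarter": 3, "score_diff": 21},   # blowout: excluded
    {"quarter": 4, "score_diff": -4},   # fourth quarter: excluded
]

def is_baseline(play, max_margin=10):
    """Keep first- and third-quarter plays with the score within max_margin.

    Treating a margin of exactly 10 as 'within 10 points' is an assumption.
    """
    return play["quarter"] in (1, 3) and abs(play["score_diff"]) <= max_margin

baseline_plays = [p for p in plays if is_baseline(p)]
print(len(baseline_plays))  # 2
```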

EP and EPA have a variety of applications. We can use EP to measure and compare the relative value of runs vs. passes in various situations. We can tally up the EPA for individual players and for teams for a more accurate valuation than what traditional stats can tell us. Perhaps the most useful application of EP is in the analysis of fourth down decisions, which suggests teams should be going for it far more often.

For an example of what EP values look like on first down, see the chart below. For second and third downs, see this post.

10 comments:

Excalabur said...

I'm worried about your fits. Because for EPA you need the difference between two EP values, you're effectively taking a derivative, which is notoriously noisier than the data. The 'fit' curve above (and in your other articles, which I've read) is... messy. It looks to my eye that you could get away with a piecewise linear fit with two pieces: a "red zone" piece and the rest of the field; alternatively, a polynomial fit of some kind.

Anonymous said...

Brian,

2 things -

1. Doesn't using the historical average fail to take into account the strengths/weaknesses of the 2 teams playing? If I am a coach and am trying to decide whether to punt or go for it on 4th and 3 from the opponent's 45-yard line, don't I need to adjust your numbers with how my team stacks up against the other team?

What I'm getting at is while having the ball on their 40 is worth 2.5 in general, against this team I think it's worth 2.9 because of x, y, and z.

I don't dispute your numbers, I'm just wondering if for this to be of greater use to a coach, they need to adjust your overall numbers based on the quality of the 2 teams.


2. Might you at some point be able to post "Expected Points by 2nd/3rd/4th down field position"?



Thank you very much for your articles.
Brandon

Jeff Clarke said...

Brandon,

I've developed a similar (unpublished) model that does incorporate team strength into the equations. It does change conclusions on things like punt/go decisions occasionally, but nowhere near as often as many people think.

If you have an especially weak offense, you would think that would mean you should always punt and never go on fourth down. That's certainly the conventional wisdom. Think about what a punt really represents. You are gambling that your defense will stop the other team and that you will ultimately score when you get the ball back. If your offense is extremely weak, you have to adjust for the fact that you are less likely to score on future possessions as well as the fact that you're less likely to convert the fourth right now. When you make both adjustments, the two factors tend to cancel each other out. Occasionally you even see counter-intuitive results (a weak offense should go and a strong offense should punt).

Anonymous said...

I have a question about the "next points" concept. So I assume if a team is on offense, and on that drive they score a field goal, the next points would be 3 for all of those plays. However, if they score a touchdown, does that count for a 6 no matter what, a 7 no matter what, a value depending on what happens on the extra point try, or is it an average of how many points a touchdown is worth (which I'd guess would be something a little lower than 7)?

Brian Burke said...

Good question. In my implementation of the EP concept, a touchdown is worth 7 no matter if they score a 2-pt conversion or miss the XP.

Actually, it's 6.3 for all TDs, and it's 2.3 for all FGs. There has to be a kickoff to the other team after the score (except at the end of the half). The value of the kickoff is -0.7 because the value of a 1st down at a team's own 29 (or so) is 0.7 EP.

Mark Kamal said...

I like it...just one comment: I think you may be counting the value of a turnover slightly wrong. In your fumble example above, you state the value for the offense (let's call this Team A) is +2.2 EP, and when the fumble happens the defense (Team B) takes over with an EP of +2.1, so the net value is 4.3.

Your computation of 4.3 above actually assumes that Team B's next drive is worth 0 before the fumble happens. Their next drive actually has value - we can compute where we expect to start their next drive given where Team A currently has the ball. Let's say we calculate that we expect Team B to start their next drive at their own 29, for an EP of +.7. The value of the play would be:

(Team A drive value before - Team A drive value after) - (Team B drive value before - Team B drive value after) = (2.2-0)-(.7-2.1) = 2.2 - (-1.4) = 3.6, a net loss of 3.6 for Team A rather than 4.3.

Brian Burke said...

Hi Mark,
Interesting. But isn't the value of the opponent's next potential drive already baked into the present EP value of the situation? The way I did EP, that +.7 EP value of a following drive is already accounted for in the present value of all possession situations. I know I'm not being clear, but does that make sense?

Anonymous said...

What about Matt Cassel's EPA, it's pretty awful, how would you explain him?

Mark Kamal said...

Hey Brian,

How are you accounting for future drives in the EP value? When setting up your system, what was your dependent variable (i.e., what did you regress against)?

If you regressed against (a) points on the drive, then my statement above makes sense. But you could have regressed against something else - for example (b) points on this drive minus points on opponent's next drive. In that case - yes, you've already accounted for it. Case b has some different properties - for example, scoring a TD wouldn't result in an EP of +7.0 - it would be more like 5.8.

Brian Burke said...

The way I did it is very similar to the Romer paper. It's next points scored, either on this drive, the following drive, or the next, and so on. It's the next points scored for either team, positive for the team with possession, and negative for the team currently on defense.

To account for follow-on drives beyond the next score, I factored in the value of the subsequent kickoff.
