Expected points and EPA explained
Looking at how ESPN's new play-by-play metric works
Let's start with what seems like a pretty basic question: Which team has the best offense in the NFL? How about the best defense? Special teams?
Although opinions on these matters can vary wildly, the objective statistical measure of these aspects of football has usually been something that relies on yards. The league's offenses are ranked by which teams averaged the most yards per game; the defenses are judged by who allowed the fewest yards per game.
But, as most experienced football fans know, yards can be extremely deceiving. For example, if a team has impressive yardage totals but turns the ball over frequently or fails to convert in the red zone consistently, what use are all those yards in terms of actual points on the scoreboard? (Eagles fans might understand this plight better than anyone.)
And not all yards are created equal. Would you rather have 8 yards on third-and-10 or 3 yards on third-and-2?
Finally, the team with more yards from scrimmage has won less than 70 percent of games over the past five seasons. Winning the yardage battle is clearly not the same as winning the game in the NFL; other factors that traditional yardage stats do not measure play a large role.
One could make an argument that offenses and defenses should be judged on points scored and allowed. But that doesn't account for defensive or special-teams touchdowns, not to mention defenses consistently giving their offenses shorter fields and therefore increasing their chances of scoring. An extreme case of this is the Jets-Ravens game from 2011, when the teams combined for 51 points but only one of the six touchdowns was scored by the offense. The Jets' defense played phenomenally in that game, and it would be inaccurate to judge it by its "34 points allowed."
But don't fret, NFL fans, there is a statistical solution to this problem: It's called "expected points." Although it does rely on some advanced math, the benefits of this framework make it well worth understanding up front.
Based on statistical analysis of 10 years of NFL play-by-play data, ESPN has created a formula that assigns an "expected points" value to the team with the ball at the start of each play based on the game situation. Expected points (EP) accounts for factors such as down, distance to go, field position, home-field advantage and time remaining.
The value it puts out is on a scale from about minus-3 to 7, and it answers the question, "Which team is likely to score next, and how many points?" It represents the likely points not just on the current drive but also on the next drive or any subsequent drive until the score changes or the half ends. A lower value indicates a more favorable situation for the defense (e.g., fourth-and-20 from your own 1-yard line could be close to minus-3 EP), and a higher value represents a more favorable situation for the offense (e.g., first-and-goal is generally worth 6 EP).
Without going into technical details, the key is that the relationships in the EP formula encapsulate the basic tenets of football, including:
• Being closer to the opposing goal line and farther from your own is better
• Earlier downs are better (first-and-10 is better than second-and-10, etc.)
• Shorter distance to go is better
• Being at home is better
(If most of this sounds pretty fundamental and obvious to you, that's a good thing. The point here is that the math is consistent with how the game works.)
The benefit of having this EP value at the start of each play is that it can be used to measure the success of that play by comparing it to the EP value at the start of the next play. Good offensive plays such as first downs generally increase EP; losses or incomplete passes generally decrease it. This difference in EP from one play to the next is called expected points added (EPA). Because it accounts for the game situation and is expressed in points, EPA measures how much each play changes a team's prospects on the scoreboard.
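The subtraction described above can be sketched in a few lines of Python. This is only an illustration, not ESPN's formula: the function name, the EP inputs and the sign flip on a change of possession are all assumptions made for the example.

```python
def expected_points_added(ep_before: float, ep_after: float,
                          possession_changed: bool = False) -> float:
    """EPA of one play, from the point of view of the team that ran it."""
    # Assumption of this sketch: EP is always quoted for the team with the
    # ball, so after a change of possession the next play's EP belongs to
    # the opponent and has to be negated before comparing.
    if possession_changed:
        ep_after = -ep_after
    return ep_after - ep_before


# Made-up EP values, chosen only to show the arithmetic.
first_down = expected_points_added(ep_before=1.0, ep_after=2.5)
giveaway = expected_points_added(ep_before=2.0, ep_after=1.5,
                                 possession_changed=True)
print(first_down, giveaway)
```

A play that improves the offense's situation comes out positive, and a giveaway comes out sharply negative because the offense loses its own expected points and hands some to the opponent.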
The EP/EPA concept was first popularized in "The Hidden Game of Football" in the 1980s and more recently was advanced by Brian Burke at AdvancedNFLStats.com. ESPN has already been using EPA for more than a year, as that framework serves as the foundation for Total QBR.
To make the concept more tangible, here are some examples:
• From your own 20-yard line, an 8-yard gain on third-and-10 is worth about minus-0.2 EPA because you don't get a first down; the same 8 yards on third-and-7 is worth 1.4 EPA for converting a long third down and keeping the drive alive. EPA knows that not all yards are created equal.
• A turnover on first-and-10 at midfield that is taken back to your own 20 is worth minus-5.5 EPA; a Hail Mary interception at the end of the half from midfield is not nearly as penalizing. EPA knows that not all turnovers are created equal, either.
• A 60-yard pass play down to the 1-yard line on third-and-10 is worth 5.7 EPA because it puts you right on the doorstep of scoring. The subsequent 1-yard rushing TD on first-and-goal is worth much less, even though that's the play that actually gets you the six points. Think about which play is more valuable to the offense (not in terms of fantasy football).
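The last example can be worked through with the figures quoted in this article. The sketch below is rough arithmetic rather than ESPN's calculation, and it assumes that first-and-goal at the 1 is worth the "generally 6 EP" cited earlier and that a touchdown counts as 7 points; the variable names are made up for illustration.

```python
# Figures quoted in the article.
EP_FIRST_AND_GOAL = 6.0  # "first-and-goal is generally worth 6 EP"
EPA_LONG_PASS = 5.7      # 60-yard pass to the 1-yard line on third-and-10

# Assumption, not stated in the article: a touchdown is valued at 7 points.
TOUCHDOWN_VALUE = 7.0

# EP before the long pass is whatever leaves 6.0 EP after adding 5.7.
ep_before_pass = round(EP_FIRST_AND_GOAL - EPA_LONG_PASS, 1)

# The 1-yard touchdown run only adds the gap between 6.0 EP and the score.
epa_touchdown_run = TOUCHDOWN_VALUE - EP_FIRST_AND_GOAL

print(ep_before_pass, epa_touchdown_run)
```

Under those assumptions the offense was sitting on only a fraction of an expected point before the long pass, and the touchdown run adds about one point, which is why the pass gets most of the credit.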
Because of its play-by-play nature, EPA can be divided to look at pass EPA or rush EPA for offenses and defenses. Looking at EPA on a per-play or per-drive basis can tell you which units have been the most efficient given the opportunities they've had. EPA can even be used to evaluate the hidden contributions special-teams units make to the scoreboard.
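As a rough sketch of that kind of split, the Python below averages EPA per play for each play type. The play records, field names and helper function are all hypothetical; real play-by-play data would carry far more detail.

```python
from collections import defaultdict

# Hypothetical play log: every value here is made up for illustration.
plays = [
    {"play_type": "pass", "epa": 1.4},
    {"play_type": "pass", "epa": -0.2},
    {"play_type": "pass", "epa": 0.6},
    {"play_type": "rush", "epa": 0.5},
    {"play_type": "rush", "epa": -0.5},
]


def epa_per_play(plays, key):
    """Average EPA per play, grouped by the given field of each record."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for play in plays:
        group = play[key]
        totals[group] += play["epa"]
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


averages = epa_per_play(plays, "play_type")
for play_type, average in averages.items():
    print(f"{play_type}: {average:+.2f} EPA per play")
```

Grouping by a different field, such as a drive number or a special-teams flag, would give the per-drive or per-unit views in the same way.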
The overall point here is that EPA is the granular, play-by-play version of what wins the game. The team that has more points wins 100 percent of all games, but points don't change with every play in the game. Yards, first downs and turnovers are significant and do change somewhat on a play-by-play basis, but those are crude subdivisions of points that aren't even on the same scale and don't always correlate with winning.
EPA accounts for all of those events and ends up matching the score at the end of the game. Because EPA also changes on a play-by-play basis, it is the correct way to split up those points on the scoreboard. If you want something that, at the end of the game, is as good as the score at saying who wins but that also changes with every play in the game, use EPA.
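One way to see why play-level EPA adds back up to points is to total it over a single scoring drive. In the Python sketch below the EP values are invented and a touchdown is assumed to count as 7 points; a full-game accounting with changes of possession would be more involved.

```python
import math

# Hypothetical EP at the start of each play of one touchdown drive.
ep_at_snap = [0.9, 1.8, 1.2, 3.4, 6.0]
points_scored = 7.0  # assumption: a touchdown is valued at 7 points

# Each play's EPA is the next state minus the current one; the final
# "state" is the points actually put on the scoreboard.
states = ep_at_snap + [points_scored]
play_epas = [after - before for before, after in zip(states, states[1:])]

# The intermediate EP values cancel, leaving the points scored minus
# the EP the drive started with.
total_epa = sum(play_epas)
assert math.isclose(total_epa, points_scored - ep_at_snap[0])
print(round(total_epa, 1))
```

Every intermediate EP value is added once and subtracted once, so only the starting situation and the final score survive in the total.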
ESPN will be using EPA more throughout the season for a variety of NFL-related analyses. This versatile framework will allow us to break down football statistically in more accurate and nuanced ways than ever before, bringing better insights to all football fans.