ESPN.com: World Cup 2010
When we were in the development stages of the Soccer Power Index, wrestling with how to make the system as informative as possible, I was asked a seemingly simple question by one of ESPN's soccer writers that I had an incredibly difficult time answering. "Why rate the teams at all?" he asked. "Isn't that what they play the games for?"
Some of this, ultimately, is a cultural difference. We Americans -- we like to rate things, whether it's the schools we send our children to, the food we eat or the teams we root for. It's in our DNA. Other cultures sometimes take a bit more of a holistic approach toward things.
It was a good question, though. Somebody is going to win the World Cup in South Africa next year. Perhaps it will be a team such as Brazil or Spain, which would surprise approximately no one. But it also could be a dark horse -- Chile, the Ivory Coast, Serbia, or even the United States; this is a fairly deep field. Whichever team prevails will have played tremendous soccer and earned every karat of gold on the FIFA World Cup Trophy.
As Herb Brooks said to the United States hockey team before the "Miracle on Ice" against the Soviet Union: "If we played 'em 10 times, they might win nine -- but not this game, not tonight." Brooks was undoubtedly right: If the U.S. and the USSR had played a 10-game series, the Americans would have been lucky to win once. But it was the once that counted -- and the abstraction of whether the USSR was the "better" team, maybe rated higher by a computer formula, didn't matter.
So then: Why rate the teams at all? Well, from my perspective, we do it not because we're interested in the past, but because we're interested in the future. The SPI ratings are intended to be forward-looking. They're intended to be predictive; every variable in the SPI has been tuned to give you the best possible objective and statistical forecast of how a team will perform in South Africa. This concept might differ somewhat from a retrospective or backward-looking ratings system. The SPI ratings are not trying to reward or punish teams based on their past results. Rather, they are trying to predict which teams will have the most success going forward.
Of course, this is easier said than done. While 19 of the top 20 teams in the SPI entered November either having qualified or with a shot to qualify for the World Cup, the challenge in preparing an international soccer ratings system is that there is relatively little reliable data to go by, as compared with other sports. If a particular international team is not engaged in a major competition, such as the World Cup, it might play only a handful of meaningful matches each year. Compare that to a 162-game season in baseball, an 82-game season in basketball or hockey, or a 16-game season in American pro football.
Many of these games, moreover, might be against teams of inferior quality, or they might feature marginal lineups as many of a team's star players are engaged in club competition and have not returned home. For that reason, it is important to be somewhat expansive about the amount of data we use. Things such as margin of victory and home-field advantage, which are ignored by some other ratings systems, play a fairly large role in SPI. More distinctively, SPI blends ratings from club competition with those from international play, providing for a more robust assessment of the level of talent on a particular team.
Although I would encourage you to read the much longer and more formal article on SPI's methodology in addition to this one, let's talk for a moment about how some of these things play out in practice.
• Goal differential: Think goal differential doesn't matter in soccer? Tell that to Mobutu Sese Seko, the former dictator of Zaire, who threatened his nation's players in the 1974 World Cup, telling them they wouldn't be allowed to return home if they lost to Brazil by four goals or more. Thankfully, the Zaireans held the Brazilians to just three goals.
More practically speaking, goal differential does matter in a variety of ways in international soccer. As the first tiebreaker, for instance, it often determines who advances to the knockout stage of the World Cup. But that's not why we include it; we include it because of its predictive power. There's just a lot of information you're throwing away if you don't look at the scoring margin, especially given the disparities in the quality of competition. It's virtually impossible to beat Brazil at home: Does a team that manages a competitive 2-1 defeat really deserve to be treated the same as one that drops the match 7-0? If a team barely holds on against the Faeroe Islands or San Marino in a European qualifier, would that really give you confidence about its likelihood of success against tougher competition?
What SPI doesn't do is reward teams for running up the score against poor competition. Australia's 31-0 win against the hapless American Samoa in 2001, for instance, is treated as no more than about a 4-1 win against normal competition. Once the adjustment is made for quality of competition, however, we do give teams credit for each additional goal they score or allow.
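The adjustment described above can be sketched in code. This is a toy model, not SPI's actual formula: the discount factor, the cap of four goals, and the logarithmic taper are all made-up constants chosen so that a 31-goal blowout against a minnow comes out roughly like a three-goal win against ordinary competition, as in the Australia example.

```python
import math

def adjusted_margin(raw_margin, opponent_quality, cap=4.0):
    """Adjust a raw goal margin for opponent quality (0 = very weak,
    1 = elite), then compress anything beyond the cap so running up the
    score earns little extra credit. All constants are illustrative."""
    # Heavy discount for margins piled up against weak opposition
    scaled = raw_margin * (0.1 + 0.9 * opponent_quality)
    if scaled <= cap:
        return scaled
    # Diminishing returns past the cap: each additional goal counts less
    return cap + math.log1p(scaled - cap)
```

Under these invented constants, a 31-0 rout of a quality-0 side is worth about a 3.1-goal margin, while a 2-1 win over an elite side keeps its full value.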
A good example of a team for which this matters is Uruguay, which had the third-best goal differential in South American qualifying. Uruguay had won a handful of blowout matches but lost some close ones. Perhaps this was an example of a team that couldn't perform in the clutch? Well, not really: When it needed critical wins against Ecuador and Colombia, Uruguay got them, and the team is now the heavy favorite to defeat Costa Rica in the home-and-home playoff and become the fifth South American team in the World Cup.
• Home-field advantage: Home field is tremendously important in international soccer -- it's worth the equivalent of about 0.6 goals in a sport in which the average team scores 1.4. That's the equivalent of home-field advantage being worth about 8 points in the NFL, a sport in which the average team scores 20 points a game (in actuality, home-field advantage is worth only about a field goal in the NFL). Yet some other rating systems ignore it. That would be fine if home and away games balanced out, but they don't always: Switzerland, for instance, played a bunch of extra home games in 2008 because it co-hosted the European Championships. Wealthy teams such as the United States might play the vast majority of their friendlies at home, but poorer ones such as Ivory Coast are almost always on the road (this is one big reason we have Ivory Coast rated higher than those other systems).
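A minimal sketch of how a home bump can enter an expected-goals estimate, using the article's figures (0.6 goals of home advantage, 1.4 average goals per team). The attack/defense ratings and the even split of the bump between home boost and away penalty are assumptions for illustration, not SPI's actual mechanics:

```python
HOME_BONUS = 0.6   # home advantage worth ~0.6 goals (per the article)
AVG_GOALS = 1.4    # average goals scored per team per game

def expected_goals(attack, opp_defense, is_home):
    """Toy expected-goals estimate: league average, shifted by the
    attacking side's rating and the opponent's defensive rating, with
    the home bump split evenly between home boost and away penalty."""
    base = AVG_GOALS + attack - opp_defense
    return base + (HOME_BONUS / 2 if is_home else -HOME_BONUS / 2)
```

In this sketch two average teams expect 1.7 goals at home versus 1.1 away, so ignoring venue would systematically misjudge a team like Switzerland or Ivory Coast whose schedule is lopsided.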
• Competitiveness coefficients: International soccer is unusual in that lineups rotate frequently from game to game. Sometimes this depends on the whims of the coach ("cough" Diego Maradona "cough"), but more than that, it's a simple calculation of how much is at stake. It's as though, on some occasions, the New York Yankees are the New York Yankees and, on others, the Yankees have been replaced by the team's Triple-A affiliate, the Scranton/Wilkes-Barre Yankees.
Take, for example, Mexico's 5-0 thrashing of the United States on July 27 in the final of the Gold Cup. You might think that, this being the final of a "major" international competition, it would be taken relatively seriously. But that's not how either of the teams was treating it. The United States was resting its starters after they'd returned from their successful performance in the Confederations Cup. Mexico had, perhaps, two or three players in the lineup who are likely to see substantial playing time in South Africa next year. It was treated as a developmental game by both teams.
The way we work around this is through a weighting system we call competitiveness coefficients, which basically measure, on a scale from 0 to 1, the quality of the lineup each team is using. If the players in the lineup that day are the same ones who are playing in matches that we know to be important -- key World Cup qualifiers, the Confederations Cup, the European Championships, the World Cup itself -- the game will be weighted highly. If the B-teams are in there, it might barely be weighted at all. And that's the case, indeed, with the Gold Cup final -- it receives the minimum weight of 0.01, or about 1/100th of the weighting a World Cup match that is vital for both teams might receive. The match simply doesn't have any real predictive power, any more than a game between the Yankees' and Red Sox's farm teams would. And of course, this is also true of most (although not all) international friendlies, which SPI weights less than other systems do.
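One crude way to approximate such a coefficient is to measure how much of the day's lineup overlaps with the lineup used in matches known to be important. This is a hypothetical simplification -- SPI's real measure is surely richer -- but it captures the 0-to-1 scale and the 0.01 floor described above:

```python
def competitiveness(lineup, first_choice_xi, floor=0.01):
    """Hypothetical competitiveness coefficient: the share of today's
    eleven starters who also start in matches known to be important,
    floored at 0.01 so even a pure B-team game keeps a token weight."""
    overlap = len(set(lineup) & set(first_choice_xi)) / 11
    return max(floor, overlap)

# Illustrative lineups (names are placeholders)
first_xi = [f"starter{i}" for i in range(11)]
b_team = [f"reserve{i}" for i in range(11)]
```

Here a full-strength lineup yields a weight of 1.0, an all-reserve lineup bottoms out at the 0.01 floor -- roughly how the Gold Cup final is treated -- and a mixed lineup lands in between.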
• Separate ratings for offense and defense: Certainly, the two things cannot be separated entirely in soccer -- it is too fluid a sport. But one of the things that makes soccer distinctive is that teams play at entirely different paces -- take the conservative Swedes, for instance, against a much more up-tempo team such as Mexico or Ivory Coast. And, of course, some teams have concentrations of their better players on their attack or on their defense. For this reason, we provide separate ratings for offense and defense. These are listed on the team page, accessed by clicking on the team name. The offensive and defensive ratings are calculated separately before being combined to produce an overall grade. And it turns out the difference is not entirely cosmetic because different types of teams match up differently against one another. We have found, for instance, in looking at thousands of games played since 1998, that defense-minded teams hold up better against tough competition but are more vulnerable to an upset against weaker sides. This is especially true in games, such as those in the World Cup knockout stage, that can go to a shootout, where goalkeeping skill is crucial. Because almost all of the teams in South Africa next year are tough, this means that defense-oriented teams such as Italy are somewhat stronger than their SPI suggests and that more freewheeling teams such as Serbia are more vulnerable. (Indeed, this is part of why the defense-first Italians have tended to have so much success in the World Cup).
• Club competition: Probably the most controversial part of the SPI is that it uses data from club competitions as well as from international play. The way it does this is complicated, and the longer explanation is reserved for the methodology piece.
But basically, we've taken results from every recent game in the four key European leagues (England, Germany, Italy and Spain), plus the Champions League, and assigned credit or blame to the individual players on the pitch based on the results of those matches. If Samuel Eto'o scores a goal for Inter Milan, Cameroon also will get a little bit of credit in its SPI. If Petr Cech has a clean sheet for Chelsea, that will improve the Czech Republic's ratings a little bit. And so forth.
We have designed the club-based ratings very carefully; soccer is more a team sport than an individual one, so both team performance and the performance of the players as individuals are evaluated as part of the system. We also have taken care to make sure no team is advantaged or disadvantaged merely by having players who happen to participate in the "big four" leagues. A player doesn't get credit merely for playing in, say, La Liga. He gets credit if he and his team play well, and might lose points for his international team if he doesn't.
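The flow of club performance into a national team's rating can be sketched as simple bookkeeping. The weights here (0.02 per goal, 0.03 per clean sheet) are invented for illustration -- the article gives no actual parameters -- but the structure matches the examples above: a club contribution by a player feeds a small credit to his country.

```python
from collections import defaultdict

# Hypothetical club-to-country credit ledger; weights are illustrative,
# not SPI's actual parameters.
club_credit = defaultdict(float)

def credit_performance(country, goals=0, clean_sheet=False):
    """Give a national side a small ratings bump for a player's club display."""
    club_credit[country] += 0.02 * goals   # e.g. Eto'o scoring for Inter Milan
    if clean_sheet:
        club_credit[country] += 0.03       # e.g. Cech's clean sheet for Chelsea

credit_performance("Cameroon", goals=1)                  # Eto'o goal
credit_performance("Czech Republic", clean_sheet=True)   # Cech clean sheet
```

A player who merely appears in a big-four league adds nothing here; only goals and clean sheets move the ledger, mirroring the point that credit follows performance, not mere participation.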
Nevertheless, when used the right way, this information provides for a much deeper portrait of a team than you might get from other ratings systems. For instance, we can better differentiate a team, such as France or Argentina, that is merely underachieving and liable to bounce back before South Africa from one, such as Paraguay or Australia, that has few international stars and probably has been playing over its head. Again, the goal of the SPI is to be predictive -- and that means looking not just at the results but also at the talent.
This is not an exhaustive list of the things that make SPI unique. But certainly, even with all the information it does account for, SPI is not going to be perfect. Soccer is a rich, wonderful, and unpredictable sport, and indeed it would be quite a shame if a single number could tell us everything we needed to know about a soccer team. Fortunately, the SPI ratings do not. They merely reflect the relatively limited statistical information that is available in international soccer, and do so in a way that is as fair, accurate and predictive as possible. In other words, SPI is designed to serve as a general guideline. I'm sure it will start a few debates -- but I don't expect it to settle them.
Nate Silver is a renowned statistical analyst who was named one of "The World's 100 Most Influential People" by Time Magazine in 2009. He gained acclaim for outperforming the polls in the 2008 U.S. presidential elections and created baseball's popular predictive system, PECOTA.