How We Rank Teams
Rankings History and Philosophy
Our proprietary power rankings are based on algorithms originally developed by Mike Greenfield, who launched the first incarnation of TeamRankings.com back in 2000. Mike initially focused on college football and NCAA basketball, then expanded the site to include several other sports including NFL, NBA, NHL, and MLB.
Our power rankings, and the quantitative ratings that drive them, are objective and data driven. They incorporate only hard facts about teams, game outcomes, and game situations. Although we have evolved our algorithms over the years, we’ve stayed true to one philosophical belief: measurable data and game performance are gospel, no matter what the "experts" think and say about teams. Humans have an incredible ability to rationalize any position or opinion they want to believe, even when sufficient data exists to refute it.
Imagine two players walk into a gym to shoot hoops. First you watch their technique as they warm up. Player A has an athletic build, an NBA-looking shooting stroke, and great elevation on his jumper. Player B is slightly overweight, undersized for his position, and shoots the ball with a low, straight, seemingly physics-defying trajectory.
Then you point to a spot on the floor and ask them each to take 10 shots. Player A hits four; Player B hits nine. So you point to another spot and tell them to take 50 more. Player A hits 24; Player B drains 43. Although the evidence is mounting that Player B is actually the better hoopster of the two, many basketball "experts" would struggle to believe it, even if measurable performance results continued to indicate that the odds that Player B was just having a lucky day were very low.
We don’t give a hoot what Player B looks like, what school he went to, or that the scouts dis him. We’re saying he’s better as soon as the performance data indicates with reasonable confidence that he is.
What Makes Our Rankings Different (and Better)
Such human (ill)logic, along with other biases (regional, conference, etc.), pollutes many contemporary ranking systems such as the AP and Coaches Polls. At the same time, popular computer-based ranking systems such as the RPI are often guilty of using overly simplistic or downright silly math. As we like to say, a computer is only as smart as the person programming it.
Our rankings combine smart math and 100% objective data. Make no bones about it, we’re certainly not the only folks in the world who’ve built sophisticated, unbiased, and effective sports ranking systems. However, doing it "right" takes a computational skill set that the vast majority of people don’t possess. Of the people who do, some work for odds makers and sports books; some are professional handicappers or crunch numbers for sports betting syndicates; some do analysis just for fun, primarily to succeed at casual betting or friendly competitions; and a handful, including us, Jeff Sagarin, Ken Massey, and Ken Pomeroy, have chosen to publish their work for public consumption.
Perhaps most importantly, we have a track record. Since 2000, we’ve tested various applications of our rankings, such as predicting the winners of games and projecting how teams will perform in the NCAA tournament. Although performance always varies by sport and by season, nearly every year our methods outperform benchmarks including "expert" opinions and crowd wisdom at picking game winners, and provide the foundation for profitable sports competition and wagering strategies with measured risk.
How Our Rankings Work
The central idea behind our power rankings is to define numerical ratings for every team in a league such that together, all of the ratings "make sense." Got it? Great, we’re done. Thanks for your time.
If you’d like some more explanation, however, here’s an example. We have developed formulae which, given two teams, their respective numerical ratings, and the location of the game, will compute the odds of each team winning that game.
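TeamRankings doesn’t publish its actual formulae, but the idea can be sketched with a common stand-in: a logistic function of the rating difference, with a fixed bonus for home court. Every constant and name below (`HOME_EDGE`, `SCALE`, `win_probability`) is a hypothetical assumption for illustration, not the site’s real model.

```python
# Hypothetical sketch only -- the real formulae are proprietary.
# Assumption: win odds depend on the rating gap via a logistic curve,
# and playing at home is worth a fixed number of rating points.

HOME_EDGE = 3.0   # assumed value of home-court advantage, in rating points
SCALE = 10.0      # assumed scale: how quickly a rating gap tilts the odds

def win_probability(rating_a, rating_b, a_is_home=False, b_is_home=False):
    """Probability that team A beats team B, given ratings and venue."""
    diff = rating_a - rating_b
    if a_is_home:
        diff += HOME_EDGE
    elif b_is_home:
        diff -= HOME_EDGE
    # Logistic curve: equal ratings on a neutral floor -> 50/50.
    return 1.0 / (1.0 + 10 ** (-diff / SCALE))
```

With this shape, two evenly rated teams on a neutral court come out at 50%, and the home team always gets a modest bump, which matches the qualitative behavior described above.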
Just for kicks, let’s randomly assign the Stanford basketball team a 40.2 rating.
Given our formulae, a numerical rating of 40.2 implies a certain number of wins, based on Stanford’s rating, the ratings of their opponents (i.e. schedule strength), and where the games are played. (Margin-of-victory performance plays a role as well.) Let’s imagine that according to our model, a 40.2 rating for Stanford implies 6.1 expected wins out of their 10 games played so far.
If, in reality, Stanford is 7-3, then its numerical rating needs to improve; a 40.2 rating is too low. If Stanford is only 5-5, its rating needs to fall. If Stanford is 6-4, we’ve got them pegged pretty well.
Coming up with numerical ratings for all teams that "make sense" when taken together, especially when there are dozens or hundreds of teams in a league, is consequently a very complex and iterative process. In order for our rankings to be "correct", each team’s expected number of wins, based on implied win probabilities for every game played, must equal its actual number of wins.
This equilibrium is achieved by looking at past statistics, using a variety of mathematical and statistical functions, testing a hypothetical set of ratings, and iterating like mad. It could take weeks, months or years for even a skilled mathematician to do these calculations manually, but today’s computers can solve the puzzle in minutes.
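The iterate-until-equilibrium loop can be sketched in a few lines. This is a toy stand-in, not TeamRankings’ actual solver: it assumes a simple logistic win-probability model, ignores home court and margin of victory, and nudges each rating in proportion to the gap between actual and expected wins. All names and constants (`win_prob`, `fit_ratings`, `step`) are invented for the example.

```python
# Toy equilibrium search, illustrating the idea only. Assumptions:
# logistic win model, neutral courts, no margin-of-victory term.

def win_prob(ra, rb, scale=10.0):
    """Probability that the team rated ra beats the team rated rb."""
    return 1.0 / (1.0 + 10 ** (-(ra - rb) / scale))

def fit_ratings(games, step=0.5, iters=2000):
    """games: list of (winner, loser) pairs. Returns {team: rating}.

    Repeatedly nudges each team's rating toward the point where its
    expected wins (summed win probabilities) match its actual wins.
    """
    ratings = {t: 0.0 for g in games for t in g}
    for _ in range(iters):
        expected = {t: 0.0 for t in ratings}
        actual = {t: 0.0 for t in ratings}
        for winner, loser in games:
            p = win_prob(ratings[winner], ratings[loser])
            expected[winner] += p
            expected[loser] += 1.0 - p
            actual[winner] += 1.0
        # Rating too low -> expected wins fall short of actual; raise it.
        for t in ratings:
            ratings[t] += step * (actual[t] - expected[t])
    return ratings
```

Run on a tiny schedule where A beat B, A beat C, and B beat C, the loop settles into ratings ordered A > B > C, which is exactly the "all the ratings make sense together" condition described above. A real system with hundreds of teams and schedule-strength effects needs far more sophisticated numerics, but the equilibrium target is the same.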