March 17, 2012 - by Gregory Matthews
***NOTE*** This is an entry from the Sweet 16 of our inaugural Stat Geek Idol contest. The opinions or predictions expressed below do not represent the views of TeamRankings.com, and are solely those of the author. This article was conceived of and written by Gregory Matthews of Stats In The Wild.
In the past, I’ve proposed the “Cinderella plot” to graphically display the “madness” of every college basketball fan’s favorite month. I have updated the plot through this evening.
Following the chart, I’ll introduce a new graphic, and a way to measure the madness of March.
It should be clear (if my graph was successful) that 2011 was the maddest of the marches since 2001. An 8 seed playing an 11 seed in the Final Four is quite unexpected.
But exactly how much crazier was 2011 than other years, like 2006, when George Mason went to the Final Four as an 11 seed, or 2002, when we saw a 10 seed (Kent State) and a 12 seed (Missouri) in the Elite Eight? And where do the shocking events of the first round in 2012 stack up against other tournaments? What we need is a way to quantify the madness.
After some reflection on what “madness” means, I developed the “madness coefficient” to quantify the unexpectedness of a tournament. (Some details of how to determine the “madness” can be found at the end of the article.) I then extended this measure to assess the madness of each round of the tournament, resulting in this fine balloon plot.
2011 was the “maddest” year overall since 1990, but it didn’t start out that way. A fairly boring first round was followed by a second round that was not far out of the ordinary. Then things got crazy. The Elite Eight featured only one 1 seed along with a 5, 8 and 11 seed, which led to an 8 vs 11 match-up in the Final Four.
2001 had the craziest first round since 1990 (until tonight): fans saw all four 9 seeds win, the 10, 11, 12, and 13 seeds each win two of their four games, and Hampton, a 15 seed, knock off Iowa State. 1991, another great opening round, saw every seed from 9 to 15 win at least one game and featured the rare second-round 10 vs 15 match-up (which we will be treated to this year courtesy of Xavier and Lehigh). In contrast, 2000 featured only three first-round upsets in total, two by 10 seeds and one by an 11 seed.
And then there is 2012. It’s hard to wrap my brain around what happened today.
On the first day of the tournament only two lower seeds won a game (Colorado (11) and VCU (12)), and it looked like we might be headed for a pretty boring year. Then today happened, and eight of the games were won by the lower seed, including an 11, a 12, a 13 and TWO 15 seeds. (Of course, I’ve been hyping up 14 seeds all week and they are the only seed, excluding 16, not to win a game.)
This has clearly been the “maddest” first round in the tournament since 1990, and my measure reflects that. Finally, I’ve projected the madness score forward based on the higher seed winning all remaining games, which, for the sake of madness, I hope doesn’t happen, because what I’d really like to see is Lehigh playing Norfolk State with a berth in the National Championship on the line. That would be madness!
First, I needed to come up with a way to measure the madness of an individual tournament. I calculated a “game madness” score by taking the squared difference between the seed that actually won and the seed that is “supposed” to make it to that point in the tournament. For example, when a 1 seed wins, as it is supposed to, the “game madness” score is (1-1)^2=0. There is no madness when a 1 seed advances. As an example of what happens in an upset, in 2011 when Butler, an 8 seed, beat number 1 seed Pittsburgh, the difference between the seed that actually won (8) and the seed that should have won (1) is 7, which squared is 49, the “game madness” score.
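A minimal Python sketch of this per-game calculation follows. The function name game_madness and its argument names are hypothetical, chosen only for illustration.

```python
def game_madness(winning_seed: int, expected_seed: int) -> int:
    """Squared gap between the seed that won and the seed 'supposed' to advance."""
    return (winning_seed - expected_seed) ** 2


# The two examples from the text: a 1 seed advancing, and Butler (8) over Pittsburgh (1).
print(game_madness(1, 1))  # 0
print(game_madness(8, 1))  # 49
```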
I calculated this quantity for every game in a tournament and then weighted all of the “game madness” scores according to the round in which they occurred, with heavier weights assigned to games later in the tournament. So for instance, in 1997 when Coppin State as a 15 seed beat South Carolina, the game madness was (15-2)^2=169, but this received a low weight because it occurred early in the tournament. By contrast, when Butler beat VCU in the Final Four in 2011, the “game madness” score was again (8-1)^2=49 (in reality they beat an 11 seed, but there was “supposed” to be a 1 seed there), and this game received a much larger weight because it occurred in the Final Four.
Then, to come up with a madness score for the overall tournament, I simply add up all of the individual game madness scores, each weighted by the round in which it occurred. This leaves me with a raw madness score for a tournament.
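The weighted sum could be sketched in Python as below. The actual round weights are not given here, so the values 1 through 6 are placeholders that merely increase with the round, and the (round, winning seed, expected seed) tuple layout is an assumed input format.

```python
# Hypothetical weights: round 1 (first round) through round 6 (championship).
# These are placeholders, not the weights used for the real madness scores.
ROUND_WEIGHTS = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0, 6: 6.0}


def raw_madness(games, weights=ROUND_WEIGHTS):
    """Sum of round-weighted squared seed gaps over all games in a tournament.

    games: iterable of (round_number, winning_seed, expected_seed) tuples.
    """
    return sum(
        weights[rnd] * (won - expected) ** 2
        for rnd, won, expected in games
    )


# Two games from the text: a 15-over-2 upset in round 1, and an 8 seed
# winning a Final Four game (round 5) where a 1 seed was expected.
example_games = [(1, 15, 2), (5, 8, 1)]
print(raw_madness(example_games))  # 1*169 + 5*49 = 414.0 under these placeholder weights
```

With steeper weights the Final Four game would dominate the first-round upset even more strongly, which is the intent of weighting later rounds more heavily.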
Next, I simulated the NCAA tournament 10,000 times, each time calculating a tournament madness score. Then I took an actual tournament madness score, compared it to all of the simulated madness scores, and defined the “madness coefficient” to be the percentage of simulated tournament madness scores that were smaller than the actual tournament madness score for a given year. So for instance, the 2011 tournament madness score was larger than a little over 8,000 of the 10,000 simulated tournaments.
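A rough Python sketch of this comparison is shown below. How the tournaments were actually simulated is not described here, so everything about the simulation is an assumption made for illustration: the win probability b / (a + b) for seed a against seed b, the standard 16-seed bracket order, and the placeholder round weights 1 through 6. Only the final percentage step follows the definition above directly.

```python
import random

# Standard first-round pairing order within one region (1 v 16, 8 v 9, ...).
BRACKET_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]
ROUND_WEIGHTS = [1, 2, 3, 4, 5, 6]  # hypothetical placeholder weights


def play(a, b, rng):
    """Pick a winner between seeds a and b. Assumed model: P(a wins) = b / (a + b)."""
    return a if rng.random() < b / (a + b) else b


def simulate_raw_madness(rng):
    """Simulate one 64-team tournament and return its raw madness score."""
    total = 0.0
    champs = []
    for _ in range(4):  # four regions
        # Each slot holds (actual seed, best seed that could occupy the slot).
        field = [(s, s) for s in BRACKET_ORDER]
        for rnd in range(4):
            nxt = []
            for i in range(0, len(field), 2):
                (a, exp_a), (b, exp_b) = field[i], field[i + 1]
                winner = play(a, b, rng)
                expected = min(exp_a, exp_b)
                total += ROUND_WEIGHTS[rnd] * (winner - expected) ** 2
                nxt.append((winner, expected))
            field = nxt
        champs.append(field[0][0])
    # Final Four and championship: a 1 seed is "supposed" to win each game.
    finalists = []
    for i in (0, 2):
        winner = play(champs[i], champs[i + 1], rng)
        total += ROUND_WEIGHTS[4] * (winner - 1) ** 2
        finalists.append(winner)
    champion = play(finalists[0], finalists[1], rng)
    total += ROUND_WEIGHTS[5] * (champion - 1) ** 2
    return total


def madness_coefficient(actual, simulated):
    """Percentage of simulated raw scores strictly smaller than the actual score."""
    return 100.0 * sum(1 for s in simulated if s < actual) / len(simulated)


rng = random.Random(2012)
simulated_scores = [simulate_raw_madness(rng) for _ in range(10_000)]
# Any real tournament's raw score (computed with the same weights) can be
# passed in place of this illustrative value.
print(madness_coefficient(simulated_scores[0], simulated_scores))
```

Under this definition, a year whose raw score beats a little over 8,000 of 10,000 simulations has a madness coefficient a little over 80.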