Madness Strikes November: Introducing Our Brand New NCAA Bracket Predictions

posted in > Site Updates, NCAA Basketball, NCAA Tournament

A few days ago when we released our 2012 preseason college basketball top 25, we hinted that we had an exciting new feature that we were just itching to let loose into the wild. Well, here’s what we’re so amped about:

We are now simulating the entire college basketball season every single day, all the way from November through April. Our projections now include:

  • Conference tournaments
  • NCAA selection and seeding
  • The NCAA tournament itself

This means that every single day, you can count on TeamRankings to deliver intelligent, up-to-date, algorithmically derived odds for thousands of future college basketball outcomes, from Kentucky’s chance to land a 1 seed in the NCAA tournament to Grambling State’s odds to make the 2013 March Madness bracket.

There are dozens of postseason predictions and probability distributions for each and every one of the 347 Division I men’s basketball teams. A lot of these predictions, of course, relate to March Madness.

We’ve been working on this bad boy for months, and we’ve still got a few kinks to iron out. But there’s a whole lot of analytical firepower under the covers and we’re very excited about where we can go with this project.

Why Are These Bracket Predictions Different / Awesome / Better?

  • Most bracketologists operate under the following mantra: “If the season ended today, here’s how I think the NCAA bracket would look.” Maybe a few of them try to work in some rough projections of how the rest of the season could play out for a few key teams, but that’s rare. This whole approach is downright silly, especially during the first, oh, 90% of the season. Differences in remaining schedule strength, probable conference tournament seeds, and likely conference tournament opponents can make a HUGE difference in a team’s NCAA tournament selection and seeding fortunes. The season doesn’t end today. We simulate out everything that could still happen; other people don’t.
  • We then go one big step further and actually project each team’s numerical odds to make the NCAA tournament, and then to survive every successive round, with detailed probability distributions behind every projection. They tell you, “Kansas should make the tourney, and I have them as a 1 seed.” We now tell you, “Kansas has a 90% chance of making the 2013 NCAA tournament. While their most likely seed is a 1, the odds of that happening are actually only 22%, meaning that they are more likely to NOT get a 1 seed than to get a 1 seed. Overall, Kansas currently has a 4% chance of being 2013 NCAA champions, but those odds would increase to 11% if they are able to secure a 1 seed.” We actually tell you a lot more than that, but I’ll cut it off there for now.
  • We update every single projection mentioned above every day. Not every week. Not every few days. Every morning, based on the results of the previous day’s games, which factor into our power ratings. All the calculations are automated.
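As a rough illustration of where numbers like the ones in the Kansas example come from, here is a minimal sketch that reads them off a set of simulated seasons. The record layout (a seed, or `None` when the team misses the field, plus a champion flag) and the ten toy records are assumptions made up for this example; they are not the actual data format or real projections.

```python
def summarize(sim_results):
    """Summarize one team's outcomes across simulated seasons.

    sim_results: list of (seed, champion) tuples, one per simulation.
    seed is an int from 1 to 16, or None if the team missed the tournament.
    """
    n = len(sim_results)
    made = [r for r in sim_results if r[0] is not None]
    one_seed = [r for r in made if r[0] == 1]
    titles = sum(1 for r in sim_results if r[1])
    titles_as_one = sum(1 for r in one_seed if r[1])
    return {
        "p_bid": len(made) / n,
        "p_one_seed": len(one_seed) / n,
        "p_champion": titles / n,
        # Conditional odds: only count the simulations where a 1 seed happened.
        "p_champion_given_one_seed": (
            titles_as_one / len(one_seed) if one_seed else None
        ),
    }


# Toy data: 10 hypothetical simulations for a single team.
sims = [(1, True), (1, False), (2, False), (3, False), (None, False),
        (2, False), (4, False), (1, False), (5, False), (2, True)]
summary = summarize(sims)
```

The conditional figure is computed only over the simulations in which the team landed a 1 seed, which is why it can differ so much from the overall title odds.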

Interesting Stuff To Learn Here

The level of analysis we are doing will hopefully facilitate a much better popular understanding of the dynamics of things like season outcomes and NCAA tournament seeding. For example, most fans would probably assume that the probability distribution of a team’s expected NCAA tournament seed would look like a smooth bell curve, but that’s rarely the case.

Let’s say you think Gonzaga is most likely going to end up getting a 3 seed. From that point, most humans would probably reason that their second most likely seeding would be either a 2 or a 4, then after that maybe a 5 or a 6, or as low as a 7 if they finished out the season poorly.

Yet the model’s simulations don’t end up with a single smooth peak. It sees Gonzaga with a good chance of getting a 2 through 4 seed, but the next most likely outcome is a 7 or 8 seed.

We haven’t analyzed every projected outcome in depth yet, but such an effect may well reflect the difference between the Bulldogs winning their last couple games (and so also the WCC tourney) and faltering late. The team quality in the two cases is similar, but treatment by the committee may not be.
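One way to see how a two-humped seed distribution can arise is to blend two scenarios. The sketch below is purely illustrative: the scenario names, their probabilities, and the seed odds inside each scenario are invented numbers, not output from the model.

```python
from collections import Counter

# Hypothetical scenarios: (probability of scenario, seed odds within it).
scenarios = {
    "wins out":     (0.7, {2: 0.3, 3: 0.4, 4: 0.3}),
    "falters late": (0.3, {5: 0.1, 6: 0.1, 7: 0.4, 8: 0.4}),
}


def overall_seed_distribution(scenarios):
    """Blend per-scenario seed odds into one overall distribution."""
    total = Counter()
    for p_scenario, seed_odds in scenarios.values():
        for seed, p in seed_odds.items():
            total[seed] += p_scenario * p
    return dict(sorted(total.items()))


dist = overall_seed_distribution(scenarios)
for seed, p in dist.items():
    print(f"{seed:>2} seed: {p:.0%}")
```

With these made-up inputs, the blended distribution peaks at a 3 seed, dips at the 5 and 6 lines, and rises again at 7 and 8, which is the kind of shape described above.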

As the season plays out, we should be able to glean a lot of interesting insights from all this data.

A Quick 2013 NCAA Bracket Teaser

As of November 8, 2012, one day before the season starts, here’s our official projection of the 2013 NCAA Tournament bracket come March:

[Image: projected 2013 NCAA Tournament bracket, preseason edition]

[Quick update/note: We’re not worried about following the NCAA’s bracketing rules here. We know that, for example, Georgetown can’t play Cincinnati in the first round. Our goal here is to show expected seed lines for each team, and give an idea of the rough quality of opponent they might face in each round. Trying to predict actual bracket matchups at this point is, well, pointless.]

More on this bracket in a bit.

Where Can You Find The New Bracketology Projections?

Right now, we have two types of pages displaying this new info.

NCAA College Basketball Bracketology Summary Page

Currently accessible via the “Bracketology” link in the left green sidebar of our college basketball section, this page shows the following for every college basketball team:

  • Odds to make the NCAA tournament (including whether via automatic/at-large bid)
  • Average seed projection (for top teams, this will be skewed toward worse seed numbers, since averaging 1 with any other possible seed gives you a number greater than 1)
  • Odds of receiving a #1, #2, #3, or #4 seed
  • Odds to win the NCAA tournament
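To illustrate the average seed point with arithmetic, here is a small sketch. The seed odds are invented for the example, and the average is taken only over outcomes where the team makes the field, which is an assumption about how such a number might be defined.

```python
def average_seed(seed_odds):
    """Probability-weighted average seed, given {seed: probability}.

    Probabilities are renormalized, so they may describe only the
    simulations in which the team made the tournament.
    """
    total = sum(seed_odds.values())
    return sum(seed * p for seed, p in seed_odds.items()) / total


# Hypothetical top team: a 1 seed is the single most likely outcome...
top_team = {1: 0.60, 2: 0.25, 3: 0.10, 4: 0.05}
# ...yet the average lands well above 1, because no seed is better than 1.
print(round(average_seed(top_team), 2))
```

Here the weighted sum is 0.60 + 0.50 + 0.30 + 0.20 = 1.6, even though a 1 seed is the most likely single result.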

Team Bracketology Pages

Found via the “Bracketology” link in the left gray pullout menu of any college basketball team page (e.g. Kansas, Syracuse, San Jose State), every team bracketology page is linked to from the master bracketology summary page. These let you drill down to more detailed info about a team’s bracketology projection, including:

  • Projected NCAA seed distribution
  • Odds to make the NCAA tournament based on final record (counting currently scheduled games only, so some early season tournaments won’t be accounted for yet)
  • Odds to advance to each round of the NCAA tournament

These pages are in a very rough “version 1” form right now, but we wanted to get them out and see what people thought. We intend to keep making them better over the course of the season, as we’ve got a lot more fun data on teams to show.

Projected 2013 NCAA Bracket, Updated Daily (Coming Soon)

You may notice there is one major thing missing from the new pages above — a single projected 2013 NCAA tournament bracket. Sure, all these projected odds are great, but you wanna see the most likely end result, right?

We’re working on it. We’re currently saving a new projected bracket to our database every day, and the next step is to get that info up on the site for all to see. We expect to have that ready for public consumption next week. To tide you over until then, we included our official preseason bracket above.

We acknowledge that there are a couple of head scratchers in there right now, St. Louis as a #2 seed in particular. As with any modeling project of similarly massive scale, there are still a couple of kinks we need to iron out in our logic, and that’s part of why the automated bracket page isn’t live on the site yet. However, as the season progresses, and teams actually play a few games, outliers like St. Louis should become much rarer.

How Do We Create Bracket Predictions?

We do the following 5,000 times, then report the results.

1. Simulate The Regular Season

Based on our team power ratings, we predict the outcome of every remaining game in the 2012-2013 Division I college basketball season. Early in the season, the simulations are based heavily on our college basketball preseason ratings. Later in the year, those ratings will become less important, and actual team performance will take precedence.
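For readers curious what "predicting" a game from ratings can look like inside a simulation, here is a minimal sketch. It assumes ratings are expressed in points and that the final margin is roughly normally distributed around the rating difference; the spread (`sigma`), the home edge, the team names, and the function names are all illustrative assumptions rather than the actual model.

```python
import math
import random


def win_probability(rating_a, rating_b, home_edge=0.0, sigma=11.0):
    """P(team A wins), treating the final margin as roughly normal.

    home_edge and sigma are illustrative values, not fitted parameters.
    """
    expected_margin = rating_a - rating_b + home_edge
    return 0.5 * (1.0 + math.erf(expected_margin / (sigma * math.sqrt(2.0))))


def simulate_game(rating_a, rating_b, rng, home_edge=0.0):
    """Return True if team A wins one simulated game."""
    return rng.random() < win_probability(rating_a, rating_b, home_edge)


def simulate_schedule(ratings, games, rng):
    """Play every remaining (home, away) game once; return win counts."""
    wins = {team: 0 for team in ratings}
    for home, away in games:
        home_wins = simulate_game(ratings[home], ratings[away], rng,
                                  home_edge=3.5)
        wins[home if home_wins else away] += 1
    return wins


rng = random.Random(2013)  # fixed seed so the sketch is repeatable
ratings = {"Team A": 10.0, "Team B": 0.0, "Team C": -5.0}
games = [("Team A", "Team B"), ("Team B", "Team C"), ("Team C", "Team A")]
season = simulate_schedule(ratings, games, rng)
```

Running `simulate_schedule` many times with different random draws yields a distribution of records for each team rather than a single predicted record.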

2. Seed & Play Out Conference Tournaments

Based on end-of-season win/loss records and conference standings that result from each season simulation, we create conference tournament seedings and brackets. As with our season simulation, we then semi-randomly pick winners for every conference tournament game based on team ratings, round by round, until we have all our conference tournament winners.

Accounting for the various formats of conference tournaments is actually a huge pain in the butt, and it took a long time to track down all the appropriate data and get it right. There are still a few minor tweaks we need to make to handle some freak tournaments that re-seed teams after the first round, but it’s very close now.
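The simplest version of this step can be sketched as follows. This is a toy under stated assumptions: the field size is a power of two (no byes), ties in the standings are broken alphabetically, and the bracket is fixed rather than re-seeded. The function names and sample records are hypothetical.

```python
import random


def seed_teams(records):
    """Order teams by conference wins (alphabetical tiebreak, a simplification)."""
    return sorted(records, key=lambda team: (-records[team], team))


def play_bracket(seeded, win_prob, rng):
    """Play a fixed single-elimination bracket and return the champion.

    Each round, the first remaining slot meets the last, the second meets
    the second-to-last, and so on. Starting from seed order this gives the
    usual 1v8, 2v7, 3v6, 4v5 pairings, with winners keeping their slot.
    """
    alive = list(seeded)
    while len(alive) > 1:
        half = len(alive) // 2
        next_round = []
        for i in range(half):
            slot_a, slot_b = alive[i], alive[-1 - i]
            a_wins = rng.random() < win_prob(slot_a, slot_b)
            next_round.append(slot_a if a_wins else slot_b)
        alive = next_round
    return alive[0]


records = {"A": 10, "B": 12, "C": 8, "D": 12}  # simulated conference wins
seeds = seed_teams(records)
champ = play_bracket(seeds, lambda a, b: 0.5, random.Random(1))
```

Real conference formats add byes, double byes, and re-seeding, which is where most of the bookkeeping effort described above would go.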

The results of all these simulations are shown on our college basketball projected standings page, as well as on various team pages that are linked from the projected standings page.

3. Simulate NCAA Tournament Selection & Seeding

This is the new glitzy part that we’re super excited about.

We spent some time this summer developing a computer model to predict the decisions of the NCAA selection committee. [Technically it’s 2 models — one for selection and one for seeding — but they’re pretty similar.] We looked at how data points like RPI, record vs. the RPI Top 25, conference win percentage, record in last 10 games, schedule strength, and predictive power ratings could be combined to mimic the past selection and seeding results.

Our model certainly isn’t perfect, but we tested it on data from Selection Sunday 2012, and it would have placed in the top half of the bracket project. Of course, one season could be a stroke of luck, so we’re curious to see how we do this year.

But the main point is that these are fully automated algorithms, and even at their biggest disadvantage — when there are no more games to play in a season, and no “projecting the rest of the season” edge over humans — our automated logic still did better than most self-professed “bracketologists” last year at predicting team selection and seeding.

During every simulation, we keep track of a team’s selection resume, including all those nitty gritty details like projected RPI and record vs. the final projected RPI top 25, and we feed those resumes into the model. The model spits out projected selection and seeding odds, and then we semi-randomly seed the tournament using those odds. (We don’t just rank the teams in order of our projected odds; we add some randomness because we know that our model isn’t perfect, and that the committee can at times make some quirky decisions.)
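The "semi-random" idea can be sketched by adding noise to a model score before ranking. Everything here is an assumption for illustration: a single combined score stands in for the separate selection and seeding models, the noise is Gaussian with an arbitrary spread, and the function names are hypothetical.

```python
import random


def pick_field(model_scores, auto_bids, field_size, rng, noise_sd=0.5):
    """Choose and order a tournament field from noisy model scores.

    model_scores: {team: score}, higher meaning a stronger resume.
    auto_bids: set of conference tournament winners (always included).
    """
    noisy = {team: score + rng.gauss(0.0, noise_sd)
             for team, score in model_scores.items()}
    ranked = sorted(noisy, key=noisy.get, reverse=True)
    at_large_slots = field_size - len(auto_bids)
    at_large = set([t for t in ranked if t not in auto_bids][:at_large_slots])
    # Keep the noisy ranking order so it can double as the seed order.
    return [t for t in ranked if t in auto_bids or t in at_large]


def seed_lines(field, teams_per_line=4):
    """Assign seed lines in order: first four teams are 1 seeds, and so on."""
    return {team: i // teams_per_line + 1 for i, team in enumerate(field)}


scores = {"A": 5.0, "B": 4.0, "C": 3.0, "D": 2.0, "E": 1.0}
field = pick_field(scores, auto_bids={"E"}, field_size=3,
                   rng=random.Random(7), noise_sd=0.0)
lines = seed_lines(field, teams_per_line=2)
```

With `noise_sd=0.0` the result is just the straight ranking; raising it lets bubble teams swap places from one simulation to the next, which is what turns a single bracket into a distribution of seeds.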

Keep in mind that early in the season, a team’s selection resume is mostly based on the results of our simulation. So if we have a team rated too high or too low, their selection and seeding related odds will also be off. As the season progresses, though, more and more of a team’s simulated NCAA tournament resume will consist of things that have already happened, less of it will depend on our projections, and our selection and seeding projections should become more accurate.

4. Calculate NCAA Tournament Advancement Odds

Finally, the payoff. After simulating the season and the selection committee, playing out the NCAA tournament is actually pretty simple in comparison. We start with the projected bracket and team ratings, apply a little math, and poof…we end up with odds to advance to every specific round, for each team.
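One standard way to turn a bracket and a set of ratings into round-by-round odds is sketched below. It is a generic textbook calculation, not necessarily the method used here, and the logistic win curve, its scale, and the four sample ratings are assumptions for illustration.

```python
def advancement_odds(bracket, win_prob):
    """Exact odds of winning each round in a single-elimination bracket.

    bracket: teams in bracket order (length must be a power of two), so
    neighbors meet in round 1, winners of neighboring games in round 2, etc.
    Returns {team: [P(win round 1), P(win round 2), ...]}.
    """
    n = len(bracket)
    reach = {team: 1.0 for team in bracket}  # P(alive entering this round)
    odds = {team: [] for team in bracket}
    size = 2
    while size <= n:
        new_reach = {}
        half = size // 2
        for start in range(0, n, size):
            left = bracket[start:start + half]
            right = bracket[start + half:start + size]
            for side, other in ((left, right), (right, left)):
                for team in side:
                    # Weight each possible opponent by its chance to be there.
                    p_win = sum(reach[opp] * win_prob(team, opp)
                                for opp in other)
                    new_reach[team] = reach[team] * p_win
        reach = new_reach
        for team in bracket:
            odds[team].append(reach[team])
        size *= 2
    return odds


ratings = {"A": 12.0, "B": 8.0, "C": 5.0, "D": 0.0}  # hypothetical ratings


def win_prob(team, opponent, scale=10.0):
    """Logistic win curve on the rating gap (scale is an arbitrary choice)."""
    gap = ratings[team] - ratings[opponent]
    return 1.0 / (1.0 + 10 ** (-gap / scale))


odds = advancement_odds(["A", "D", "B", "C"], win_prob)
```

The last entry in each team's list is its title probability, and those entries sum to 1 across the bracket.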

Right now, the best place to view those odds is on each individual team’s bracketology page. For example, check out Michigan State. The chart at the bottom shows the Spartans’ odds to advance to every round of the tournament, and the table at bottom right shows how their odds to win the entire tournament change based on what seed they get.

Coming Soon

This system is new, and like any new modeling project of this scale, it’s still a work in progress. We’re planning on adding more info to the site as the season goes along. At the top of our list right now:

  • An automated daily official bracket projection, in a pretty bracket format
  • Full NCAA seed odds tables
  • NCAA round-by-round advancement table

If you’ve got any suggestions, requests, or questions then please leave a comment in the discussion thread below. We’re definitely open to any bright ideas you might have for interesting data that we could pull out of these projections and display.

In the meantime, we’re going to keep pushing the envelope with college basketball and March Madness related predictive modeling, and work to make these initial pages and data presentations even better. Hopefully today’s announcement is just the beginning of an exciting new chapter in what’s come to be known as bracketology. Like Nate Silver did with election predictions, our goal here is to use data to trump the human pundits.

  • dukie fan

    Hey love the info. Is there a specific statistic that you guys weigh very heavily, considering that Michigan is a projected ten seed / St. Louis a two seed?

  • David Hess

    Right now, these are basically just based on combining last year’s power ratings along with info on returning starters, recruits, and transfers. We talk about Michigan some in our preseason rating post here:

    The basic idea is that they were not super highly rated last year, and only return an average amount of our value metric (which incorporates minutes, offensive efficiency, and usage rate).

    As for St. Louis … yeah, I am a bit surprised about that one. Sometimes the numbers tell us things that we don’t necessarily agree with. We’ll definitely be taking a closer look at the model and seeing if there is some strange situation that’s leading to that. After all, they are only 18th in our preseason ratings. Not sure why that would translate to a #2 seed.

  • Dave in Indy

    NC State as a 6th seed, for the ACC preseason pick to win the conference, seems like a glaring underseed. Tourney tested last year, and returning the bulk of the team, should give them a great shot at making the trip to Atlanta in April. Should be at least a low 2 seed/ high 3rd seed.

  • Sean

    This is awesome. And, will be more useful obviously in the middle portion of the season. Were you surprised with only 4 A-10 teams or did that seem right at this point of the season? I guess I was surprised to see Butler and VCU left out while UMass was included. Not to criticize UMass, but by the system you have in place.

  • ThreeFortyFive

    I think there need to be some restrictions added to the seeding model to better mirror the selection committee’s human behavior. For instance, I think you need to take conference into consideration. There have to be restrictions in place to decrease the likelihood of two teams from the same conference being matched up in the first or second round, to prevent the auto qualifier from a well respected midmajor (Ivy) from getting a 16 seed, etc.

  • Adam

    Is there something in the formula that accounts for Michigan not having to play a 6’4″ guy at the 4/5 this year?

  • David Hess

    Well, the Princeton issue does look a little off, and there is definitely room for improvement in the seeding algorithm, but I think it’s one of those cases that should resolve itself by the end of the year.

    First, they *do* have the 5th-lowest preseason rating out of the projected conference winners, so it makes sense they’d be down on the 16 line right now. If that rating is too low, it will move up as actual season results start rolling in.

    Second, this bracket basically projects no conference tournament upsets. Some will definitely happen, and a couple of those would push worse teams into the bracket, moving Princeton up to the 15 seed line.

    In fact, if you look at Princeton’s individual bracketology page, you’ll see their most likely seed is #15, followed by #14, and THEN #16:


    As for the conference matchups in the early rounds — we’re not really worried about that. Our goal here isn’t to project the exact matchups, it’s just to get the seeding right. The matchups are more to give you an idea of the *quality* of opponent that a team might face in each round, not an exact opponent.

    There are *many* rules about where exactly teams can be seeded, and it would be a pretty complex task to account for all of those in a completely automated bracket. … And then, even after putting in that work, I’m sure we’d be wrong 90% of the time, as one small change in the seed order has a butterfly effect that can change matchups all over the bracket. So basically, I’m saying that I’d love to try to account for the bracketing rules, but I think in the end it wouldn’t be worth the time and effort.

  • David Hess

    That one is mostly a result of our preseason ratings being down on NC State compared to the national consensus. Last year we did a pretty darn good job of identifying overrated teams in the preseason (see the “Overrated” section here, where we look at last year’s results), so I’m not overly worried about that one being off. If NC State actually plays well in the first half of the year, that will correct itself by March.

    By the way, the “tourney tested” idea is one of the biggest reasons we think they’re being overrated by the poll voters. Check out the link I posted here; it has an explanation for why NC State is rated lower than you might expect.

  • David Hess

    I think four A-10 does sound a little low, but if we made a “Last 4 Out” list, I’m pretty sure VCU would be on it. U Mass is #46 in our preseason ratings, VCU is #49, Butler is #59, and LaSalle is #60. So they could very easily have had 7 teams in the bracket if things were just a little different in our ratings.

  • David Hess

    Ha, no, not specifically. :)

    However, looking at it from a modeling perspective (that is, ignoring Michigan’s particular details and just focusing on the generic case) … if they played a 6’4″ guy there last year, and now he’s gone, then they will likely get a height upgrade, but the replacement is probably either A) somebody who was not good enough to earn the spot last year, or B) a freshman. Neither of those would lead the model to assume the team will make a big jump forward.

    That said, I do think we have Michigan rated too low right now. I also don’t think they are a top 5 team. Their true value is probably somewhere in the middle.

  • constitutionforever

    Who made these projections, Larry, Curly, or Moe? By the way you have Virginia and UNC possibly meeting in the second round. Two teams from the same conference are not allowed to meet until the Elite Eight. Back to the drawing board.

  • David Hess

    Yeah, as I told a commenter below:

    “As for the conference matchups in the early rounds — we’re not really worried about that. Our goal here isn’t to project the exact matchups, it’s just to get the seeding right. The matchups are more to give you an idea of the *quality* of opponent that a team might face in each round, not an exact opponent.

    There are *many* rules about where exactly teams can be seeded, and it would be a pretty complex task to account for all of those in a completely automated bracket. … And then, even after putting in that work, I’m sure we’d be wrong 90% of the time, as one small change in the seed order has a butterfly effect that can change matchups all over the bracket. So basically, I’m saying that I’d love to try to account for the bracketing rules, but I think in the end it wouldn’t be worth the time and effort.”

  • John Tatum

    How was VCU excluded from the projected tournament field?

  • David Hess

    Did you read “How We Create Bracket Predictions” above? We went through that process, and VCU didn’t make it. Which essentially means our preseason rating for them was low enough that they were on the bubble. If you check out the VCU page, we have them with about a 45% chance to make the tourney, right on the edge of being in:

  • Give New Jersey Power

    You need to redo Maryland with Dez. No way are they as crappy as you project, but time will tell.

  • David Hess

    Sorry to burst your bubble, but we already included Dez Wells on Maryland’s squad.

  • RazorSheldon

    St. Louis as a #2 seed? Not happening. The only mid-major #1 or #2 seeds were in crappy conferences (see Memphis or San Diego State) and had 2 or 3 losses all season. St. Louis will have at least 5 or 6 losses this year…

  • David Hess

    Yeah, personally I agree with you. And actually, after the first day of games, St. Louis has already slid down to about the #4 seed line.

  • texnole

    Gotta admit that at first I was a little perplexed by you not having #25 preseason ranked Florida State in your bracket field. After their opening day loss to South Alabama, it now seems valid.

  • David Hess

    Ha, while it’s always nice when a particular game result makes our predictions look smart, I wouldn’t read too much into a single loss. It’s still *way* too early to judge our bracket projection, good or bad. :)

  • Brent Rogus

    Thanks guys! Love the page!
    The only thing I would quibble with is that the sorting is done by chances of a #1 seed. It would seem more reasonable to me to sort by the probability of a tourney bid or the probability of winning the NCAA tourney rather than P(1 seed).

  • David Hess

    Thanks for the suggestion. Probably not a bad idea. However, we’re planning on updating the page to allow sorting by any column, so until we get time to do that we’ll probably just leave it as is.

  • 808Zag

    This is awesome. One of the most comprehensive sites I have seen in a while with a process that makes sense and still has room for improvement. Yes it could do with a little tweaking, but like you said above, we have 90% of the season left, so I think it is alright as long as it gets updated by crunch time.

  • David Hess

    Thanks, I’m glad you like it! That’s about how I feel about it — I’m just excited we have it up and running, so we can finally make little tweaks to the logic and immediately see the results.

    It’s been fun tracking the effects of big wins and losses on teams’ seeds.

  • 3blind

    This is probably the worst site I’ve seen. Less than 50% by far vs the spread.


    Um…sure, whatever you say. Care to share how you came to that “less than 50% by far” number?

  • 3blind

    If you take juice into consideration and if you’re consistent with the 3, 2, 1, star plays, it’s below 50%. Been checking it out the last couple weeks.

  • 3blind

    If you were to say each ‘star’ was $100, since the 13th, you’d be down over $2k…..

  • 3blind

    Yesterday is yet another losing day vs. the spread in ncaab….down $850 if every star is worth $100.

  • David Hess

    OK, there are two problems here:

    1) At typical -110 odds, we do not recommend betting our 1-star picks. There is an “About Our Star Ratings” link at the top of every picks page. Please check it out: … Also note that we show the projected cover odds for each pick, and the 1-star picks all have odds below 52.4% (needed to break even at -110).

    2) A couple of weeks is a minuscule sample for a low-margin activity like spread betting. Over that time frame, random variation is going to play a HUGE role in the results. Please check out our Prediction Accuracy history for a better representation of our models’ accuracy over the long term:

    If you don’t like the site/picks, feel free to go elsewhere.

  • David Hess

    [MODERATOR NOTE: We do not recommend this strategy. Please see our comment above.]

  • 3blind

    So, do you just recommend the 2, 3 star plays if each star was worth $100 or would you suggest only playing the 3-star plays? Thanks


    3blind, we don’t recommend anything. We publish computer picks and their associated odds to win as calculated by our math models, and then publish all of the prediction results for all sports dating back for as many years as we’ve been tracking. How you choose to use (or not use) that information is up to you.

    Stars are just a simple way for us to visually group picks; it sounds like you’re reading too far into them. You’ll see that a 55% pick will get 3 stars but a 54.1% pick will only get two. Those picks have nearly the same confidence, though. The most important delineation is that 2-star picks and above are picks the models expect to be profitable over the long term — that is, over multiple seasons and thousands of picks, assuming -110 payout odds.

    We’ve made over 10,000 picks at the 2-star or better level across all the sports we cover in the past four years, and have a profitable record. Not much else to say. Within that span, we’ve had and will continue to have losing days, weeks, months, and seasons — probability and randomness dictate as much.

    Given the types of comments you’re making, it’s pretty clear your approach to sports betting and your expectations are very different than what we do here. We’ve explained things enough; either use the site or don’t.

  • 3blind

    Gotcha, what you’re saying is your site is no better than the others. A dime a dozen. Thought this one was different….it’s obviously not. I’ll spread the word.

  • 3blind

    Wow, what a surprise, another two losing nights….

  • fijidreamer

    When you say a team has a 4% chance of winning it all….are you saying that out of 100 total points, that team gets 4 (which may be the highest single team rating) or are you saying that in your algorithm that team wins 4% of the simulations?

  • David Hess

    The second option — they won it all in 4% of simulations.