Today on their new “Regressing” blog, Deadspin republished a blog post by Michael Lopez that looked at NFL point spread pick performance so far this season by some of the higher profile data-driven prediction sites, TeamRankings included. What originally was meant to be a few sentences turned into a blog post of its own, so I’ve put it here for everyone to read, and am interested to hear any comments.
I had seen Michael’s original post yesterday on his blog, and was pondering a response, but got distracted by other work. Once the piece hit the front page of Deadspin, though, I felt compelled to reply. I actually thought the premise of the article was quite interesting: Are number crunching sites really doing poorly this year on NFL betting picks? If so, how widespread is this phenomenon and what could be some possible reasons for it?
However, the more into it I got, the more problems I had with the actual content of the piece. So if you’re curious to hear “the other side of the story,” read the link above, and here’s my response below.
Update 10/25 @ 11:46 pm ET: Both Michael Lopez and Kyle Wagner from Deadspin responded to the comment I left on the piece. I’m all for civilized debate, and you can see some of Michael’s rebuttals in the comments below. I also did some more thinking about this piece over the last few hours, which is reflected in an email I just sent to Kyle. I’ve pasted that email in directly below. My original response to Michael’s post appears afterwards.
Just saw your reply to my comment today. Happy to chat some more if you’d like. As I hope I got across to Michael, I’m completely fine with the premise of pointing out what you guys think is an odd anomaly in early season ATS pick performance from a group of popular numbers-driven prediction sites, and I do think there is some interesting stuff here in terms of the analysis Michael did.
Obviously sites like TR, FO, and NF are under the microscope because we sell our picks, so the fact that two of those three seem to be doing quite poorly and one is 45% ATS picking all 100 or so games so far this year is probably newsworthy to some degree, sure. (Selfishly, I’m annoyed we were grouped in with two sites doing significantly worse than us so far, but oh well.) However, the basic takeaway — that because two of the grand total of three numbers-driven prediction services whose early season results you happened to look at appear destined for a sub-50% season for one type of betting pick in one sport, “It could be a sign that these websites aren’t all they’re cracked up to be” — is a misleading and uninformed generalization.
More thoughts here:
- First of all, TeamRankings is no more similar to Football Outsiders than Deadspin is to ESPN. Our methods are not the same. We both use data to predict games, kind of like how you and ESPN both use words to write about sports. Lumping us all together into “these websites” is pretty silly, and certainly not something I’d expect from an analytics-themed blog that I would hope would actually be interested in learning and writing about how sites like TR and NF and FO approach the challenge of data-driven game prediction in unique and innovative ways. Sentences like that one above read like some sports radio jock referring to “all those propeller head stats guys.” You guys need to set the bar higher than that.
- What are we “cracked up to be” exactly? We’d be the first people to tell you that NFL spreads are probably some of the more efficient lines in the betting marketplace, and that we’d expect much more value to be found in other sports or types of bets. And even though we’ve done pretty well in recent years with them, our NFL spread picks are just one small part of our service. You want to maximize your ROI from sports gambling? Get in as many office pools as you can. And use our office pool analytics to win them.
- Are TR, NF, and FO truly a representative sample of all the various stat-driven methods being used to bet on NFL spreads today? The next two “professional” services mentioned in the article are apparently doing pretty well, which weakens that assumption. Also, take a look at The Prediction Tracker, where it appears that a number of quantitative predictors are actually beating NFL spreads so far this year. Finally, including Greg Matthews in the data set is arbitrary, even though he’s a nice guy and a smart dude. I’d guess there are at least several hundred other Greg Matthews types out there using math and data to build systems to try to beat NFL point spreads, and you’re sampling one of them. In summary, you’re making these broad statements (“if you’ve used statistics websites for your NFL picks, you’ve had an unsuccessful Fall”) when in reality, it may be just selection bias.
There are a few other random tidbits in there I take issue with. “While ATS picks should aim for hitting a 55% cutoff to make money” is a pretty ridiculous statement; the much lower 52.4% is the break-even win rate at standard -110 juice, and if you could pick every single NFL game at a 53.5% rate against the spread, say, your long-term returns would be great compared to alternative investment opportunities like stocks. But between this email and my lengthy comment earlier, I think I’ve gone on long enough so I’ll cut it off there.
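For anyone who wants to check the break-even arithmetic, here is a minimal sketch in Python. It assumes flat bets at a single price and ignores pushes; the function names are mine, purely for illustration, and are not anything TR uses internally.

```python
def _risk_and_win(american_odds: int) -> tuple[float, float]:
    """Return (amount risked, amount won) for one bet at American odds."""
    if american_odds < 0:
        # -110 means risking 110 to win 100
        return float(-american_odds), 100.0
    # +120 means risking 100 to win 120
    return 100.0, float(american_odds)


def break_even_win_rate(american_odds: int = -110) -> float:
    """Win rate at which expected profit is exactly zero (pushes ignored)."""
    risk, win = _risk_and_win(american_odds)
    return risk / (risk + win)


def roi_per_bet(win_rate: float, american_odds: int = -110) -> float:
    """Expected profit per unit risked, given a long-run win rate."""
    risk, win = _risk_and_win(american_odds)
    return (win_rate * win - (1.0 - win_rate) * risk) / risk


if __name__ == "__main__":
    print(f"break-even at -110: {break_even_win_rate(-110):.1%}")
    # 53.5% and 55% are the two rates discussed above
    for rate in (0.535, 0.55):
        print(f"win rate {rate:.1%}: expected ROI per bet {roi_per_bet(rate):+.2%}")
```

The point of the comparison is that anything above the break-even rate has positive expected value per bet at -110, so 55% is not a threshold for making money.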
ORIGINAL RESPONSE TO MICHAEL’S POST
Guys — Tom from TeamRankings.com here. I think the concept of this piece is fine. It’s interesting news that as a group, the higher profile number crunchers don’t seem to be doing very well so far this year for NFL spread picks.
The deeper the piece goes, however, the more trouble it gets into. A nice angle would have been to explore reasons why this might be the case. Is the public cleaning up this year, meaning that more objective approaches that typically find more value in underdogs are doing worse? Etc. Instead, it devolves into just another significantly flawed comparison of handicapper records.
At the highest level, it’s misleading to treat “NFL point spread picks for all games during an eight-week span in 2013” as a stand-in for “NFL stathead picks,” which is how you are judging us here. Here’s what you conveniently overlook when you select your data set and endpoints like that, and also just rely on each site’s posted records to draw conclusions:
1) You ignore other types of picks, like totals, which are widely recognized as less efficient lines than spreads. If you’re a serious gambler looking for your greatest edge anywhere you can find it, you don’t exclusively focus on spreads. We also happen to do a lot better historically on NFL totals than spreads. Our customers really don’t care how we make them money, as long as we make them money.
2) Only a subset of our picks are considered “playable.” Our models predict every NFL game against the spread, but the majority of our betting predictions don’t see enough edge to overcome a typical -110 sports book vig, where a 52.4% win rate is required to break even. We still publish a spread pick for every game, because users want to see picks for every game. But if our spread pick has 51% confidence, say, our recommendation is clearly to pass; on TR, “pass” games are our 1-star picks. You included these in your analysis, while you really should only look at our 2- and 3-star picks. (We’re 24-26-1 for playable NFL spread picks so far, btw, so we’ll see how they end up. There’s a lot of season left. However, our playable spread and totals picks combined are hovering right at the profitability line for the season, thanks to totals being 23-17 so far.)
3) Along those same lines, you’re not comparing apples to apples when it comes to win/loss records. For example, you’re drawing NumberFire’s record from their posted results page, which only shows results for the subset of NFL picks they deem as “playable,” but you’re taking our posted record for all picks, including ones where we recommend passing. This is a major inconsistency in your methodology.
4) The other major risk you run here is that there is no guarantee at all in these comparisons that the spreads that each site uses to calculate their own ATS record are consistent. We use lines from Pinnacle Sports. Other sites don’t. We report our ATS performance against closing lines (a slight generalization, it’s really “closing or near-closing lines,” see David’s comment below for the longer explanation), which are tougher to beat than opening lines. Other sites don’t. Unless you are actually subscribed to each service and making sure you are comparing picks at exactly the same line at exactly the same time during the week, your underlying data set here is going to be at least slightly, and possibly seriously, inconsistent.
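The playable records quoted in point 2 can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes every pick was a flat bet at -110 (risk 1.1 units to win 1) and that pushes are refunded, which is a simplification of real lines. The `Record` class is a hypothetical helper for this example.

```python
from dataclasses import dataclass


@dataclass
class Record:
    wins: int
    losses: int
    pushes: int = 0

    def win_rate(self) -> float:
        """Share of decided bets that won; pushes are excluded."""
        return self.wins / (self.wins + self.losses)

    def net_units(self, risk: float = 1.10, win: float = 1.00) -> float:
        """Net result in units, assuming a flat bet at -110 on every pick."""
        return self.wins * win - self.losses * risk

    def __add__(self, other: "Record") -> "Record":
        return Record(
            self.wins + other.wins,
            self.losses + other.losses,
            self.pushes + other.pushes,
        )


if __name__ == "__main__":
    spreads = Record(24, 26, 1)  # playable NFL spread picks quoted above
    totals = Record(23, 17)      # playable NFL totals picks quoted above
    for name, rec in (("spreads", spreads), ("totals", totals),
                      ("combined", spreads + totals)):
        print(f"{name:9s} {rec.wins}-{rec.losses}-{rec.pushes}  "
              f"win rate {rec.win_rate():.1%}  net {rec.net_units():+.1f} units")
```

Under those assumptions the combined record works out to 47-43-1, a hair under the 52.4% break-even rate, which is what "hovering right at the profitability line" means in practice.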
I think it’s great to have industry watchdogs. Unsavory characters have given the handicapping market a horrible reputation, deservedly, and we are determined to bring full transparency to the space. The prediction accuracy pages on our site you linked to are the most comprehensive on the web, and we diligently report on all wins and losses over multiple years.
When you guys push out hastily researched pieces like this, though, you use your significant platform in a way that misleads readers, and leads them to conclusions like, “See? These TR guys are no better than a monkey throwing darts.”
That’s clearly false. Yes, we’re going to have losing days, weeks, months, and even seasons, just like everybody else trying to predict the future of sports betting. Anyone who bets on sports and doesn’t expect some significant losing streaks over time is being completely unrealistic. But there is no way a site like TeamRankings remains in business for over 10 years if our users aren’t finding some sort of edge from our service. We predict lots of things, and not everything well. But there are enough sweet spots (NFL totals, college basketball totals, MLB money line picks, our office pool picks for football, our March Madness bracket picks) that have driven great returns across multiple seasons and large sample sizes of picks.
Narrow the focus down to one sport, one type of pick, and a sample size of 50-100 games, though, and sure, thanks to randomness, almost anything can happen.
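To put a rough number on that randomness, here is a small sketch using an exact binomial calculation. The 55% "true" win rate is a purely hypothetical figure chosen for illustration, not a claim about any site's actual skill, and the sketch assumes independent picks with no pushes.

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def prob_losing_record(n: int, p: float) -> float:
    """Probability of winning fewer than half of n independent picks."""
    # wins < n / 2 is the same as wins <= (n - 1) // 2 for integer wins
    return prob_at_most((n - 1) // 2, n, p)


if __name__ == "__main__":
    true_rate = 0.55  # hypothetical long-run skill level, for illustration only
    for n in (50, 100, 500, 1000):
        chance = prob_losing_record(n, true_rate)
        print(f"{n:5d} picks: chance of a sub-50% record = {chance:.1%}")
```

The chance of a losing record shrinks as the number of picks grows, which is why a 50-100 game window says so little about a predictor's long-run edge.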
In the meantime, the average sports bettor typically has no idea how bad he or she is, because they only remember the wins. In the long term, over a statistically significant sample size of games — and unlike TeamRankings in our “sweet spots” — most people don’t even come close to winning 52.4% of their bets. I’ll submit this proof: Vegas sports books are still in business.
To bring this all back home, I think it’s interesting several number crunchers seem to be having a rougher start with NFL ATS picks this year. But that’s all anyone should take away from this article. There are no meaningful conclusions to be drawn. It’s just news.
(And it would be a bit ironic if we all ended up positive for the season for NFL spread picks that we actually recommend as playable.)
P.S. I commend Deadspin for starting the Regressing blog and helping to further popularize sports analytics. This is a cool idea. However, I think this idea also comes with some huge challenges. What you really need to make a blog like this truly credible, I think, is a panel of independent, skilled “math editors” who do nothing but try to shoot holes in each author’s draft, and who make sure that the methodology and data set used don’t have any significant flaws.
(This is nothing against Michael Lopez personally; all analytics practitioners have many liberties they can take when gathering data or performing analyses, and it’s always hard for just a single person to think truly holistically about an analytical task and cover every single base.)
Being in the content business myself, though, I also realize that hiring math editors is clearly not a viable business model in today’s media landscape, at least for all but the biggest players. In fact, it’s the primary reason we didn’t start our own blog like “Regressing” — I thought the editing burden, especially for pieces submitted by outside contributors, would be FAR too great if it had to include not only words and language but numbers too.
FWIW, I think this is going to be a huge problem in general for the analytics movement, since the general public can’t do a good job distinguishing a great from a passable from a laughable quantitative analyst. Nate Silver being the most likely exception, several popular numbers-focused writers in our general space (Bill Barnwell and Malcolm Gladwell come to mind) seem to be regularly lampooned by people I would consider to be highly skilled analytical practitioners. (By the way, I’m pretty much the “business” guy at TR, and I don’t include myself in that group.) Sometimes these critics are a bit harsh — after all, many are ornery and highly opinionated academic types — but regardless, they tend to be right more than they are wrong in their criticisms of the popular media’s forays into stat-centric articles and blog posts.
There are just a lot of situations out there in media today where editors who are clueless about numbers are calling the shots regarding articles about numbers, and that’s a dangerous recipe. Without skilled “math editors,” you’re at a high risk for inadvertently publishing bad numbers, which is almost always worse than publishing no numbers at all.