Saturday, February 28, 2015

Checking in on the MDS Model's Performance ATS

It's important to check one's work. Although the MDS Model's performance against the spread has been very good so far (it would have ranked #1 in NBA for the 2013-2014 season, currently ranks 1st in NBA at 56.40%, and currently ranks 1st in NCAAB at 53.18%), I needed a concrete statistical test to determine whether I'm just getting lucky or actually achieving good results.

I built an automatic z-score test for each sport with spreads (NBA, NCAAB, NFL, NCAAF). I just want to be fairly sure I'm actually picking winners more than 50% of the time, so I'm willing to accept findings at 90% confidence or better. (Note: the values reported below are one-sided confidence levels, Φ(z), rather than conventional p-values.) The following results are in:

  • NBA: Win % (all-time): 54.85%, Games Picked: 608, z-score: 2.40, confidence: 0.992
  • NCAAB: Win % (all-time): 53.23%, Games Picked: 980, z-score: 2.02, confidence: 0.979
  • NFL: Win % (all-time): 54.70%, Games Picked: 293, z-score: 1.61, confidence: 0.947
  • NCAAF: Win % (all-time): 51.67%, Games Picked: 1156, z-score: 1.14, confidence: 0.872
Ironically, the one sport for which I can't conclude (with 90% confidence or better) that I'm picking better than 50/50 is college football, the original sport I built MDS for.
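The test above is just a one-sided z-test of a proportion against 0.5. A minimal sketch, fed the NBA figures from the list (small rounding differences from the reported z-score are expected, since the win percentage itself is rounded):

```python
from math import erf, sqrt

def ats_test(win_pct, n_games):
    """One-sided z-test: is the pick rate better than a coin flip (50%)?
    Returns the z-score and Phi(z), the one-sided confidence level."""
    z = (win_pct - 0.5) / sqrt(0.25 / n_games)     # standard error of a proportion under p = 0.5
    confidence = 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF, Phi(z)
    return z, confidence

# NBA figures from the list above: 54.85% winners over 608 picks
z, conf = ats_test(0.5485, 608)
```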

Friday, February 27, 2015

"Throwing Out the Records" in Rivalry Games

A commonly held belief by sports fans is that in rivalry games, the performance of the teams up to that point in the season should be thrown out, since rivalries bring out the best from both teams. These games are generally exciting, so it seems plausible that they are, in fact, closer than expected. So should you really "throw out the records" and assume the teams are closer to evenly matched?

I looked at the past 10 years of results for the following college basketball rivalries:

  • North Carolina/Duke
  • Kentucky/Louisville
  • Cincinnati/Xavier
  • Indiana/Purdue
  • Villanova/St. Joseph's
  • Arizona/UCLA
  • Kansas/Missouri (now defunct)
  • Syracuse/Georgetown (now defunct)
I first sought to determine if the underdog actually does win more often than Vegas predicts. Converting the spread to an implied win probability for each game, the favorites were expected to win 68.37 of the 111 games analyzed (61.60%). In fact, the favorites won 71 of 111 (63.96%): slightly more than expected.
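For context, one common way to turn a spread into an implied win probability is to model the final margin as roughly normal around the spread. This is a sketch under that assumption, not the post's exact method; the 10.5-point margin standard deviation is an assumed value for college basketball, not a figure from the post:

```python
from math import erf, sqrt

def implied_win_prob(spread, margin_sd=10.5):
    """Favorite's implied win probability, modeling the final margin as
    Normal(spread, margin_sd). margin_sd = 10.5 is an assumed value."""
    return 0.5 * (1.0 + erf(spread / (margin_sd * sqrt(2.0))))

# A pick'em (spread 0) is a coin flip; bigger spreads mean bigger favorites
probs = [implied_win_prob(s) for s in (0, 3, 7, 12)]
```

Summing these per-game probabilities over all 111 games is what produces an expected favorite win total like the 68.37 above.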

But are these games closer than expected, and thus more exciting? The data does suggest this: the favorite failed to cover the spread 60.36% of the time. Testing against the null hypothesis that the favorite covers half of these games yields a z-score of 2.22, a one-sided confidence level of 0.987: fairly strong evidence that these rivalry games actually are closer than the betting market predicts.

Thursday, February 26, 2015

Summary Statistics and Splits for Pep Band Attendance by Section

Immediate disclaimer: the following conclusions are drawn from the data from the 2013-14 school year.

The ultimate goal of analyzing this pep band data is to determine the number of times the average Marching Tar Heel plays TAG (UNC's fight song) throughout their college career. I've been sampling all sporting events that the band has performed at that I've been to this year (so far, 36), and only need the figures from a softball gig to complete those findings.

But first, using the scanner data from 2013-14, I can generate some summary statistics and splits (breakdowns by section and gig type) for the volunteer rates for each type of pep band gig by calculating the average number of gigs a player goes to.

Overall, the average marching band member attended 8.47 gigs between fall and spring, a rate of 0.47 volunteer gigs over the course of the year (since 8 were required: 5 in the fall and 3 in the spring).
Note: the 4th gig in the spring is the spring football game, which is not included since the entire band performs at it.

What about only those who volunteered beyond their required gigs? Counting only players who went to more than 5 gigs in the fall and/or more than 3 in the spring, the average "volunteer" went to 11.53 gigs over the course of the year: an extra 3.53 gigs each.

The ultimate question: which section volunteers the most? I determined this from how many gigs the average player in each section attended, and the Tubas come in first, by a wide margin. The average tuba player went to 12.02 pep band gigs, more than the average "volunteer".

All PB:
  Rank  Section     Total  Avg Gigs/Player
  1     Tuba        153    12.02
  6     Tenor Sax   120    8.22
  8     Alto Sax    206    7.51
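The per-section averages boil down to a group-by over the scanner data. A sketch with hypothetical records (the player IDs and counts here are made up for illustration, not real scan data):

```python
from collections import defaultdict

# Hypothetical scanner records: (player_id, section), one row per gig attended
scans = [
    ("p1", "Tuba"), ("p1", "Tuba"), ("p2", "Tuba"),
    ("p3", "Alto Sax"), ("p3", "Alto Sax"),
]

# Count gigs per player and remember each player's section
gigs_per_player = defaultdict(int)
section_of = {}
for pid, section in scans:
    gigs_per_player[pid] += 1
    section_of[pid] = section

# Average gigs attended per player, split by section
totals = defaultdict(lambda: [0, 0])  # section -> [total gigs, player count]
for pid, n in gigs_per_player.items():
    t = totals[section_of[pid]]
    t[0] += n
    t[1] += 1

averages = {s: t[0] / t[1] for s, t in totals.items()}
```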

Wednesday, February 18, 2015

The Isaac Curse is Real...

The Isaac Curse is real, kind of.

The "Isaac Curse" is a pattern (identified by a friend named Isaac, or at least that's one of his many names) found in Carolina sporting events when one team is supposed to win 80% of the time (+/- 5%) and the opposite occurs (the underdog wins). These pregame probabilities were either from my Aggregate Model for NCAAF or a combination of the Aggregate and KenPom for NCAAB. For the 2013-14 school year, this seemed to happen every single time:

  • ECU (UNC 79.95%, lost) in football
  • Belmont (UNC 81.49%, lost), Louisville (UNC 20.90%, won), UAB (UNC 82.94%, lost), Michigan State (UNC 17.06%, won), Texas (UNC 81.55%, lost) in basketball

Of course, I found this was another case of misremembering when I actually looked at the data. The 80% ± 5% pregame probability also occurred in 2013-14 with Virginia, Miami, Virginia Tech, Georgia Tech, and South Carolina in football, and Virginia, NC State, Maryland, Virginia Tech, and Duke in basketball, and the favorites all won.

So we can adapt the Isaac Curse to simply mean that underdogs win at a rate higher than their 20% chances imply. I looked at all instances over the past 2 years in which a favorite I watched met the 80% ± 5% criterion. In NCAAF, the underdog went 2-8: a win percentage of exactly .200. No Isaac Curse there. But in NCAAB, the underdog went 8-9, winning 47.06% of the time. At the expected rate of 20%, the underdog should have won 3.4 of those 17 games; in fact, they won over twice that: 8.

Calculating a t-score for this sample gives t = 2.71, which corresponds to a one-sided p-value of 0.0077. Continuing with the Isaac theme, his favorite number is 8 (which is why every 8 in this post is highlighted), so at α = 0.008 we can conclude that the Isaac Curse exists, at least for NCAAB: underdogs win at a higher rate than their 20% chances imply. Why is this? Is it due to limitations in the models? That question I can't answer so simply.
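As a cross-check, the same question can be asked with an exact binomial tail probability (this is a different test than the t-test above, not a reproduction of it): how likely are at least 8 upsets in 17 games if the true upset rate really were 20%?

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least
    k upsets in n games if each upset truly has probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# At least 8 upsets in 17 games at a true 20% upset rate
tail = binom_tail(8, 17, 0.2)
```

This comes out to about 0.011, so the exact test also puts 8 upsets in 17 games well out in the tail of what a true 20% rate would produce.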

But it's a good thing the Curse is real: Duke is supposed to win 78% of the time tonight, per KenPom. Come on Isaac Curse!

Tuesday, February 17, 2015

An Homage to the Team Rebound (aka Dead Ball Rebound)

Here's something that often goes unnoticed in a basketball game: the team rebound. A "team" rebound, also known as a dead ball rebound, is credited when no individual player gains possession after a missed shot, as when the opposing team knocks the ball out of bounds. A rebound still counts in the totals, but it's attributed to the team rather than a player.

Most box scores, including ESPN's and CBS Sports's, don't show team rebounds. KenPom, on the other hand, does. Using data from TeamRankings, I determined the average percentage of total rebounds that are team rebounds: 9.57%. This is equivalent to 3.24 dead ball rebounds a game per team, so an average game will have 6.48 team rebounds.

For some reason, I find team rebounds fascinating. Even though North Carolina is a very good rebounding team (2nd in the country), I can't assume UNC games will feature more team rebounds than average, since a team rebound can result from a tipped ball going out of bounds, or from a shot that caroms off the rim at such a bad angle that it goes directly out of bounds. So I pulled the data. It turns out UNC games actually do have more team rebounds than average, at least this year: 7.92 per game, which is 10.45% of the total rebounds in each game. Now I know something I didn't know I wanted to know.

Has UNC Lost More at Home in Recent Years?

Recently ESPN ran a graphic during an ACC college basketball game that showed Roy Williams's home win percentage (at Carolina) as 0.792. A friend and fellow member of the marching band texted me "I feel like we've seen an abnormally high number of the other 20 percent", i.e. having gone to just about every home game, we've seen more losses at home in our time at Carolina than Roy's home win percentage implies. Thinking back to Austin Rivers, blowing it against Belmont last year, and losing close to Notre Dame and Iowa this year, I came to the same conclusion.

Roy has been the head coach at UNC for the past 12 seasons, so (as a senior), I've witnessed 4 of these. In these 4 seasons, our home record has been 57-9, which is actually 0.864. I've seen more Carolina home victories than expected, based on Roy's home win percentage. In fact, the Austin Rivers buzzer beater was the only home loss we had my freshman year.

However, my friend is a sophomore, so we'll look at only the past 2 years. While this year we were 9-3 at home (0.750), over the past 2 seasons we're 24-6, a win % of 0.800. Over 30 games, we would expect to see 6.24 losses, per Roy's home win percentage. Thus, losing 6 of those 30 games is actually spot on with this expectation.

This is an example of how we, as humans, remember unexpected events more vividly than the "norm", and so we overestimate how often these events occur. That, and many players, coaches, and fans alike share the notion that they remember the tough losses more intensely than the wins.