Probabilis: January 2015

Tuesday, January 27, 2015

The Nostalgia of Backyard Football 2002

I grew up playing Backyard Sports. Backyard Baseball 2001 was my first video game, and I've played every sport Humongous Entertainment made (except for hockey). As much as I love both Baseball 2001 and 2003, Football 2002 will forever be my all-time favorite. And although we're all 21+ and seniors in college, my friends and I still play it regularly and often debate who should be the first overall pick: Pablo Sanchez or Pete Wheeler. Pablo is the best all-around player in all of the games (and Deadspin determined he was on steroids??), but you draft Pete first, and this is why:

Sunday, January 18, 2015

Fitting the Exponential Distribution to TV Timeouts in College Basketball

My "dream" is to be in attendance at a college basketball game in which an entire TV timeout is skipped: i.e. the under-16, under-12, under-8, or under-4. I include called timeouts in this; in other words, I want to go through the entirety of one of these 4-minute intervals without a timeout occurring.

My assumption is that timeouts occur as a Poisson process, so the wait for a timeout after each 4-minute mark follows the exponential distribution: the closer to each 4-minute mark, the more likely a timeout is to occur. I took a clustered simple random sample of 60 games (resulting in 480 data points, since there are 8 TV timeouts a game) by clustering on each day of the season and then randomly choosing one game every two days.
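As a rough sketch of what fitting this model involves: if the waits really are exponential, the maximum-likelihood rate is just one over the average wait, and the chance of a skipped timeout is the chance the wait exceeds 4 minutes. The function names here are my own for illustration, and this simple version ignores the fact that intervals with no timeout at all are cut off at 4 minutes.

```python
import math

def fit_exponential_rate(waits_minutes):
    """Maximum-likelihood rate (per minute) for exponential waits: 1 / mean wait."""
    if not waits_minutes:
        raise ValueError("need at least one observed wait")
    return len(waits_minutes) / sum(waits_minutes)

def prob_no_timeout(rate_per_minute, window_minutes=4.0):
    """P(wait > window) under the exponential model, i.e. a skipped TV timeout."""
    return math.exp(-rate_per_minute * window_minutes)
```

Feeding the observed waits (in minutes) into `fit_exponential_rate` and the result into `prob_no_timeout` would give the model's estimate of how often a full 4-minute interval passes without a timeout.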

I graphed the frequency of how long it took for a timeout to occur after each 4-minute mark, and the distribution does appear to be exponential:

Over the 480 instances I sampled, there were 4 times when no timeout occurred during one of these 4-minute stretches, which is equal to 0.83% of intervals. However, that works out to 6.67% of games (4 in 60, since 8 TV timeouts occur each game), or 1 in every 15 games, which is fairly often. In my time at Carolina, I've attended 44 men's basketball games (home, away, and neutral site), which means I should have seen this occur roughly 2.9 times (and it never has). So how likely is it that I haven't witnessed a skipped TV timeout?
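One way to get a rough answer, assuming every game (or every interval) is independent and has the same chance of a skip as in the sample, is to raise the "no skip" probability to the number of games or intervals attended. This is only a sketch under those assumptions, and the variable names are mine.

```python
skipped = 4              # intervals with no timeout in the sample
intervals_sampled = 480
games_sampled = 60
games_attended = 44

# Game-level view: about 1 in 15 games contains a skipped timeout.
per_game_rate = skipped / games_sampled
expected_seen = games_attended * per_game_rate  # roughly 2.9

# Chance of zero skips across 44 independent games.
p_none_by_game = (1 - per_game_rate) ** games_attended

# Interval-level view: 8 intervals per game, each skipped 4/480 of the time.
per_interval_rate = skipped / intervals_sampled
p_none_by_interval = (1 - per_interval_rate) ** (8 * games_attended)

print(f"expected skips seen: {expected_seen:.2f}")
print(f"P(no skip), by game: {p_none_by_game:.3f}")
print(f"P(no skip), by interval: {p_none_by_interval:.3f}")
```

Both views land at about a 5% chance of never having seen one, so it's unlucky but not outrageous.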

Monday, January 12, 2015

Was Gene Chizik a Good Hire at DC?

The S&P+ Ratings are "a college football ratings system derived from the play-by-play data of all 800+ of a season's FBS college football games (and 140,000+ plays)", and one of its components is "Success Rate":

Success Rate: A common Football Outsiders tool used to measure efficiency by determining whether every play of a given game was successful or not. The terms of success in college football: 50 percent of necessary yardage on first down, 70 percent on second down, and 100 percent on third and fourth down.
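The definition above can be sketched as a small function that classifies a single play. The function and argument names are hypothetical, and integer math is used so the 50% and 70% thresholds aren't thrown off by floating-point rounding.

```python
# Percent of the needed yardage required for a "successful" play on each down.
REQUIRED_PERCENT = {1: 50, 2: 70, 3: 100, 4: 100}

def is_successful(down, yards_to_go, yards_gained):
    """True if the offense gained the required share of the needed yardage."""
    return 100 * yards_gained >= REQUIRED_PERCENT[down] * yards_to_go

print(is_successful(1, 10, 5))   # 1st and 10, gain of 5 -> True
print(is_successful(2, 10, 6))   # 2nd and 10, gain of 6 -> False
print(is_successful(3, 4, 4))    # 3rd and 4, gain of 4 -> True
```

A success rate is then the number of successful plays divided by the total number of plays.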

I logged every single defensive play from UNC's season, and our overall success rate was 50.38%, well above the FBS average of 41.33% (for a defense, a lower success rate is better). As I had found previously, our defense was very bad on 3rd and 4th down:

Totals
Down	Successes	Plays	Success Rate
1	206	415	49.64%
2	152	313	48.56%
3	99	190	52.11%
4	12	13	92.31%
All	469	931	50.38%
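The rates in the table can be re-derived from the raw counts with a few lines of Python. The dictionary layout is just my own way of holding the numbers.

```python
# down -> (successful plays allowed, total plays), copied from the table
counts = {1: (206, 415), 2: (152, 313), 3: (99, 190), 4: (12, 13)}

for down, (successes, plays) in counts.items():
    print(f"Down {down}: {100 * successes / plays:.2f}%")

total_successes = sum(s for s, _ in counts.values())
total_plays = sum(p for _, p in counts.values())
overall_rate = 100 * total_successes / total_plays
print(f"All: {total_successes}/{total_plays} = {overall_rate:.2f}%")
```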

My ultimate goal was to compare our defense this year with those under Gene Chizik, who was recently hired as UNC's new defensive coordinator, using the S&P+ Defensive Rating as well as other metrics. This year's 50.38% success rate ranked 122nd out of 125, and our overall S&P+ Defensive Rating ranked 113th.

Saturday, January 3, 2015

The Stat Sheet from Roy Williams for UNC-Wake Forest, 2/22/14

For last year's home game against Wake Forest, Roy Williams sent over the official box score to the band for me at halftime, so I dug into the stat sheet to take a deeper look at a few things.

KenPom Win Probability Model

I wanted to see how things like field goal droughts (which obviously correlate with less scoring) affect the in-game win probability in KenPom's Probability Graph. Unfortunately, the Wake Forest game isn't a great case study for this, since UNC's initial win probability was 90.1%. Wake Forest's longest first half FG drought lasted from 17:53-13:46, and their win probability went from 11.9% (coincidentally their highest of the game) to about 6%.

During this time, their win probability steadily dropped (as UNC's rose), until UNC went cold too. While 12% to 6% is only a 6 percentage point change, it cuts Wake Forest's chances in half.

North Carolina's longest FG drought occurred from 7:10-4:07, during which their win probability changed by maybe 1 percentage point. There are two reasons for this: first, they were up 15 before the drought started, and more importantly, they continued to score on free throws. However, as can be seen in the chart, this drought did allow Wake Forest to gain leverage (blue indicates the lowest leverage, purple indicates an increase).

Chance and Free Throws, 3-Pointers

The first half of this game was notable because UNC shot remarkably well from both the free throw line and beyond the arc. They went 4-5 (80.0%) from three, and 16-17 (94.1%) from the line; prior to that game, UNC was 31.02% from three and 62.30% from the line on the season.

First, the free throws: North Carolina was a notoriously bad free throw shooting team last year. They finished at 62.6% for the season, which placed them 343rd out of the 351 teams in Division I. But given that they were on fire in the first half of this game, could we theoretically conclude that the 62.30% entering the game was too low? This is easy to check by calculating a t-value and the corresponding p-value.

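The sample variance of a proportion is: s² = pq / (n - 1), where p = sample proportion, q = 1 - p, and n = sample size.
So, with p₀ = 0.6230 (the pregame season percentage being tested):

t = (p - p₀) / √(pq / (n - 1)) = (0.941 - 0.623) / √(0.941 × 0.059 / 16)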

Thus t = 5.398, with degrees of freedom = 16. The p-value is less than 0.0001, so based on this sample, we can conclude with an extremely high level of certainty that the team was cumulatively better than 62.30%.
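The calculation can be sketched in Python with only the standard library. The function names are my own, and the one-tailed p-value comes from numerically integrating the t density with Simpson's rule rather than from a stats package, so treat it as an approximation. With the unrounded 16/17 the statistic comes out near 5.41 rather than 5.398; the small gap appears to come from rounding the proportion to 0.941 first.

```python
import math

def t_statistic(made, attempts, baseline):
    """t = (p - baseline) / sqrt(p * q / (n - 1)) for a sample proportion p."""
    p = made / attempts
    q = 1 - p
    return (p - baseline) / math.sqrt(p * q / (attempts - 1))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tailed_p(t, df, steps=10_000):
    """Upper-tail P(T > t) for t > 0: 0.5 minus Simpson's-rule area on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_ft = t_statistic(16, 17, 0.6230)   # first-half free throws vs. pregame 62.30%
p_ft = one_tailed_p(t_ft, df=16)
print(f"t = {t_ft:.3f}, one-tailed p = {p_ft:.6f}")
```

The same two functions work for any made/attempted line and baseline percentage, with degrees of freedom equal to attempts minus one.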

However, let's look at things from another angle with regard to the threes: 5 is a very small sample size, so does making 4 of those mean we could still theoretically conclude we were a better team than 31.02%? We finished the year at 33.6%, so should we check that t-statistic too, since 80% is a very high rate?

For the pregame 31.02%, t = 2.449, and p = 0.035. At the commonly used 95% confidence level, we can conclude that Carolina actually is a better team than 31.02%.

For the season 33.6%, t = 2.32, and p = 0.041. Once again at the 95% confidence level, we can conclude Carolina was actually even better than the season average from three.

However, this is where we should apply the "smell test". Should we really conclude that, as a team, UNC was definitively better than their season-long average, simply because Leslie McDonald got hot and went 3-3 in the first half of a game against a very weak Wake Forest team (3rd to last in the ACC), and Marcus Paige added another while going 1-2? Wake Forest's 3-point defense was actually 38th best in the nation, but it still probably isn't accurate to conclude that Carolina was really a better team than 33.6%. There's large variance in the short run in anything, and chance played a large role in this 4-5 stat line.
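To put a number on the role of chance, here is a different check from the t-test above: an exact binomial calculation of how often a team would make at least 4 of 5 threes purely by luck, assuming each attempt is independent and made at the team's season rate. This is my own swapped-in sketch, not part of the original analysis, and the function name is hypothetical.

```python
import math

def prob_at_least(k, n, p):
    """P(at least k makes in n independent attempts, each made with probability p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

for label, rate in [("pregame 31.02%", 0.3102), ("season 33.6%", 0.336)]:
    print(f"{label}: P(4+ of 5) = {prob_at_least(4, 5, rate):.3f}")
```

Under those assumptions, a 4-5 or better half happens roughly 3.5% to 4.7% of the time, which over a 30-plus game season with two halves each is not a shocking thing to see at least once.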

Thursday, January 1, 2015

UNC's Defensive Splits on 3rd Down

On December 11, UNC fired their defensive coordinator, and it was completely justified:

First, some quick hits on how bad our defense was overall (remember, there are 125 teams in Division I FBS):

UNC ranked #117 in Total Defense, measured by YPG given up
UNC ranked #119 in 1st Downs given up
UNC ranked #122 (4th from last) in 3rd Down Defense, measured by percentage of 3rd downs converted by the opposing offense
UNC ranked dead last in 4th Down Defense, measured by percentage of 4th downs converted by the opposing offense

To break it down further, when you remove the Liberty game and only include FBS teams, UNC gave up 1st downs on 50% of their opponents' 3rd downs. Only three other teams in all of FBS gave up 50%+.

Being at the ECU and Clemson games in person (two games in which we got torched defensively, although you could say that about most of our games), I noticed that we often brought in a different personnel unit on 3rd down. We gave up conversions on 70.59% and 55.56% of 3rd downs in these games, respectively. To me, it seemed like we would do a decent job on 1st and 2nd down, and then give up the big play on 3rd down with the new unit. I get that different units have different specialties, and you've gotta get players rest, but if the same group continuously gets burned on 3rd down, why keep putting them out there?