Probabilis: 2017

Saturday, December 9, 2017

Points Per Minute Deficit (NCAAB)

As any diehard fan can tell you, when your team is down, you won't count them out until a comeback just isn't feasible anymore, models and simulations be damned. But there has to be some degree of reasonableness, right?

Growing up a Carolina basketball fan, I always had certain heuristics in the back of my mind (instilled by my dad) on whether we could come back or not. The thinking went that if we could keep the score within the number of minutes left in the game, then we had a chance to chip away until we tied or took the lead at a rate of 1 point per minute. For example, if UNC trailed by 10 points with 9:50 left to play, then we needed to get to within 9 by 9:00 left to play. And then trail by 8 (or better) by 8:00 left, 7 by 7:00 left, etc. Note that this doesn't mean that the trailing team went on to win necessarily; all it means is that the trailing team either tied or took the lead at some point later in the game.

I aimed to determine if this heuristic had any basis in real-life results by looking at play-by-play NCAAB data for the entire 2016-17 season. I scraped KenPom's play-by-play win probability graphs to get scoring data on every possession, which ultimately gave me 393,719 individual possessions in 2nd half/overtime to analyze. Of those, 13,917 (3.5%) matched the criteria of being down x points with x minutes remaining.
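To make the filtering criterion concrete, here is a minimal sketch of how a possession could be flagged as "down x points with x minutes remaining". The rounding rule (minutes remaining rounded up to the next whole minute) and the function name `matches_heuristic` are my assumptions for illustration; the exact rule used on the scraped data isn't spelled out above.

```python
import math

def matches_heuristic(deficit, seconds_remaining):
    """True when the deficit equals the minutes left, rounded up (assumed rule)."""
    if deficit <= 0 or seconds_remaining <= 0:
        return False  # tied/leading teams and expired clocks never qualify
    return deficit == math.ceil(seconds_remaining / 60)

# Hypothetical possessions: (trailing team's deficit, seconds left in the game)
possessions = [
    (10, 590),  # down 10 with 9:50 left
    (9, 540),   # down 9 with 9:00 left
    (12, 300),  # down 12 with 5:00 left
    (0, 120),   # tied with 2:00 left
]

flagged = [p for p in possessions if matches_heuristic(*p)]
share = len(flagged) / len(possessions)
print(f"{len(flagged)} of {len(possessions)} possessions match ({share:.1%})")
```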

There actually is a fairly linear relationship for this idea, with a larger deficit implying a lesser chance of a comeback (as you would expect):

Using linear regression, the relationship as shown above can be described as:

%_Comeback = -0.021*Min_Remaining + 0.409

This model has a fairly good fit too, with an r-squared value of 0.828.
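As a quick sketch, the fitted line can be evaluated directly. Because these are possessions where the deficit equals the minutes remaining, `min_remaining` doubles as the deficit. The function name is mine, and the coefficients are simply the ones quoted above.

```python
def comeback_probability(min_remaining):
    """Fitted chance of tying or taking the lead when down x with x minutes left."""
    return -0.021 * min_remaining + 0.409

for minutes in (2, 6, 10, 15):
    print(f"Down {minutes} with {minutes}:00 left: {comeback_probability(minutes):.1%}")
```

Keep in mind this is only the straight-line fit; the individual data points can sit above or below it.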

For my purposes, I usually applied this thought process up until UNC trailed by 6 or fewer (a 2 possession game). At that point all you need is to make 2 threes and it's tied. However, based on the above graph it appears I should ONLY hold out hope when the deficit is 2 possessions or less, since that is the only stretch in which the trailing team has better than a 1 in 4 chance of coming back. Good luck telling that to any fan of a team down by 7 or more though.

Friday, November 17, 2017

Free Throw % Splits Based on Shot Type

I've always been a solid free throw shooter: I shot 75% over 4,250 attempts during the insane three-year stretch where I tracked every single shot I took on the basketball court (and 85% over 630 attempts in 2016). But when I played organized basketball when I was younger, I would always brick free throws when I shot technical foul shots and was alone at the line (although never as badly as my intramural teammate who had a 0-for-34 stretch shooting free throws). Was this just an abnormal anecdotal observation, or does NBA play-by-play data back it up?

It's been previously established that free throw shooters improve from their first shot to their second/third. Nylon Calculus has a great database on this phenomenon from 1999-2016, and TrueHoop suggested in 2011 that the first attempt is like getting to practice free throws in the middle of the game. But what about other splits like when you're alone at the line, or when you can take only one shot (such as an and-1)?

I pulled down NBA play-by-play data for the full 2016-17 season from BigDataBall (a great source at a relatively cheap price for all play-by-play data back to 2006) and looked at every shot attempt and found the same thing previous studies had:

Free throw shooters get better on their second/third attempts, to a very high degree of statistical significance. Comparing the shooting percentage between the first vs second/third shot gives a z-score of 13.56; the normal CDF at that z-score rounds to 1, so the p-value is effectively 0. Even comparing first vs second attempts gives a z-score of 12.54, which likewise has a p-value of effectively 0.
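For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a pooled two-proportion z-test using only the standard library. The make/attempt counts are hypothetical placeholders, not the actual 2016-17 totals, and the pooled-variance form is an assumption about how such a z-score would typically be computed.

```python
import math

def two_proportion_z(makes_a, att_a, makes_b, att_b):
    """z-score for the difference in make rate (group B minus group A), pooled variance."""
    p_a = makes_a / att_a
    p_b = makes_b / att_b
    pooled = (makes_a + makes_b) / (att_a + att_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / att_a + 1 / att_b))
    return (p_b - p_a) / std_err

def one_sided_p_value(z):
    """Upper-tail probability of a standard normal, i.e. 1 - CDF(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical counts: first attempts (A) vs second/third attempts (B)
z = two_proportion_z(7400, 10000, 7800, 10000)
p = one_sided_p_value(z)
print(f"z = {z:.2f}, one-sided p = {p:.2e}")
```

The same two functions work for any of the splits in this post; only the counts change.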

But what about my earlier considerations regarding being alone at the line (with no rebounders around you)? I looked at "normal" free throws (on an and-1, two-point shot, or three-point shot) compared with technical/flagrant/clear path foul shots where the shooter is "alone":

3% of all free throw attempts last season occurred where the shooter was "alone", and players shot significantly better in this case, going against my hypothesis. But these "alone" attempts include technical foul shots, which are taken by the best free throw shooter on the floor (the shooting team gets to choose who shoots them). If I remove technical foul shots I get a different picture:

A picture that illustrates no significant difference in make percentage (z-score of 0.35, where the normal CDF is 0.64, so the one-sided p-value is about 0.36, which is inconclusive).

My final look focused on only the first free throw taken in a set: does the shooter perform better or worse if they know they're getting additional attempts? In other words, is an and-1 different from the first shot in a set of 2 or 3 attempts?

As before, I filtered out technical shots, and as before, there's no significant difference (z-score of 0.29, where the normal CDF is 0.61, so the one-sided p-value is about 0.39).

All in all it seems my experience was abnormal: NBA free throw shooters do improve after their first attempt, but they make their shots at about the same rate regardless of whether there are other players around them. This makes sense, since they are professional athletes and I am not.

Sunday, November 12, 2017

Whether Punt Returners Should Return Punts Inside Their Own 10 Yard Line (NCAAF)

A couple of years ago I wrote this post assessing whether a kick returner should return the kickoff out of the end zone, stemming from my frustration when a college kick returner chooses to forgo the free 25 yards and tries to be a hero and run the kick all the way back. I concluded that the risk/reward balance was actually fairly even when accounting for turnovers. So I now have a new source of frustration to investigate from a game theory perspective: whether punt returners should return punts from inside their own 10 yard line.

I've always thought that a standard unwritten rule for punt returners is to plant yourself at the 10 yard line and let anything kicked over your head go into the end zone. Instead, I've observed many a punt returner attempt to return it from inside their own 10 (UNC's own Ryan Switzer would notoriously make me irate doing this). Are they making a negative risk/reward decision or am I wrong in my steadfast belief that it's a bad decision?

I gathered three full seasons of play-by-play data, using 2011-2013 (since that's what was readily available from this great Reddit thread for NCAAF data). There were almost 15,000 punt returns over this stretch, of which 1,387 were inside the 10 (9.3%). Whittling it further, 669 were actually fair caught (48%), leaving me with 718 punt returns that were caught inside the 10 yard line and returned.

The average punt meeting these criteria was caught at the 7.36 yard line and was returned for 8.98 yards, thus bringing the ball out to the 16.34 yard line. 76% of the time the returner gained yardage, while 12% of the time the returner lost yardage (the other 12% resulted in no gain).

Of the 718 returns, 10 times the punt was run back for a touchdown (1.4%). On the flip side, 30 returners fumbled and lost the fumble (4.2%), and an additional 2 returns resulted in a safety (0.3%). On its face, it appears my intuition is correct: it's way more risky to try to run it back. But which option is optimal? Return it, fair catch it, or let it bounce towards the end zone?
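The arithmetic behind those rates is simple enough to check in a few lines. This sketch only restates the counts quoted above; the `bad_to_good` ratio is my own shorthand for comparing disasters to touchdowns and is not a full expected-value calculation.

```python
returns = 718
outcomes = {"touchdown": 10, "lost fumble": 30, "safety": 2}

# Share of all returned punts (caught inside the 10) ending in each outcome
rates = {name: count / returns for name, count in outcomes.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")

# Average field position after the return
avg_catch_spot = 7.36
avg_return = 8.98
avg_end_spot = avg_catch_spot + avg_return
print(f"average ending spot: {avg_end_spot:.2f} yard line")

# Disasters (lost fumbles plus safeties) per touchdown
bad_to_good = (outcomes["lost fumble"] + outcomes["safety"]) / outcomes["touchdown"]
print(f"disasters per touchdown: {bad_to_good:.1f}")
```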

Simulated World Series Preview 2017: LAD vs HOU

Over the course of my modeling career, I'm so far 0-2 when picking the World Series winners (I had KCR in 2014 and NYM in 2015). That's baseball for you, where the "favorite" only has about a 55% chance of winning a given 7 game series. This year's matchup features two 100-win teams, the Dodgers and the Astros.

I've tweaked the simulator I used in 2015 and simulated the World Series 10,000 times, and it predicts the Dodgers win it all in 6 games. So if you want to read too much into an extremely small sample size of 2014 and 2015, that means the Astros are winning it in 5 or 7.

My ratings are very in line with Baseball Prospectus, and have the two teams as fairly evenly matched:

LAD: 0.604
HOU: 0.592

So this series should go 6 or 7 games:

The pick: Los Angeles Dodgers in 6
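For a rough feel of what a 10,000-run series simulation looks like, here is a stripped-down sketch. It is not my actual simulator: it assumes the two ratings can be treated as winning percentages, converts them to a single-game win probability with the log5 formula, and ignores home field, starting pitchers, and everything else that matters. The function names and the random seed are arbitrary.

```python
import random
from collections import Counter

def log5(a, b):
    """Chance that a team with win rate a beats a team with win rate b (log5 formula)."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))

def simulate_series(p_game, rng):
    """Play one best-of-7 series; return (winner, number of games played)."""
    wins_lad = wins_hou = 0
    while wins_lad < 4 and wins_hou < 4:
        if rng.random() < p_game:
            wins_lad += 1
        else:
            wins_hou += 1
    return ("LAD" if wins_lad == 4 else "HOU", wins_lad + wins_hou)

rng = random.Random(2017)          # fixed seed so the run is repeatable
p_lad = log5(0.604, 0.592)         # assumed per-game win probability for LAD
n_sims = 10_000
results = Counter(simulate_series(p_lad, rng) for _ in range(n_sims))

lad_share = sum(n for (winner, _), n in results.items() if winner == "LAD") / n_sims
print(f"LAD single-game win probability: {p_lad:.3f}")
print(f"LAD series win share: {lad_share:.1%}")
for (winner, games), n in sorted(results.items()):
    print(f"{winner} in {games}: {n / n_sims:.1%}")
```

With teams this evenly matched, even a toy version like this shows why the "favorite" is barely better than a coin flip over 7 games.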

Saturday, July 22, 2017

Preseason NCAAF Rankings for 2017

As I did last year, the year before, and the year before that, I've created a new set of preseason NCAAF rankings that take into account player turnover and recruiting classes. The following is literally copied and pasted from last year's write-up so you don't have to click that first link:

As before, I used my final Composite ratings from the MDS Model (from last season) as the base (which takes into account both a forward-looking predictive component and a past-performance only retrodictive component), and then factored in ESPN's Preseason FPI and the S&P+ projections, both of which take into account player changes on each team. Once the season starts, this "preseason" rating will be faded out as the season progresses, carrying less and less weight with each ensuing week.
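To illustrate the "faded out" idea, here is a minimal sketch of a preseason rating losing weight week by week. The linear schedule, the 8-week window, and the name `blended_rating` are all hypothetical choices for illustration; the actual weights in the MDS Model aren't described here.

```python
def blended_rating(preseason, in_season, week, fade_weeks=8):
    """Blend preseason and in-season ratings, with the preseason weight shrinking linearly."""
    preseason_weight = max(0.0, 1 - week / fade_weeks)  # 1.0 in week 0, 0.0 from fade_weeks on
    return preseason_weight * preseason + (1 - preseason_weight) * in_season

# Hypothetical team: preseason rating 0.924, in-season rating holding at 0.800
for week in (0, 2, 4, 6, 8):
    print(f"week {week}: {blended_rating(0.924, 0.800, week):.3f}")
```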

In the below list, the "Trend" indicates whether the respective team's new ranking rose or fell relative to the average of last year's end-of-season Composite ratings. We have two new teams this year: Coastal Carolina (moving up from FCS) and UAB (welcome back!).

Rank	Team	PRESEASON	Trend
1	Alabama	0.924	UP
2	Ohio State	0.863	UP
3	Oklahoma	0.854	DOWN
4	Florida State	0.841	UP
5	Clemson	0.840	DOWN
6	LSU	0.800	UP
7	USC	0.770	UP
8	Michigan	0.767	DOWN
9	Stanford	0.761	UP
10	Washington	0.744	UP
11	Notre Dame	0.731	DOWN
12	Florida	0.728	UP
13	Tennessee	0.721	DOWN
14	Wisconsin	0.715	UP
15	Auburn	0.713	UP
16	Penn State	0.711	UP
17	Ole Miss	0.703	DOWN
18	TCU	0.702	UP
19	Georgia	0.701	UP
20	Louisville	0.695	UP
21	Oklahoma State	0.694	DOWN
22	Oregon	0.684	UP
23	UCLA	0.676	DOWN
24	Texas A&M	0.668	UP
25	Baylor	0.664	DOWN
26	North Carolina	0.663	DOWN
27	Miami (FL)	0.658	UP
28	Mississippi State	0.647	DOWN
29	Arkansas	0.643	DOWN
30	Michigan State	0.638	DOWN
31	Texas	0.634	UP
32	North Carolina State	0.630	UP
33	Iowa	0.629	DOWN
34	Washington State	0.620	UP
35	Pittsburgh	0.617	DOWN
36	Northwestern	0.604	UP
37	Houston	0.603	DOWN
38	Virginia Tech	0.603	UP
39	Brigham Young	0.601	DOWN
40	Boise State	0.600	DOWN
41	Utah	0.600	DOWN
42	Nebraska	0.588	DOWN
43	South Florida	0.586	DOWN
44	West Virginia	0.582	DOWN
45	Memphis	0.571	DOWN
46	Georgia Tech	0.566	UP
47	Kansas State	0.562	UP
48	San Diego State	0.562	DOWN
49	Western Kentucky	0.561	DOWN
50	Appalachian State	0.556	DOWN
51	Toledo	0.554	DOWN
52	California	0.553	DOWN
53	Texas Tech	0.553	DOWN
54	Arizona State	0.553	DOWN
55	Navy	0.542	DOWN
56	South Carolina	0.541	UP
57	Temple	0.528	DOWN
58	Arizona	0.527	DOWN
59	Duke	0.527	UP
60	Missouri	0.526	UP
61	Minnesota	0.525	DOWN
62	Indiana	0.525	UP
63	Syracuse	0.516	UP
64	Kentucky	0.508	UP
65	Western Michigan	0.506	DOWN
66	Vanderbilt	0.504	UP
67	Colorado	0.503	UP
68	Cincinnati	0.484	DOWN
69	Iowa State	0.476	UP
70	Wake Forest	0.471	UP
71	Virginia	0.462	UP
72	Colorado State	0.461	UP
73	Tulsa	0.460	UP
74	Southern Miss	0.457	DOWN
75	Bowling Green	0.456	DOWN
76	Boston College	0.454	UP
77	Maryland	0.454	UP
78	Oregon State	0.451	UP
79	Utah State	0.450	DOWN
80	Marshall	0.448	DOWN
81	Arkansas State	0.447	DOWN
82	Louisiana Tech	0.447	DOWN
83	Central Michigan	0.435	DOWN
84	Northern Illinois	0.433	DOWN
85	Air Force	0.426	DOWN
86	Illinois	0.425	DOWN
87	Georgia Southern	0.421	DOWN
88	Middle Tennessee	0.417	DOWN
89	Troy	0.405	UP
90	East Carolina	0.402	DOWN
91	Ohio	0.392	DOWN
92	Rutgers	0.391	DOWN
93	Purdue	0.390	DOWN
94	Southern Methodist	0.384	UP
95	UCF	0.365	UP
96	New Mexico	0.365	DOWN
97	San Jose State	0.358	DOWN
98	Nevada	0.355	DOWN
99	Wyoming	0.353	UP
100	Connecticut	0.344	DOWN
101	Florida Atlantic	0.343	DOWN
102	Ball State	0.335	UP
103	Georgia State	0.334	DOWN
104	Old Dominion	0.332	UP
105	Miami (OH)	0.332	UP
106	Akron	0.328	DOWN
107	UTSA	0.315	UP
108	Army	0.311	UP
109	Louisiana-Lafayette	0.311	UP
110	Tulane	0.310	UP
111	Fresno State	0.303	UP
112	Florida International	0.300	DOWN
113	UNLV	0.297	UP
114	South Alabama	0.297	UP
115	Massachusetts	0.293	UP
116	Eastern Michigan	0.289	UP
117	Idaho	0.286	DOWN
118	Kansas	0.285	UP
119	Buffalo	0.282	DOWN
120	Kent State	0.275	DOWN
121	Hawaii	0.270	UP
122	Rice	0.270	DOWN
123	New Mexico State	0.265	UP
124	North Texas	0.244	UP
125	UTEP	0.234	DOWN
126	Louisiana-Monroe	0.230	UP
127	Texas State	0.205	DOWN
128	Coastal Carolina	0.196	UP
129	Charlotte	0.193	UP
130	UAB	0.130	UP
