Thursday, October 20, 2016

The Sophomore Slump in the NFL

Whenever an NFL rookie has a remarkable first year, their sophomore campaign often seems not to measure up, and the player is labeled as going through the dreaded "sophomore slump". Sometimes this decline appears to foreshadow future problems (see: Robert Griffin III), and other times it is nothing more than a blip in a long, successful career (see: Matt Ryan).

There have been two notable cases of it this year: Jameis Winston and Marcus Mariota. Winston was selected to the Pro Bowl last year, and Mariota had a solid rookie year as well. However, both players have regressed in Year 2. Why do some players take a step back in their second year? Shouldn't they improve as they gain more pro experience?

I pulled the numbers on the notable "sophomore slumps" of the past 10 seasons, sticking to QBs only. Limiting myself to QBs allows me to use "QB Rating" to quantify their performance in each year (and more importantly, compare each player to the average for that year).


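| Player | Team | Year | QB Rating Year 1 | QB Rating Year 2 | Difference |
|---|---|---|---|---|---|
| Jameis Winston | TB | 2016 | 84.2 | 75.9 | -8.3 |
| Marcus Mariota | TEN | 2016 | 91.5 | 88.3 | -3.2 |
| Robert Griffin III | WAS | 2013 | 102.4 | 82.2 | -20.2 |
| Sam Bradford | STL | 2011 | 76.5 | 70.5 | -6.0 |
| Matt Ryan | ATL | 2009 | 87.7 | 80.9 | -6.8 |
| Average | | | | | -8.9 |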

On average, these QBs' ratings fell almost 9 points from Year 1 to Year 2. What if we compare each player's performance to the league-average QB rating in each year?

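| Player | Team | Year | QB Rating Year 1 | QB Rating Year 2 | Difference | vs Avg Year 1 | vs Avg Year 2 |
|---|---|---|---|---|---|---|---|
| Jameis Winston | TB | 2016 | 84.2 | 75.9 | -8.3 | -6.8 | -10.9 |
| Marcus Mariota | TEN | 2016 | 91.5 | 88.3 | -3.2 | 0.5 | 1.5 |
| Robert Griffin III | WAS | 2013 | 102.4 | 82.2 | -20.2 | 16.5 | -5.2 |
| Sam Bradford | STL | 2011 | 76.5 | 70.5 | -6.0 | -9.7 | -13.6 |
| Matt Ryan | ATL | 2009 | 87.7 | 80.9 | -6.8 | 3.8 | -2.5 |
| Average | | | | | -8.9 | 0.9 | -6.1 |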

These rookies were slightly above average in their first year, but they do indeed check in well below average in their second year. This would seem to imply that there is indeed regression and the "sophomore slump" is real! BUT...

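| Player | Team | Year | QB Rating Year 1 | QB Rating Year 2 | Difference | vs Rookie Avg Year 1 | vs 2nd Year Avg Year 2 |
|---|---|---|---|---|---|---|---|
| Jameis Winston | TB | 2016 | 84.2 | 75.9 | -8.3 | 7.5 | -4.5 |
| Marcus Mariota | TEN | 2016 | 91.5 | 88.3 | -3.2 | 14.8 | 7.9 |
| Robert Griffin III | WAS | 2013 | 102.4 | 82.2 | -20.2 | 25.7 | 1.8 |
| Sam Bradford | STL | 2011 | 76.5 | 70.5 | -6.0 | -0.2 | -9.9 |
| Matt Ryan | ATL | 2009 | 87.7 | 80.9 | -6.8 | 11.0 | 0.5 |
| Average | | | | | -8.9 | 11.7 | -0.8 |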

The key word here is regression: regression to the mean. In Year 1, this set of QBs was almost 12 points above the rookie average, but in Year 2, they simply regressed to the mean (almost literally, to less than 1 point below the 2nd-year average). The "sophomore slump" is real in the sense of a year-over-year decline, but it is simply another way of describing regression to the mean.
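The averages in the three tables can be recomputed with a few lines of Python. This is only a sketch of the arithmetic, using the rounded values printed in the tables above; the variable names are mine, and because the inputs are already rounded, the results can differ from the table's "Average" row in the last decimal place.

```python
from statistics import mean

# (player, rating year 1, rating year 2, vs rookie avg in year 1, vs 2nd-year avg in year 2)
# Values are copied from the tables in this post.
QBS = [
    ("Jameis Winston", 84.2, 75.9, 7.5, -4.5),
    ("Marcus Mariota", 91.5, 88.3, 14.8, 7.9),
    ("Robert Griffin III", 102.4, 82.2, 25.7, 1.8),
    ("Sam Bradford", 76.5, 70.5, -0.2, -9.9),
    ("Matt Ryan", 87.7, 80.9, 11.0, 0.5),
]

# Raw year-over-year change in QB rating
raw_change = mean(y2 - y1 for _, y1, y2, _, _ in QBS)

# How far the group sat from its own cohort's average in each year
vs_rookies = mean(r for _, _, _, r, _ in QBS)
vs_second_years = mean(s for _, _, _, _, s in QBS)

print(f"Average change, Year 1 to Year 2: {raw_change:.2f}")
print(f"Average vs rookie average (Year 1): {vs_rookies:.2f}")
print(f"Average vs 2nd-year average (Year 2): {vs_second_years:.2f}")
```

The first number is the apparent slump; the second and third show that the group started well above its cohort and ended up roughly at the cohort average.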

Tuesday, October 18, 2016

"What are the odds?" That I Get a Parking Ticket at FIU

"I've been parking in a construction lot at FIU since June with no problem. I got ticketed today for the first time, but need to continue parking there through December. Should I purchase a parking permit, pay for hourly/daily parking, or keep risking it?"

This question is a straightforward cost/benefit analysis if we can gauge the expected cost of each of these options. So here are our parameters:

Timeline: June to December (28 weeks)
Frequency: 3 times a week
Cost of Parking Permit: $140
Cost of Daily Parking: $8/day
Cost of Parking Ticket: $20

Option A: Purchase a parking permit
The only factor to consider here is the cost of the permit, which is $140. Upon obtaining that, there is a 0% chance of getting fined further.

Option B: Pay for daily parking
19 weeks have elapsed already, with 9 weeks to go. At 3 times a week * 9 weeks * $8, that's a cost of $216. As with Option A, there is a 0% chance of getting fined further.

Option C: Risk it
This is the option that takes a bit more math. First we have to gauge the probability of getting caught. Through 19 weeks, they've gotten fined once. If parking patrol is actively looking for this same car now, then the probability of getting caught again is higher. However, for simplicity we'll assume it's evenly distributed: 1/(19*3) = 1/57 = 1.75%. Assuming the fine doesn't escalate for repeat offenders, we take 1.75% * 9 weeks remaining * 3 days a week * $20 = $9.47.

Clearly risking it is the superior strategy, given our assumptions and your level of risk aversion.
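The comparison of the three options can be written out as a short calculation. This is a sketch under the same assumptions as above (a constant per-visit ticket probability of 1/57 and a flat $20 fine); the constant names are mine. The break-even probability at the end is an extra figure derived from the same parameters, not something from the original question.

```python
WEEKS_ELAPSED = 19
WEEKS_LEFT = 9
VISITS_PER_WEEK = 3
PERMIT_COST = 140.0
DAILY_COST = 8.0
TICKET_COST = 20.0
TICKETS_SO_FAR = 1

visits_so_far = WEEKS_ELAPSED * VISITS_PER_WEEK   # 57 visits already made
visits_left = WEEKS_LEFT * VISITS_PER_WEEK        # 27 visits remaining

# Assumption: every visit carries the same, independent chance of a ticket
p_ticket = TICKETS_SO_FAR / visits_so_far

expected_cost = {
    "A: permit": PERMIT_COST,
    "B: daily parking": visits_left * DAILY_COST,
    "C: risk it": p_ticket * visits_left * TICKET_COST,
}

for option, cost in expected_cost.items():
    print(f"{option}: ${cost:.2f}")

# Per-visit ticket probability at which risking it costs as much as the permit
break_even_p = PERMIT_COST / (visits_left * TICKET_COST)
print(f"Break-even ticket probability vs the permit: {break_even_p:.1%}")
```

Under these assumptions, patrols would have to ticket on roughly one visit in four before the permit becomes the cheaper choice in expectation.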

Monday, October 17, 2016

"What are the odds?" That My Softball Team Makes the Co-Ed Novice League Playoff for 1st Place

The following is a shameless plug for my Clark County Co-Ed Novice softball team in league 480SBM04 - Adult Softball League Co-Ed Novice Mon, "Analyze This":

We're currently 1 game out of first place with the final two games of the season tonight, and if there's a tie for first, those teams involved enter a playoff for first in the league. I applied the MDS Model to our league to gauge the "true" win percentage for each team by runs scored/runs against (adjusted for strength of schedule):


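| Rank | Home Team | RF | RA | Pyth |
|---|---|---|---|---|
| 1 | We Are Here To Drink | 200 | 80 | 0.842 |
| 2 | Bunting for Pitches | 181 | 85 | 0.800 |
| 3 | Analyze This | 170 | 87 | 0.773 |
| 4 | Comic Relief | 142 | 108 | 0.623 |
| 5 | Pitch Better Have My Money | 101 | 149 | 0.329 |
| 6 | Pitches Get Stitches | 72 | 125 | 0.267 |
| 7 | Somerset Stephanie | 74 | 146 | 0.224 |
| 8 | Bright Futures Pediatrics | 23 | 183 | 0.022 |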

Here's the scenario: we're playing "Pitch Better Have My Money" twice, with the top 2 teams in the league, "We Are Here to Drink" and "Bunting for Pitches", also playing each other twice. If "Bunting for Pitches" can win one or both of those games and we sweep our series, we'll be tied for first place and will head to the playoff.

So I calculated the odds of this happening, since that scenario is very straightforward. Using our Pyth ratings to project win probabilities, we have an 87.41% chance of beating our opponent in each game (favored by 4.93 runs in each game), which gives us a 76.40% shot to sweep them 2-0.

There's a 67.18% chance "Bunting for Pitches" beats "We Are Here to Drink" either once (in which case we would face "We Are Here to Drink") or twice (in which case we would face "Bunting for Pitches"). 

Assuming the two series are independent gives 67.18% * 76.40% = 51.33% chance of making the playoff.
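The chain of calculations above can be sketched in Python. Two pieces here are my assumptions rather than anything stated in the post: a Pythagorean exponent of 1.83 (a common baseball value) and the log5 formula for head-to-head win probability. The sketch also ignores the strength-of-schedule adjustment. With those caveats, it appears to land within rounding of the percentages quoted above.

```python
PYTH_EXPONENT = 1.83  # assumption: common baseball exponent, not confirmed by the post


def pyth(runs_for, runs_against, exponent=PYTH_EXPONENT):
    """Pythagorean expected win percentage from runs scored and allowed."""
    rf, ra = runs_for ** exponent, runs_against ** exponent
    return rf / (rf + ra)


def log5(a, b):
    """Assumed head-to-head model: chance a team with win pct a beats one with win pct b."""
    return a * (1 - b) / (a * (1 - b) + (1 - a) * b)


# RF/RA values from the table above
analyze_this = pyth(170, 87)
pitch_better = pyth(101, 149)
here_to_drink = pyth(200, 80)
bunting = pyth(181, 85)

p_win_game = log5(analyze_this, pitch_better)
p_sweep = p_win_game ** 2                       # win both games

p_drink_wins_game = log5(here_to_drink, bunting)
p_bunting_wins_any = 1 - p_drink_wins_game ** 2  # Bunting wins once or twice

# Assumes the two series are independent, as in the post
p_playoff = p_sweep * p_bunting_wins_any

print(f"Win each game: {p_win_game:.2%}, sweep: {p_sweep:.2%}")
print(f"Bunting wins at least once: {p_bunting_wins_any:.2%}")
print(f"Playoff for first: {p_playoff:.2%}")
```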

Tuesday, September 27, 2016

The 100-Win Cubs and Competitive Balance in MLB

In today's article on ESPN about the 100-win Cubs, David Schoenfield writes, "(W)e’ve had more parity in the past decade, making 95- and 100-win seasons increasingly rare." He measures this by comparing the number of teams with 95+ wins since MLB expanded to 30 teams in 1998:


1998-2006: 44 teams with 95-plus wins
2007-2016: 33 teams with 95-plus wins

I sought to see if a more mathematical calculation of competitive balance (CB) backs this up. I've looked at CB in the NBA in the past, and I applied the same technique to MLB (since 1998) by calculating the Noll-Scully Measure (NSM), which is the ratio of the actual standard deviation of team winning percentages in each season to the "ideal" standard deviation for that season. The lower the NSM, the greater the CB in the league, with a perfectly balanced league having a ratio of 1.

In the two periods Schoenfield indicated, there is indeed a difference in the NSM: it is much higher from 1998-2006, which implies less competitive balance.

1998-2006: 1.95 average NSM
2007-2016: 1.72 average NSM

For perspective, this season it is currently 1.724, which is right in line with the last decade, an era of more parity.
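For reference, here is a minimal sketch of the Noll-Scully calculation for a single season. It assumes the usual form of the "ideal" standard deviation, 0.5 / sqrt(games played), and uses the population standard deviation of winning percentages; the five-team win totals are made up purely to show the call and are not MLB data.

```python
from math import sqrt
from statistics import pstdev


def noll_scully(wins, games):
    """Ratio of the actual SD of win percentages to the ideal SD for the season.

    wins:  list of win totals, one per team
    games: games played by each team (assumed equal for every team)
    """
    win_pcts = [w / games for w in wins]
    actual_sd = pstdev(win_pcts)     # population SD (an assumption; some use sample SD)
    ideal_sd = 0.5 / sqrt(games)     # SD expected if every game were a coin flip
    return actual_sd / ideal_sd


# Hypothetical 5-team league over a 162-game schedule (illustrative numbers only)
example_wins = [100, 90, 81, 72, 62]
print(f"NSM: {noll_scully(example_wins, 162):.2f}")
```

A value near 1 means the spread in records is about what chance alone would produce; values near 2, like the MLB figures above, mean the spread is about twice that.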

Thursday, September 22, 2016

"What are the odds?" That a Survey with 27 Choices and 1000 Participants Will End in a Tie

"We have a survey of 27 choices. There will be approximately 1,000 people voting, and they can only vote once and only make one choice each. What are the chances of a tie?"

Even without doing any math, it's obvious the number of combinations over 27 choices and 1,000 votes makes it very unlikely that there will be a tie. However, we can determine this probability indirectly using simulation.

I assumed that exactly 1,000 people will vote, and that each choice has an equal 1/27 chance of being chosen (this is almost certainly not true, but for simplicity it is the most basic assumption we can make).

I wrote a simulator in Python, and ran 10,000 simulations and tallied how often the survey resulted in a tie for first place: 14.34%.
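A simulator along these lines might look like the sketch below. It is not the original script: the function names and the seed parameter are mine, and it makes the same assumptions as above (exactly 1,000 voters, each choosing uniformly at random among 27 options). Because it is a Monte Carlo estimate, the result will vary a little from run to run.

```python
import random
from collections import Counter


def has_tie_for_first(counts):
    """True if two or more choices share the highest vote count."""
    top = max(counts)
    return sum(1 for c in counts if c == top) > 1


def tie_probability(n_choices=27, n_voters=1000, n_sims=10_000, seed=None):
    """Estimate the chance of a first-place tie by repeated random surveys."""
    rng = random.Random(seed)
    ties = 0
    for _ in range(n_sims):
        # Each voter picks one choice uniformly at random
        votes = Counter(rng.choices(range(n_choices), k=n_voters))
        ties += has_tie_for_first(list(votes.values()))
    return ties / n_sims


if __name__ == "__main__":
    print(f"Estimated chance of a tie for first: {tie_probability():.2%}")
```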

Saturday, August 27, 2016

Preseason NCAAF Rankings for 2016

As I did last year and the year before, I've created a new set of preseason NCAAF rankings that take into account player turnover and recruiting classes. The following is literally copied and pasted from last year's write-up so you don't have to click that first link:

 As before, I used my final Composite ratings from the MDS Model (from last season) as the base (which takes into account both a forward-looking predictive component and a past-performance only retrodictive component), and then factored in ESPN's Preseason FPI and the S&P+ projections, both of which take into account player changes on each team. Once the season starts, this "preseason" rating will be faded out as the season progresses, carrying less and less weight with each ensuing week.

In the below list, the "Trend" indicates whether the respective team's new ranking rose or fell relative to the average of last year's end-of-season Composite ratings. I know this is a day late, so the Hawaii and Cal game is not yet included (these are purely preseason ratings).

Saturday, June 18, 2016

Who Would Win a 1-on-1 Tournament of Today's Best NBA Players?

A while back, NBA Memes (Facebook, Twitter) posted the following question:


So I thought I'd try to answer this question. I wrote a pickup basketball simulator that plays out games using NBA stats (from Basketball Reference). Each possession follows these steps of a standard pickup game (stats used in parentheses):
  1. If the defensive player steals the ball, the possession ends (Steal %)
  2. Otherwise, determine what type of shot is taken (3PA/FGA)
  3. If it's a 2-pointer and the defensive player blocks the shot, the possession ends (Block %)
  4. If the shot is a 3-pointer or isn't blocked, determine whether it is made (2PM %, 3PM %)
  5. On a miss, determine which player gets the rebound (OReb %, DReb %)
  6. If the offensive player rebounds the miss, the possession starts over
The following rules are enforced:
  1. Standard pickup scoring is applied: 2's and 1's (3-pointers count as 2, 2-pointers count as 1)
  2. No free throws. You don't shoot free throws in pickup
  3. The games are to 21, win by 2
Right off the bat, it's fairly obvious Curry (and the other 3-point shooters) have a big advantage due to the 2's and 1's rule (Grantland wrote about this scoring advantage). 
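The possession flow and rules listed above can be sketched roughly as follows. This is an illustration, not the original simulator: the `Player` fields, the rule that possession alternates after every trip, and the way the offensive rebound chance is formed from the two players' rebound rates are all my assumptions, and the two players at the bottom are made up rather than real NBA stat lines.

```python
import random
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    steal_pct: float   # chance of a steal when defending a possession
    block_pct: float   # chance of blocking a 2-point attempt when defending
    three_rate: float  # share of shots taken as 3-pointers (3PA/FGA)
    two_pct: float     # 2-point make rate
    three_pct: float   # 3-point make rate
    oreb_pct: float    # offensive rebound rate
    dreb_pct: float    # defensive rebound rate


def play_possession(off, dfn, rng):
    """Return pickup points (0, 1, or 2) scored by `off` on one possession."""
    while True:
        if rng.random() < dfn.steal_pct:
            return 0                                  # step 1: steal
        is_three = rng.random() < off.three_rate      # step 2: shot type
        if not is_three and rng.random() < dfn.block_pct:
            return 0                                  # step 3: blocked 2-pointer
        make_pct = off.three_pct if is_three else off.two_pct
        if rng.random() < make_pct:
            return 2 if is_three else 1               # step 4: 2's and 1's scoring
        # Step 5: assumed rebound model, offense's share of the two rates
        p_off_reb = off.oreb_pct / (off.oreb_pct + dfn.dreb_pct)
        if rng.random() >= p_off_reb:
            return 0
        # Step 6: offensive rebound, so the possession starts over


def play_game(first, second, rng, target=21):
    """Play to `target`, win by 2. Returns 0 if `first` wins, 1 if `second` wins."""
    players, scores, off = (first, second), [0, 0], 0
    while True:
        scores[off] += play_possession(players[off], players[1 - off], rng)
        if max(scores) >= target and abs(scores[0] - scores[1]) >= 2:
            return 0 if scores[0] > scores[1] else 1
        off = 1 - off  # assumption: possession alternates after every trip


def win_probability(a, b, n_games=10_000, seed=None):
    """Share of simulated games won by `a`, alternating who gets the ball first."""
    rng = random.Random(seed)
    wins = 0
    for g in range(n_games):
        if g % 2 == 0:
            wins += play_game(a, b, rng) == 0
        else:
            wins += play_game(b, a, rng) == 1
    return wins / n_games


# Hypothetical stat lines, for illustration only
shooter = Player("Shooter", 0.03, 0.01, 0.55, 0.52, 0.44, 0.03, 0.13)
slasher = Player("Slasher", 0.02, 0.02, 0.20, 0.55, 0.31, 0.05, 0.18)
print(f"Shooter beats Slasher in {win_probability(shooter, slasher, seed=0):.2%} of games")
```

Changing `target` in `play_game` is an easy way to experiment with how game length affects the spread of outcomes.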

After simulating each possible 1-on-1 combination 10,000 times, another wrinkle jumped out at me: one of these things is not like the others. I used this past NBA season's statistics, and Carmelo is past his prime at this point. He loses to every other player in this hypothetical tournament (and in most cases it's not even close), so for simplicity, I've dropped him to make an 8-player bracket. Here is the win probability matrix for every possible matchup, including Carmelo (each row shows that player's chance of beating the player in each column):

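| Win Prob | Curry | LeBron | Leonard | Durant | Harden | Davis | Westbrook | George | Carmelo |
|---|---|---|---|---|---|---|---|---|---|
| Curry | 0.00% | 57.36% | 58.86% | 64.82% | 69.33% | 66.30% | 71.07% | 74.60% | 82.80% |
| LeBron | 42.64% | 0.00% | 58.06% | 60.23% | 60.09% | 68.45% | 73.83% | 74.30% | 85.20% |
| Leonard | 41.14% | 41.94% | 0.00% | 55.67% | 57.34% | 56.93% | 65.05% | 69.02% | 78.56% |
| Durant | 35.18% | 39.77% | 44.33% | 0.00% | 51.96% | 52.83% | 60.53% | 64.91% | 73.34% |
| Harden | 30.67% | 39.91% | 42.66% | 48.04% | 0.00% | 53.63% | 57.66% | 59.46% | 72.35% |
| Davis | 33.70% | 31.55% | 43.07% | 47.17% | 46.37% | 0.00% | 61.86% | 63.25% | 74.58% |
| Westbrook | 28.93% | 26.17% | 34.95% | 39.47% | 42.34% | 38.14% | 0.00% | 55.34% | 64.04% |
| George | 25.40% | 25.70% | 30.98% | 35.09% | 40.54% | 36.75% | 44.66% | 0.00% | 58.61% |
| Carmelo | 17.20% | 14.80% | 21.44% | 26.66% | 27.65% | 25.42% | 35.96% | 41.39% | 0.00% |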

I seeded each player based on their predicted overall win totals, which created the following bracket:

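| Seed | Player | Round 1 | Round 2 | Round 3 |
|---|---|---|---|---|
| 1 | Curry | 74.60% | 49.97% | 30.28% |
| 8 | George | 25.40% | 9.58% | 2.95% |
| 4 | Durant | 51.96% | 22.20% | 10.04% |
| 5 | Harden | 48.04% | 18.25% | 8.15% |
| 3 | Leonard | 56.93% | 27.32% | 13.66% |
| 6 | Davis | 43.07% | 17.00% | 7.11% |
| 2 | LeBron | 73.83% | 46.17% | 24.36% |
| 7 | Westbrook | 26.17% | 9.51% | 3.45% |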

Curry wins the tournament most often, but LeBron isn't far behind. The two of them combine to win 54.64% of the time, which favors the pair over the other six players in the field.
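The round-by-round bracket odds follow directly from the win probability matrix: a player's chance of winning a round is their chance of reaching it, multiplied by the chance of beating each possible opponent weighted by how likely that opponent is to be there. The sketch below shows one way to compute that. The function and variable names are mine, and the inputs are the rounded percentages from the matrix above (Carmelo excluded), so results can differ from the bracket table in the last digit.

```python
PLAYERS = ["Curry", "LeBron", "Leonard", "Durant", "Harden", "Davis", "Westbrook", "George"]

# Row player's win % against each column player, copied from the matrix above
ROWS = [
    [0.00, 57.36, 58.86, 64.82, 69.33, 66.30, 71.07, 74.60],
    [42.64, 0.00, 58.06, 60.23, 60.09, 68.45, 73.83, 74.30],
    [41.14, 41.94, 0.00, 55.67, 57.34, 56.93, 65.05, 69.02],
    [35.18, 39.77, 44.33, 0.00, 51.96, 52.83, 60.53, 64.91],
    [30.67, 39.91, 42.66, 48.04, 0.00, 53.63, 57.66, 59.46],
    [33.70, 31.55, 43.07, 47.17, 46.37, 0.00, 61.86, 63.25],
    [28.93, 26.17, 34.95, 39.47, 42.34, 38.14, 0.00, 55.34],
    [25.40, 25.70, 30.98, 35.09, 40.54, 36.75, 44.66, 0.00],
]
WIN = {p: {q: ROWS[i][j] / 100 for j, q in enumerate(PLAYERS)}
       for i, p in enumerate(PLAYERS)}

# Bracket order, top to bottom: 1 v 8, 4 v 5, 3 v 6, 2 v 7
BRACKET = ["Curry", "George", "Durant", "Harden", "Leonard", "Davis", "LeBron", "Westbrook"]


def advance_odds(bracket, win):
    """Return one dict per round mapping player -> probability of winning that round."""
    reach = {p: 1.0 for p in bracket}  # everyone starts in round 1
    rounds = []
    size = 2                           # players per sub-bracket in the current round
    while size <= len(bracket):
        nxt = {}
        for start in range(0, len(bracket), size):
            half = size // 2
            left = bracket[start:start + half]
            right = bracket[start + half:start + size]
            for side, other in ((left, right), (right, left)):
                for p in side:
                    # Weight each possible opponent by their chance of getting here
                    nxt[p] = reach[p] * sum(reach[q] * win[p][q] for q in other)
        rounds.append(nxt)
        reach = nxt
        size *= 2
    return rounds


for player in BRACKET:
    odds = [f"{r[player]:.2%}" for r in advance_odds(BRACKET, WIN)]
    print(player, *odds)
```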

Per the win % matrix, the top 5 least competitive matchups (discounting Carmelo):

| Rank | Winner | Loser | Win % |
|---|---|---|---|
| 1 | Curry | George | 74.60% |
| 2 | LeBron | George | 74.30% |
| 3 | LeBron | Westbrook | 73.83% |
| 4 | Curry | Westbrook | 71.07% |
| 5 | Curry | Harden | 69.33% |

And the top 5 most competitive matchups:

| Rank | Winner | Loser | Win % |
|---|---|---|---|
| 1 | Durant | Harden | 51.96% |
| 2 | Durant | Davis | 52.83% |
| 3 | Harden | Davis | 53.63% |
| 4 | Westbrook | George | 55.34% |
| 5 | Leonard | Durant | 55.67% |

Some interesting things that jumped out at me:
  • Per the simulations, advantages are not transitive. For example, Curry is favored over LeBron, but he beats Davis (66.30%) less often than LeBron beats Davis (68.45%). 
  • Including defensive performance is huge. Without it, Curry beats LeBron almost 70% of the time, as opposed to 57.36% in the final simulation.
  • The length of game matters. If the games are to 11, then the spread of competitive balance is much smaller (games are much closer to 50/50).
Finally, here are the associated predicted margins of victory for each matchup:

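| MOV | Curry | LeBron | Leonard | Durant | Harden | Davis | Westbrook | George | Carmelo |
|---|---|---|---|---|---|---|---|---|---|
| Curry | 0.00 | 1.13 | 1.43 | 2.37 | 3.29 | 2.39 | 3.39 | 4.25 | 5.64 |
| LeBron | -1.13 | 0.00 | 1.24 | 1.33 | 1.68 | 2.78 | 3.79 | 4.07 | 5.99 |
| Leonard | -1.43 | -1.24 | 0.00 | 0.88 | 1.23 | 0.95 | 2.34 | 3.14 | 4.63 |
| Durant | -2.37 | -1.33 | -0.88 | 0.00 | 0.32 | 0.35 | 1.65 | 2.46 | 3.81 |
| Harden | -3.29 | -1.68 | -1.23 | -0.32 | 0.00 | 0.57 | 1.21 | 1.61 | 3.77 |
| Davis | -2.39 | -2.78 | -0.95 | -0.35 | -0.57 | 0.00 | 1.78 | 2.14 | 3.91 |
| Westbrook | -3.39 | -3.79 | -2.34 | -1.65 | -1.21 | -1.78 | 0.00 | 0.92 | 2.17 |
| George | -4.25 | -4.07 | -3.14 | -2.46 | -1.61 | -2.14 | -0.92 | 0.00 | 1.42 |
| Carmelo | -5.64 | -5.99 | -4.63 | -3.81 | -3.77 | -3.91 | -2.17 | -1.42 | 0.00 |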