## Monday, August 24, 2020

### "What are the odds?" An MLB Team Hits a Grand Slam Four Games in a Row

Last week, the San Diego Padres hit a grand slam in four consecutive games (all against the Texas Rangers) - something that had never been done before:

As CBS Sports points out above, there have been roughly 407,000 games in MLB history - so what were the odds that it hadn't happened yet?

I simulated each game of the Padres/Rangers series (2 in Arlington, 2 in San Diego) and estimate the probability of hitting a grand slam in each game as:

| GameNum | Team | ProbGrandSlam |
|---|---|---|
| 1 | SDP | 5.64% |
| 1 | TEX | 2.27% |
| 2 | SDP | 4.77% |
| 2 | TEX | 2.16% |
| 3 | SDP | 2.58% |
| 3 | TEX | 1.63% |
| 4 | SDP | 3.17% |
| 4 | TEX | 1.54% |

Multiplying the four per-game probabilities together for each team results in a truly unlikely series of events:

| Team | Prob4InARow | 1 in... |
|---|---|---|
| SDP | 0.000220% | 454,489 |
| TEX | 0.000012% | 8,124,789 |
| Avg | 0.000116% | 860,825 |

So how likely is it that this had never happened over the course of Major League history? There are slightly fewer four-game sequences than games, since the first three games of a season can't complete a streak of four. As an estimate, I removed 3 games per year times 117 years = 351 games from the 407,000, which gives approximately 406,649 sequences of four games in a row.

Using this San Diego/Texas series as a proxy, there is roughly a 99.999884% chance that a four game stretch does NOT have a grand slam in each game (1 - 0.000116%).

So 99.999884% ^ 406,649 four-game sequences = 62.35%, the probability that this had not happened yet. Put another way, there was only a 37.65% chance that the feat would already have occurred at least once by this point in MLB history.
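
The arithmetic above can be retraced in a few lines of Python. This is only a sketch using the rounded per-game probabilities from the table, so the last digits may differ slightly from the figures quoted; the variable names are my own.

```python
from math import prod

# Per-game grand slam probabilities from the table above (rounded values).
sdp = [0.0564, 0.0477, 0.0258, 0.0317]
tex = [0.0227, 0.0216, 0.0163, 0.0154]

# Probability that a team hits a grand slam in all four games.
p_sdp = prod(sdp)
p_tex = prod(tex)
p_avg = (p_sdp + p_tex) / 2

# Approximate number of four-game sequences in MLB history:
# ~407,000 games minus 3 games per season over 117 seasons.
sequences = 407_000 - 3 * 117

# Treating every sequence as independent with probability p_avg
# (a simplifying assumption), the chance that no sequence has the feat:
p_never = (1 - p_avg) ** sequences

print(f"SDP: {p_sdp:.6%} (1 in {1 / p_sdp:,.0f})")
print(f"TEX: {p_tex:.6%} (1 in {1 / p_tex:,.0f})")
print(f"Never happened: {p_never:.2%}, happened at least once: {1 - p_never:.2%}")
```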

Therefore, it might be fair to guess that the baseball gods were punishing the Rangers for griping about Tatis hitting the first grand slam on a 3-0 count late in a blowout.

## Sunday, August 23, 2020

### Effect of Serve Rules/Scoring in Ping Pong/Table Tennis

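In ping pong, there seem to be three primary rule variations around switching which player serves:
• Switch serve every 2 points
• Switch serve every 5 points
• Switch serve when the server loses the point, with no point recorded (only the server can score)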

How do these rule variations benefit/hurt the better player?

In all cases, I'm going to assume a best-of-5 series (first player to win 3 games wins), and each game is to 11, win by 2.

To calibrate the simulator, I'm assigning a "favorite" and an "underdog", where the "favorite" has a slightly higher chance of winning a point when they serve than when they return:
• Favorite wins point on serve: 55%
• Favorite wins point on return: 50%
I then ran each rule set described above 10,000 times, with the "favorite" serving first. The probability the "favorite" wins a best-of-5 match:
• Switch every 2 points: 73.83%
• Switch every 5 points: 73.93%
• Switch serve on lost point, no point recorded: 76.69%
So switching serve when the returning player wins the point (and not recording a point) is a noticeable advantage to the better player when they serve first. Because points are only recorded while serving, this rule effectively lengthens the game. It's a well-known phenomenon that the shorter the game, the more randomness is exhibited, and the better the chances are for the underdog.

However, most of that edge disappears when the "underdog" serves first, and only the third rule set is meaningfully affected. The probability the "favorite" wins becomes:
• Switch every 2 points: 73.58%
• Switch every 5 points: 73.77%
• Switch serve on lost point, no point recorded: 74.39%
The first two serving patterns (switching every 2 or 5 points) are more fair, since switching serve is independent of who scored points, and results in virtually the same win probability regardless of who serves first.
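
For readers who want to experiment, here is a minimal sketch of how the three serve rules could be simulated. It is not the simulator used for the numbers above and should not be expected to reproduce them exactly. It assumes that every rally is independent, that the fixed rotation keeps switching every 2 or 5 points even at deuce, and that the first server alternates from game to game. All function and parameter names are hypothetical.

```python
import random
from typing import Optional

P_FAV_SERVE = 0.55   # favorite wins a rally on their own serve
P_FAV_RETURN = 0.50  # favorite wins a rally on the underdog's serve


def play_game(switch_every: Optional[int], fav_serves: bool, rng: random.Random) -> bool:
    """Play one game to 11, win by 2. Returns True if the favorite wins.

    switch_every=2 or 5: serve changes after that many points.
    switch_every=None:   serve changes when the server loses the rally,
                         and no point is recorded for that rally.
    """
    fav = dog = 0
    points_on_serve = 0  # points played since the serve last changed
    while True:
        p = P_FAV_SERVE if fav_serves else P_FAV_RETURN
        fav_wins_rally = rng.random() < p
        if switch_every is None:
            if fav_wins_rally == fav_serves:  # the server won the rally
                if fav_serves:
                    fav += 1
                else:
                    dog += 1
            else:                             # the server lost: serve changes, no point
                fav_serves = not fav_serves
        else:
            if fav_wins_rally:
                fav += 1
            else:
                dog += 1
            points_on_serve += 1
            if points_on_serve == switch_every:
                fav_serves = not fav_serves
                points_on_serve = 0
        if max(fav, dog) >= 11 and abs(fav - dog) >= 2:
            return fav > dog


def play_match(switch_every: Optional[int], fav_serves_first: bool, rng: random.Random) -> bool:
    """Best-of-5 match. Assumption: the first server alternates each game."""
    fav_games = dog_games = 0
    fav_serves = fav_serves_first
    while fav_games < 3 and dog_games < 3:
        if play_game(switch_every, fav_serves, rng):
            fav_games += 1
        else:
            dog_games += 1
        fav_serves = not fav_serves
    return fav_games == 3


def favorite_win_rate(switch_every: Optional[int], fav_serves_first: bool = True,
                      n: int = 10_000, seed: int = 0) -> float:
    """Fraction of n simulated matches won by the favorite."""
    rng = random.Random(seed)
    wins = sum(play_match(switch_every, fav_serves_first, rng) for _ in range(n))
    return wins / n


if __name__ == "__main__":
    for label, rule in [("every 2", 2), ("every 5", 5), ("on lost point", None)]:
        print(label, favorite_win_rate(rule, True), favorite_win_rate(rule, False))
```

With 10,000 matches per setting, the sampling error on each win rate is a bit under half a percentage point, so differences of a tenth of a point between rule sets are within the noise.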

## Friday, August 21, 2020

### NBA Playoffs: Comparing Simulation Output vs SRS Model

Originally, I ran my play-by-play NBA simulator on this year's playoffs to estimate each team's chances, and then separately simplified those results to an SRS model so each team could easily be directly compared.

But if I run that SRS model back through the simulator, how would the predictions change?

The original projections were:

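| Seed | Conference | Team | Round 2 | Conf Finals | Finals | Champion |
|---|---|---|---|---|---|---|
| 1 | East | MIL | 84.4% | 49.2% | 34.4% | 22.4% |
| 8 | East | ORL | 15.6% | 3.1% | 0.9% | 0.3% |
| 4 | East | IND | 18.3% | 4.1% | 1.4% | 0.4% |
| 5 | East | MIA | 81.7% | 43.6% | 28.8% | 17.5% |
| 3 | East | BOS | 65.0% | 32.2% | 11.0% | 4.8% |
| 6 | East | PHI | 35.0% | 10.3% | 2.4% | 0.7% |
| 2 | East | TOR | 83.0% | 51.5% | 20.0% | 10.1% |
| 7 | East | BKN | 17.0% | 5.9% | 1.2% | 0.3% |

| Seed | Conference | Team | Round 2 | Conf Finals | Finals | Champion |
|---|---|---|---|---|---|---|
| 1 | West | LAL | 62.5% | 28.2% | 13.6% | 5.4% |
| 8 | West | POR | 37.5% | 12.5% | 4.1% | 1.1% |
| 4 | West | HOU | 44.3% | 23.8% | 11.0% | 4.4% |
| 5 | West | OKC | 55.7% | 35.5% | 19.7% | 9.4% |
| 3 | West | DEN | 46.3% | 12.8% | 4.7% | 1.5% |
| 6 | West | UTA | 53.7% | 19.6% | 8.4% | 3.0% |
| 2 | West | LAC | 60.6% | 43.6% | 26.3% | 13.5% |
| 7 | West | DAL | 39.4% | 24.0% | 12.3% | 5.2% |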

The SRS model then gave these relative ratings:

| Team | MMult Matrix | Rank |
|---|---|---|
| MIA | 3.74 | 1 |
| MIL | 3.72 | 2 |
| LAC | 3.11 | 3 |
| OKC | 2.32 | 4 |
| TOR | 2.05 | 5 |
| DAL | 1.76 | 6 |
| LAL | 1.44 | 7 |
| HOU | 1.09 | 8 |
| BOS | 0.61 | 9 |
| UTA | -0.18 | 10 |
| DEN | -2.18 | 11 |
| POR | -2.37 | 12 |
| PHI | -3.12 | 13 |
| IND | -3.51 | 14 |
| ORL | -3.68 | 15 |
| BKN | -4.10 | 16 |

So I then have to run these ratings through Log5, converting the expected margin of victory (MOV) to a single-game win probability using a standard deviation of 13.47 points for the NBA, and then simulate each round again (or do the math explicitly).

For example, take the LAC/DAL series. The original simulation output had:
• LAC single game win probability: 54.88%
• Average MOV: 1.65
• Over a 7 game series, this is equivalent to: 60.57% series win probability
Now let's take the above ratings. We have to invert the first calculation:
• LAC rating - DAL rating = 3.11 - 1.76: 1.35 average MOV
• Normal distribution; mean = 0, standard deviation = 13.47, x = 1.35: 53.99% LAC single game win probability
• Over a 7 game series, this is equivalent to: 58.67% series win probability
• The full math on this is at the end of this post
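
The conversion between margin of victory and single-game win probability in the bullets above can be sketched with Python's standard library. This assumes the margin is normally distributed with a standard deviation of 13.47 points, as stated in the post; the function names are my own.

```python
from statistics import NormalDist

NBA_MOV_SD = 13.47  # standard deviation of NBA margin of victory, from the post


def win_prob_from_mov(mov: float, sd: float = NBA_MOV_SD) -> float:
    """Single-game win probability implied by an expected margin of victory."""
    return NormalDist(mu=0, sigma=sd).cdf(mov)


def mov_from_win_prob(p: float, sd: float = NBA_MOV_SD) -> float:
    """Inverse calculation: expected margin implied by a single-game win probability."""
    return NormalDist(mu=0, sigma=sd).inv_cdf(p)


# LAC rating minus DAL rating = 3.11 - 1.76 = 1.35 points
print(f"{win_prob_from_mov(3.11 - 1.76):.2%}")  # about 53.99%
# Original simulation: 54.88% single-game win probability
print(f"{mov_from_win_prob(0.5488):.2f}")       # about 1.65
```
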
Running this through the playoff bracket gives the following probabilities:

| Seed | Conference | Team | Round 2 | Conf Finals | Finals | Champion |
|---|---|---|---|---|---|---|
| 1 | East | MIL | 88.5% | 48.1% | 32.1% | 19.9% |
| 8 | East | ORL | 11.5% | 1.8% | 0.4% | 0.1% |
| 4 | East | IND | 12.0% | 2.0% | 0.5% | 0.1% |
| 5 | East | MIA | 88.0% | 48.0% | 32.1% | 20.0% |
| 3 | East | BOS | 72.8% | 33.9% | 11.0% | 4.7% |
| 6 | East | PHI | 27.2% | 7.0% | 1.0% | 0.2% |
| 2 | East | TOR | 84.1% | 54.6% | 22.3% | 11.5% |
| 7 | East | BKN | 15.9% | 4.5% | 0.5% | 0.1% |

| Seed | Conference | Team | Round 2 | Conf Finals | Finals | Champion |
|---|---|---|---|---|---|---|
| 1 | West | LAL | 73.2% | 34.9% | 17.0% | 7.0% |
| 8 | West | POR | 26.8% | 6.7% | 1.8% | 0.4% |
| 4 | West | HOU | 42.1% | 22.8% | 10.6% | 4.1% |
| 5 | West | OKC | 57.9% | 35.6% | 19.3% | 9.0% |
| 3 | West | DEN | 37.3% | 8.3% | 2.3% | 0.5% |
| 6 | West | UTA | 62.7% | 20.7% | 8.2% | 2.6% |
| 2 | West | LAC | 58.7% | 43.5% | 26.4% | 13.6% |
| 7 | West | DAL | 41.3% | 27.6% | 14.4% | 6.2% |

This gives the strange phenomenon where the Bucks are barely more likely than the Heat to reach the conference finals, yet the Heat are slightly more likely to make the Finals and win it all. The Bucks are marginally more likely to win their first round series, while the Heat are only the slightest of favorites in each game over the Bucks.

Nevertheless, we get different results! Directionally they're almost the same (same picks in the first and second round), but there are large differences in magnitude in these early rounds.

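| Seed | Conference | Team | Round 2 | Conf Finals | Finals | Champion |
|---|---|---|---|---|---|---|
| 1 | East | MIL | 4.1% | -1.0% | -2.3% | -2.5% |
| 8 | East | ORL | -4.1% | -1.3% | -0.5% | -0.2% |
| 4 | East | IND | -6.3% | -2.1% | -0.9% | -0.3% |
| 5 | East | MIA | 6.3% | 4.5% | 3.3% | 2.5% |
| 3 | East | BOS | 7.8% | 1.7% | -0.1% | -0.2% |
| 6 | East | PHI | -7.8% | -3.3% | -1.3% | -0.5% |
| 2 | East | TOR | 1.0% | 3.0% | 2.4% | 1.4% |
| 7 | East | BKN | -1.0% | -1.4% | -0.7% | -0.2% |

| Seed | Conference | Team | Round 2 | Conf Finals | Finals | Champion |
|---|---|---|---|---|---|---|
| 1 | West | LAL | 10.6% | 6.6% | 3.4% | 1.6% |
| 8 | West | POR | -10.6% | -5.8% | -2.3% | -0.7% |
| 4 | West | HOU | -2.2% | -1.0% | -0.4% | -0.3% |
| 5 | West | OKC | 2.2% | 0.1% | -0.4% | -0.4% |
| 3 | West | DEN | -9.0% | -4.5% | -2.3% | -1.0% |
| 6 | West | UTA | 9.0% | 1.0% | -0.1% | -0.4% |
| 2 | West | LAC | -1.9% | -0.1% | 0.1% | 0.1% |
| 7 | West | DAL | 1.9% | 3.6% | 2.1% | 1.0% |

#### Calculating Series Probability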

Neutral court makes this calculation much easier - we can just calculate each possible outcome (winning in 4, 5, 6, or 7 games).

Take our LAC/DAL example: 53.99% LAC win probability in any game. We just have to calculate the following outcomes, multiplied by the number of possible combinations for each series:
• Win in 4: WWWW, 8.5%, 1 possible outcome
• Win in 5: WWWLW, 3.91%, 4 possible outcomes
  • Think of it as 4 Choose 1 (nCr calculation): there are 4 places (games 1, 2, 3, 4) to put the 1 loss
• Win in 6: WWWLLW, 1.8%, 10 possible outcomes
  • 5 Choose 2 = 10
• Win in 7: WWWLLLW, 0.83%, 20 possible outcomes
  • 6 Choose 3 = 20

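$\binom{6}{3} = \frac{6!}{3!\,(6-3)!} = 20$

| Outcome | G1 | G2 | G3 | G4 | G5 | G6 | G7 | Win Series | Combos | Total Prob | Series Prob |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Win in 4 | 54% | 54% | 54% | 54% | | | | 8.50% | 1 | 8.50% | 58.67% |
| Win in 5 | 54% | 54% | 54% | 46% | 54% | | | 3.91% | 4 | 15.64% | |
| Win in 6 | 54% | 54% | 54% | 46% | 46% | 54% | | 1.80% | 10 | 17.99% | |
| Win in 7 | 54% | 54% | 54% | 46% | 46% | 46% | 54% | 0.83% | 20 | 16.55% | |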
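
The same table can be computed directly. The sketch below assumes a neutral court, so the single-game win probability is the same in every game, and that games are independent; `series_win_prob` is a name I made up for illustration.

```python
from math import comb


def series_win_prob(p: float, wins_needed: int = 4) -> float:
    """Probability of winning a best-of-(2 * wins_needed - 1) series.

    For each possible number of losses k (0 to wins_needed - 1), the series
    ends with a win, and the k losses can fall anywhere in the first
    wins_needed - 1 + k games: C(wins_needed - 1 + k, k) orderings.
    """
    q = 1 - p
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


print(f"{series_win_prob(0.5399):.2%}")  # SRS-based LAC probability, about 58.67%
print(f"{series_win_prob(0.5488):.2%}")  # original simulation, about 60.57%
```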