Monday, September 29, 2014

MLB Playoff Bracket and Inconsistency (Thanks to the AL)

This year's MLB postseason is full of inconsistency in how things could play out (per the Pyth component of the MDS Model):
  • Oakland is the best team but has the 4th best chance of winning it all
  • Across the board the AL is stronger, and the AL teams combine to win the World Series 58.78% of the time AND have home-field advantage... but Washington is the single most likely team to win it all because they only have to play an AL team once: in the World Series
  • The Los Angeles Angels are better than the Baltimore Orioles AND have home-field but are less likely to make it to the World Series because both Kansas City and Oakland are stronger than Detroit
  • Pittsburgh has to get out of the one-game Wild Card playoff, yet is more likely to win the World Series than St. Louis, the team they lost the NL Central to
  • Using the straight rankings (and factoring in home-field) predicts a different bracket than the straight probabilities


The Matrix Model Reflecting Actual Win %

I've been curious all season about how closely the Matrix Model (which only takes into account wins and losses) would reflect the actual win percentages of teams at the end of the MLB regular season. With 162 games and all teams connected with very low degrees of separation, I assumed that the residuals between the output of the Matrix and each team's actual win percentage would be very small.
In the above chart, the teams are sorted by win percentage. This indicates that (overall) the Matrix suppresses the "true" win percentages of good teams and raises those of bad teams, meaning the model is biased towards .500. The largest residual (in absolute value) was 0.010, and the sum of all residuals is 0.001, indicating that they are small and centered around 0, as designed.

The aim of the Matrix Model is to adjust each team's wins and losses by the strength of their opponents, so I checked to see if this was the case by looking at the residuals of the teams with the highest and lowest strengths of schedule (basically to check if the model is consistent with itself).

SOS Rank   Team   Residual
1          NYY    -0.002
2          MIN    -0.007
...
29         SF      0.009
30         LAD     0.010

Note: A negative residual indicates that the Matrix raised the team's win percentage, while a positive residual means the model lowered it. The above table indicates that a stronger SOS correlates with a higher matrix rating, which means the model is consistent.
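The sign convention in the note can be checked against the four rows shown in the table. The sketch below assumes the residual is defined as actual win percentage minus Matrix win percentage (inferred from the note, not stated outright); the function and variable names are made up for illustration.

```python
# Rows from the table above: (SOS rank, team, residual).
# Assumption: residual = actual win % - Matrix win %.
rows = [
    (1, "NYY", -0.002),
    (2, "MIN", -0.007),
    (29, "SF", 0.009),
    (30, "LAD", 0.010),
]


def matrix_direction(residual):
    """Describe what the Matrix did to a team's win percentage."""
    if residual < 0:
        return "raised"
    if residual > 0:
        return "lowered"
    return "unchanged"


hardest = [r for r in rows if r[0] <= 2]   # toughest schedules
easiest = [r for r in rows if r[0] >= 29]  # softest schedules

# The model is self-consistent on these rows if it raised the teams
# with the hardest schedules and lowered those with the easiest.
consistent = (
    all(matrix_direction(res) == "raised" for _, _, res in hardest)
    and all(matrix_direction(res) == "lowered" for _, _, res in easiest)
)

for rank, team, res in rows:
    print(rank, team, matrix_direction(res))
print("consistent:", consistent)
```

This only covers the four teams listed, so it is a spot check at the extremes rather than a test of the full 30-team relationship.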

Monday, September 22, 2014

Oakland's Slide: Regression to the Mean or the Cespedes Trade?

The Athletics were up 2 games on the Angels in the AL West on July 31, when Billy Beane dealt Yoenis Cespedes to Boston for Jon Lester and Jonny Gomes. Since then, they have been passed by the Angels and now sit 10.5 games back, facing an inevitable one-game wild card playoff, while the Angels have clinched the division. The Cespedes trade has been panned as the reason for this "collapse". I figured something more innocent was the reason for their slide in the standings: regression to the mean. The Athletics still have the best run differential in baseball (+150), although the Angels have nearly caught up (+149).

I calculated the Pythagorean-expected records for the Athletics and compared them with their actual records since the trade deadline:

As seen above, the A's have actually been unlucky (with respect to their expected record) the entire time, but this luck has gotten considerably worse since the trade. They've moved farther from the mean rather than regressing toward it, the opposite of what I expected.
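For readers unfamiliar with the calculation, here is a minimal sketch of a Pythagorean-expected record and the resulting "luck" figure. It assumes the classic exponent of 2; the exponent actually used for this post may differ (1.83 is a common alternative). The function names and the sample inputs are hypothetical and are not the A's real totals.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected win percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def wins_above_expected(wins, losses, runs_scored, runs_allowed, exponent=2.0):
    """Actual wins minus Pythagorean-expected wins (negative = unlucky)."""
    games = wins + losses
    expected_wins = games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)
    return wins - expected_wins


# Hypothetical team: 50-50 record while outscoring opponents 500 to 400.
luck = wins_above_expected(wins=50, losses=50, runs_scored=500, runs_allowed=400)
print(round(luck, 2))
```

A team that outscores its opponents by that much while playing .500 ball comes out well below zero, which is the sense of "unlucky" used here.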

I then compared the Angels' luck to that of the Athletics. As seen above, they're trending in opposite directions, and the wins above/below expected currently stand at LAA: +3.03, OAK: -9.63. The total difference between these is 12.66. So, if both teams were playing to their (quite high) Pythagorean expectations, Oakland would be up about 2 games: almost exactly where they were on July 31.
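The arithmetic behind that conclusion, spelled out as a short sketch. Like the paragraph above, it treats each win above or below expectation as one game in the standings, which is a simplification; the variable names are made up for illustration.

```python
# Figures quoted in the post.
laa_luck = 3.03     # Angels' wins above Pythagorean expectation
oak_luck = -9.63    # Athletics' wins below Pythagorean expectation
games_back = 10.5   # Oakland's current deficit in the AL West

# Total swing in the standings attributable to luck.
luck_gap = laa_luck - oak_luck

# Remove the luck from the standings: a positive value means Oakland leads.
implied_oak_lead = luck_gap - games_back

print(f"luck gap: {luck_gap:.2f} games")
print(f"implied Oakland lead: {implied_oak_lead:.2f} games")
```

The implied lead comes out to a little over 2 games, in line with the 2-game lead Oakland held on July 31.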

So is the absence of Cespedes the cause of this slide, since the team has actually continued to play worse? This article from CBS Sports provides more accurate reasoning as to why the A's have fared so poorly in the last month and a half: 
Did trading Cespedes weaken the A's offense? Absolutely. Is one man responsible for the club suddenly scoring 1.29 fewer runs per game? Not a chance. Even if you fully buy into the idea of lineup protection and all that, slumps like this take a total team effort.