Sunday, July 13, 2014

Including Starting Pitchers in the MLB Model

A major flaw in the MLB version of the MDS Model is that it does not account for starting pitching matchups, which greatly affect each team's chances of winning. For example, the Chicago White Sox are a middling team, but when Chris Sale is on the mound, their odds of winning are much better than their record indicates. The reverse applies too: a good team with a bad starting pitcher has a worse chance than it would with an average (or above-average) pitcher.

To account for pitching, I use a combination of the techniques proposed by FiveThirtyEight and Kenneth Massey: predict the outcome from the two teams' ratings independently of the starters, then factor in the pitching.


First, Log5 is used (as always) to predict the home team's initial win probability. I'll work through an example:

| HOME | AWAY | DATE | TB Rate | TOR Rate | Log5 |
|------|------|------|---------|----------|------|
| TB | TOR | 7/13/2014 | 0.466 | 0.534 | 43.25% |
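
As a rough illustration of this step, here is a minimal sketch of the standard Log5 formula in Python. It assumes the "Rate" columns can be fed directly into Log5 as each team's strength; the function name `log5` is my own and not part of the model. With the rounded ratings from the table it lands slightly below the 43.25% shown, presumably because the model works from unrounded ratings.

```python
def log5(p_home: float, p_away: float) -> float:
    """Log5 estimate of the probability that the home team beats the away team.

    p_home and p_away are the two teams' ratings on a 0-1 scale
    (assumed here to be the "Rate" values from the table).
    """
    home_edge = p_home * (1 - p_away)
    away_edge = p_away * (1 - p_home)
    return home_edge / (home_edge + away_edge)


# TB (home, 0.466) against TOR (away, 0.534)
tb_win_prob = log5(0.466, 0.534)
print(f"{tb_win_prob:.2%}")  # about 43.2% with these rounded inputs
```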

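Next, the win probability is converted to a run line and home-field advantage (0.3 runs) is applied. The run line is the inverse of the standard normal cumulative distribution evaluated at the home team's win probability, multiplied by 3, with the sign set so that a negative number indicates the home team is favored. Subtracting the 0.3 runs of home-field advantage gives the home-adjusted line: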

| Log5 | Line | Line (Home) |
|------|------|-------------|
| 43.25% | 0.51 | 0.21 |

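
The conversion from win probability to run line can be sketched as follows, using only Python's standard library. The constant names (`RUN_SCALE`, `HOME_FIELD`) and function names are mine; the values 3 and 0.3 come from the post, and the sign convention (negative means the home team is favored) is chosen to reproduce the numbers in the table.

```python
from statistics import NormalDist

RUN_SCALE = 3.0   # the "multiplied by 3" factor from the post
HOME_FIELD = 0.3  # home-field advantage in runs, from the post


def run_line(home_win_prob: float) -> float:
    """Neutral-site run line; negative means the home team is favored."""
    return -RUN_SCALE * NormalDist().inv_cdf(home_win_prob)


def home_line(home_win_prob: float) -> float:
    """Run line after crediting the home team with home-field advantage."""
    return run_line(home_win_prob) - HOME_FIELD


print(round(run_line(0.4325), 2))   # about 0.51
print(round(home_line(0.4325), 2))  # about 0.21
```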
With a margin of victory (and the corresponding win probability) in hand, I factor in the starting pitching. In this example, R.A. Dickey starts for TOR and David Price starts for TB; their ERAs are 3.83 and 3.32, respectively. The pitching adjustment is two-thirds of the difference between the two starters' ERAs, added to the home line:

Pitch = 2/3 × (home starter's ERA - away starter's ERA) = 2/3 × (3.32 - 3.83) = -0.34

The 2/3 factor is included because starters generally last about 6 (out of 9) innings. In this example, the pitching factor is 0.34 runs in favor of Tampa Bay. Thus:

| Line (Home) | Pitch | Line | Win % | PICK |
|-------------|-------|------|-------|------|
| 0.21 | -0.34 | -0.13 | 51.73% | TB |
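
The final step can be sketched in Python as below. The pitching formula here is an assumption inferred from the stated 2/3 factor and the 0.34-run result (two-thirds of the ERA difference), and the win probability is recovered by reversing the run-line conversion with the same scale of 3. All function and constant names are my own.

```python
from statistics import NormalDist

RUN_SCALE = 3.0        # same scale used to turn win probability into a run line
STARTER_SHARE = 2 / 3  # starters pitch roughly 6 of 9 innings


def pitch_adjustment(home_era: float, away_era: float) -> float:
    """Run-line shift from the starters; negative favors the home team.

    Assumed form: two-thirds of the ERA difference, inferred from the post.
    """
    return STARTER_SHARE * (home_era - away_era)


def home_win_prob(line: float) -> float:
    """Convert a run line (negative = home favored) back to a win probability."""
    return NormalDist().cdf(-line / RUN_SCALE)


pitch = pitch_adjustment(home_era=3.32, away_era=3.83)  # Price vs. Dickey
line = 0.21 + pitch                                     # home line plus pitching
prob = home_win_prob(line)
pick = "TB" if prob > 0.5 else "TOR"
print(round(pitch, 2), round(line, 2), f"{prob:.2%}", pick)
```

With the example's inputs this gives a pitching shift of about -0.34 runs, a final line of about -0.13, and a Tampa Bay win probability just under 52%, in line with the table.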

Including the Dickey-Price matchup, together with the Rays' home-field advantage, flips the pick in this case: Log5 alone gave Tampa Bay a 43.25% chance, while the adjusted line makes the Rays slight favorites at 51.73%.
