Probabilis: November 2013

Background

For the entirety of this college football season, I've been predicting games using an aggregate of multiple mathematical models, including Jeff Sagarin's, Kenneth Massey's, and many more (there are at least 70 to choose from). This creatively named "Aggregate Model" has performed very well over the course of the season, picking straight up (SU) winners correctly 78.70% of the time and against the spread (ATS) winners 55.29% of the time. That's good enough for 19th out of 71 models listed by Prediction Tracker SU, and 8th ATS. However, none of this is original: I'm simply combining other people's models, on the principle that "the whole is greater than the sum of the parts." I keep getting asked, "Why don't you build your own model?" So I did.

My Mathematical Sports Model – The MDS Model

I built the model with the help of some friends, hence its name: the MDS Model, for the trio of my roommates and me: Mickey, Dylan, and Stock (conveniently, all three of us are also Mathematical Decision Sciences (MDS) majors at UNC-Chapel Hill).

The Short Version

There are two components to the MDS Model, plus a third rating that combines them:

• The retrodictive (meaning it explains past performance) Matrix Model only takes into account wins and losses and who you've played. This model is intended to be analogous to the computer models used in the BCS formula, as they are not allowed to take into account margin of victory.

• The predictive Pyth Model uses an adjusted Pythagorean expectation based on each team's points for and points allowed. My idea is that points scored against a bad team (say, Southern Miss) should not be weighted as heavily as points scored against a good team (such as Alabama), and thus each game's outcome is adjusted based on the opponent's rating calculated from the Matrix Model.

• The third overall Composite Rating is then "a synthesis of the two diametrical opposites" (to quote Jeff Sagarin) that gives the overall rankings of all teams.

I use the Pyth Model to make predictions, and through one full week of predicting winners and spreads, it has picked SU winners correctly 73.58% of the time and ATS winners 52.94% of the time. For comparison, this would rank 35th out of 71 models SU and 56th ATS for last week.

More impressive, however, is its performance in last year's bowl games (using data from the 2012 regular season/conference championship games). The MDS Model (predicting using the Pyth ratings) would have placed 9th out of 72 models SU and 21st ATS.

To see this week's rankings for the Matrix Model, the Pyth Model, and the Composite Model, click here!

One cool feature I've implemented is the "Simulator" tab: simply change the cells underneath Team A and Team B and choose whether the game is at a neutral site or not, and it returns the win probability and predicted spread for the game.

The Long Version – Methodology (with lots of math)

Matrix Model:

All credit goes to Mick for the original idea for this one, with input from Dylan (Mick laid the groundwork, Dylan punched holes in it, and then I came up with solutions). I put every FBS team (FCS teams are not included) into a 125 by 125 matrix. Each team has its own row, and each of its wins is recorded in the column corresponding to the losing team. For example, the entry:

(Team A, Team B) = 1 if Team A beat Team B; else 0

Each game is recorded in this Win Matrix, and then a Loss Matrix is formed by taking the transpose of the Win Matrix:

Loss = Win^T

However, because games against FCS teams are not taken into account, teams don't all play the same number of games, so both matrices have to be adjusted based on each team's number of games played. This results in an Augmented Win Matrix and an Augmented Loss Matrix, where each entry is formed like so:

Aug Matrix entry = Matrix entry/# of games played
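
The construction above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the spreadsheet/MATLAB code behind the model: the four teams, their results, and all variable names are hypothetical, and every team is assumed to have played at least one game (otherwise the division would fail).

```python
import numpy as np

# Hypothetical toy results as (winner, loser) pairs; not real data.
teams = ["A", "B", "C", "D"]
results = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("D", "B")]

index = {name: i for i, name in enumerate(teams)}
n = len(teams)

# Win Matrix: entry (i, j) is 1 if team i beat team j, else 0.
win = np.zeros((n, n))
for winner, loser in results:
    win[index[winner], index[loser]] = 1

# Loss Matrix is the transpose of the Win Matrix.
loss = win.T

# Games played per team = wins + losses.
games_played = win.sum(axis=1) + loss.sum(axis=1)

# Augmented matrices: divide each team's row by its games played.
aug_win = win / games_played[:, None]
aug_loss = loss / games_played[:, None]
```

In this sketch each row of `aug_win` sums to that team's plain win percentage, which is the starting point the later steps refine.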

In order to rate each team relative to other teams they haven't directly played, each matrix is then raised to a high power (I use 20). This achieves the following:

• It connects all the teams in a graph network

• It effectively simulates the season as if it has been played x additional times just as it has already been played (in this case, x = 20)

Each powered augmented matrix is then multiplied by a 125 by 1 matrix of ones (which sums each team's row). These row sums are then themselves summed, and each team's row sum is divided by this overall total. This normalized rating is then multiplied by the number of teams (in this case 125) and a constant C that the ratings will be centered around. I use .15 so that all ratings will be between 0 and 1 (we will see why later).

Team Rating = Sum(Team's Row of Aug Matrix^20)/Sum(All Teams' Row Sums) * (125*C)

C = .15

0 ≤ Team Rating ≤ 1
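
As a rough sketch of the powering and normalization steps, assuming NumPy in place of MATLAB: the function name `matrix_ratings` and the 4-team augmented matrix below are hypothetical and chosen only for illustration. The toy results contain cycles (A beat B, B beat C, C beat A) on purpose, because with no cycles at all the powered matrix would be all zeros and the normalization would divide by zero.

```python
import numpy as np

def matrix_ratings(aug, power=20, c=0.15):
    """Turn an augmented matrix into ratings whose average is c."""
    n = aug.shape[0]
    # Raise the matrix to a high power to link teams that never met.
    powered = np.linalg.matrix_power(aug, power)
    # Multiplying by a column of ones sums each team's row.
    row_sums = powered @ np.ones(n)
    # Normalize by the overall total, then scale by (number of teams * c).
    return row_sums / row_sums.sum() * (n * c)

# Hypothetical augmented win matrix for teams A, B, C, D
# (A beat B and D; B beat C; C beat A; D beat C).
aug_win = np.array([
    [0,   1/3, 0,   1/3],
    [0,   0,   1/2, 0  ],
    [1/3, 0,   0,   0  ],
    [0,   0,   1/2, 0  ],
])

ratings = matrix_ratings(aug_win)
```

The scaling guarantees only that the ratings average out to C; keeping every rating below 1 depends on choosing C small enough for the data at hand.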

This results in a representation of a more "correct" win percentage. The loss percentage is equal to 1 - Loss Matrix Rating.

For the overall Matrix ratings:

Matrix Rating = Weighted Average(1*W Matrix Rating, 2*L Matrix Rating)

This is designed to be analogous to the BCS computer ratings, in which undefeated teams are favored over teams with one or more losses.
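
One way to read the weighted average above, written out in Python. This sketch assumes that the "L Matrix Rating" in the formula is the complemented value (1 - Loss Matrix Rating) described earlier, so that a higher number is better on both sides; the function and parameter names are hypothetical.

```python
def matrix_rating(win_rating, loss_matrix_rating, w_weight=1.0, l_weight=2.0):
    """Weighted average of the win rating and the complemented loss rating."""
    # Assumption: the loss side enters as 1 - Loss Matrix Rating.
    loss_rating = 1.0 - loss_matrix_rating
    return (w_weight * win_rating + l_weight * loss_rating) / (w_weight + l_weight)
```

With weights of 1 and 2, the loss side carries two thirds of the rating, which is what pushes undefeated teams ahead of teams with otherwise similar win ratings.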

Pyth Model:

Each game's score is then adjusted based on the opposing team's Matrix rating:

A Adj Score = A Score * Opponent Rating

B Adj Score = B Score * .5

Team A's points are adjusted based on the opponent's rating, while the points of Team B (the opponent) are adjusted under the assumption that Team A is an "average" team with a rating of .500. To "smooth out" errors in the original Matrix ratings, the opponent ratings (as well as the default .5 rating) are raised to the 1/2 power before multiplying, which diminishes strength of schedule but still takes it into account. The higher this power, the more weight is given to the strength of your opponent.

These adjusted scores are then used to create the Pyth ratings (for NCAAF I use the exponent of 2.15):

Pyth = (Adj Points For)^2.15/((Adj Points For)^2.15+(Adj Points Against)^2.15)
0 ≤ Pyth ≤ 1
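
A minimal Python sketch of the Pyth calculation follows. It assumes that adjusted points are summed over all of a team's games before the exponent is applied (the text does not spell out the aggregation), and the function name, argument layout, and sample scores are hypothetical.

```python
def pyth_rating(games, exponent=2.15, smoothing=0.5, average=0.5):
    """games: list of (points_for, points_against, opponent_matrix_rating)."""
    # Points scored are scaled by the opponent's smoothed Matrix rating.
    adj_for = sum(pf * opp ** smoothing for pf, _, opp in games)
    # Points allowed are scaled as if this team were an average (.5) team.
    adj_against = sum(pa * average ** smoothing for _, pa, _ in games)
    # Pythagorean expectation on the adjusted totals.
    return adj_for ** exponent / (adj_for ** exponent + adj_against ** exponent)

# Same hypothetical 35-14 score against a weak and a strong opponent.
versus_weak = pyth_rating([(35, 14, 0.25)])
versus_strong = pyth_rating([(35, 14, 0.60)])
```

The same scoreline earns a higher rating against the stronger opponent, which is the behavior the adjustment is meant to produce. A team with no points scored or allowed at all would need special handling, since the denominator would be zero.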

The overall Composite rating is "a synthesis of the two diametrical opposites", and is formed as follows:

Composite = Weighted Average(1*Matrix, φ*Pyth)

φ = (1+√5)/2, i.e. the Golden Ratio

0 ≤ Composite ≤ 1
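
Written out, the Composite weighting looks like the short Python sketch below (names are hypothetical). Because 1 + φ = φ², the Pyth rating ends up with a weight of 1/φ, or roughly 61.8% of the Composite.

```python
PHI = (1 + 5 ** 0.5) / 2  # the Golden Ratio

def composite_rating(matrix, pyth):
    """Weighted average with weight 1 on Matrix and PHI on Pyth."""
    return (matrix + PHI * pyth) / (1 + PHI)
```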

Predictions:

All ratings can be used to make predictions since they are between 0 and 1, although the only model designed to be predictive is the Pyth Model. To make probabilistic predictions I use Log5:

P(A beats B) = (A - A*B)/(A + B - 2*A*B), where A and B are the two teams' ratings

In order to predict spreads, I then take this calculated win probability, apply the inverse of the standard normal cumulative distribution function to it, and multiply the result by 13.5 (which I have found is the best fit for NCAAF):

P Spread = NormDistInv(Win %)*13.5

For home field advantage (HFA), I use 3.5 for NCAAF. Thus, from each team's Pyth rating, I can calculate the probability one team beats another and by how much, given that the game is played at either a neutral site or at one team's home field.
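
The prediction step can be sketched in Python using only the standard library. The Log5 function below is the standard Bill James formula; how home field advantage enters is an assumption here (it is simply added, in points, to the home team's spread), and the function names are hypothetical.

```python
from statistics import NormalDist

def log5(p_a, p_b):
    """Probability that a team rated p_a beats a team rated p_b (Log5)."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)

def predicted_spread(win_prob, scale=13.5, hfa=0.0):
    """Points by which the team with win_prob is favored.

    Assumption: hfa is added directly to the spread when this team is
    at home; pass 0 for a neutral site.
    """
    # inv_cdf is the inverse of the standard normal CDF (NormDistInv).
    return NormalDist().inv_cdf(win_prob) * scale + hfa

# Hypothetical Pyth ratings for a home team and a visitor.
prob = log5(0.80, 0.55)
spread = predicted_spread(prob, hfa=3.5)
```

Note that `inv_cdf` only accepts probabilities strictly between 0 and 1, and Log5 is undefined when both ratings are 0 or both are 1, so ratings at the extremes need to be handled separately.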

Tools Used

The matrices, augmented matrices, and all teams' schedules/results are updated automatically through code implemented in Google Docs, which pulls data from ESPN. Thus, the model is Bayesian in spirit, in the loose sense that it updates when new information is acquired. I also use MATLAB for the necessary matrix calculations (taking the matrices to a high power).

Limitations

The MDS Model is by no means perfect, as no model is (it's also my first crack at building my own). Thus, it has some drawbacks:

• Players missing playing time cannot be accounted for. Since the model is entirely mathematical and based on a team's holistic performance, there is no way to account for injuries, suspensions, etc. to a specific player.

• The "Predicted Spread" breaks down at the extremes past ~28 points. For example, Idaho (ranked 120th out of 125 FBS teams) played at FSU (ranked 1st) and in the MDS Model, FSU was favored by 38.97 (including the HFA of 3.5). By contrast, the Vegas spread was set at 57 (and FSU easily covered).

• FCS teams are (currently) not included.

• The Matrix Model tends to favor teams from "average" conferences with lots of teams close to .500 (such as the ACC), since the following expression reaches its maximum at a percentage of .5:

p(1-p); 0 ≤ p ≤ 1

Ex: .5(1-.5) = .25, .6(1-.6) = .24, .7(1-.7) = .21, etc

Performance (Results)

The MDS Model has only been in action for one full week of college football for this season, and in that week performed as follows:

• SU (Straight Up): 73.58%, 35th/71 models tracked by Prediction Tracker (this model being the 71st)

• ATS (Against the Spread) 52.94%, 56th/71 models tracked by Prediction Tracker

For the 2012 Bowl Season:

• SU: 68.57%, 9th/72 models tracked by Prediction Tracker (this model being the 72nd)

• ATS: 51.52%, 21st/72 models tracked by Prediction Tracker

In the BCS formula, the top and bottom computer rankings for each team are dropped. The Matrix Model is designed to be analogous to these BCS computer models, and since there are 6 models (and mine would make 7), any one model has a 2 in 7 chance of being dropped for a given team. Assuming this is counted over the 25 teams in the BCS standings, my model should be dropped 25 × 2/7 ≈ 7.14 times on average. Mine would have been dropped 9 times at the end of last season (after the conference championship games but before the bowls).

Conclusion

One notable feature of the MDS Model is that it can be applied to any sport; all that needs to be changed are certain parameters (except some additional work would need to be done for soccer). I plan to continue testing its predictions throughout this NCAAF season, and hope to soon build it for NCAAB!

Current Rankings/Simulator

Ratings Archive

Wednesday, November 27, 2013

The MDS Model