Probabilis: 2013

Wednesday, November 27, 2013

The MDS Model

Background

For the entirety of this college football season, I've been predicting games using an aggregate of multiple mathematical models, including Jeff Sagarin's, Kenneth Massey's, and many more (there are at least 70 to choose from). This creatively named "Aggregate Model" has performed very well over the course of the season, picking straight up (SU) winners correctly 78.70% of the time, and against the spread (ATS) winners 55.29% of the time. That's good enough for 19th out of 71 models listed by Prediction Tracker for SU, and 8th ATS. However, none of this is original: I'm simply combining other people's models (i.e. "the whole is greater than the sum of the parts"). I keep getting asked, "why don't you build your own model?" So I did.

My Mathematical Sports Model – The MDS Model

I built the model with the help of some friends, and thus its name: the MDS Model, for the trio of myself and my roommates: Mickey, Dylan, and Stock (and all three of us are Mathematical Decision Sciences (MDS) majors at UNC-Chapel Hill).

The Short Version

There are two component models in the MDS Model, plus a composite of the two:

• The retrodictive (meaning it explains past performance) Matrix Model only takes into account wins and losses and who you've played. This model is intended to be analogous to the computer models used in the BCS formula, as they are not allowed to take into account margin of victory.

• The predictive Pyth Model uses an adjusted pythagorean expectation based on each team's points for and points allowed. My idea is that the points scored against a bad team (say, Southern Miss) should not be weighted as heavily as the points scored against a good team (such as Alabama), and thus each game's outcome is adjusted based on the opponent's rating calculated from the Matrix Model.

• The third overall Composite Rating is then "a synthesis of the two diametrical opposites" (to quote Jeff Sagarin) that gives the overall rankings of all teams.

I use the Pyth Model to make predictions, and through one full week of predicting winners and spreads, it has picked SU winners correctly 73.58% of the time, and ATS 52.94% of the time. For comparison, this would rank 35th out of 71 models SU, and 56th ATS for last week (see all).

More impressive, however, is its performance in last year's bowl games (using data from the 2012 regular season/conference championship games). The MDS Model (predicting using the Pyth ratings) would have placed 9th out of 72 models SU and 21st ATS.

To see this week's rankings for the Matrix Model, the Pyth Model, and the Composite Model, click here!

One cool feature I've implemented is the "Simulator" tab: simply change the cells underneath Team A and Team B and choose whether the game is at a neutral site or not, and it returns the win probability and predicted spread for the game.

The Long Version – Methodology (with lots of math)

Matrix Model:

All credit goes to Mick for the original idea for this one, with input from Dylan (Mick laid the groundwork, Dylan punched holes in it, and then I came up with solutions). I put every FBS team (FCS teams are not included) into a 125 by 125 matrix, with one row and one column per team. Each team's wins are recorded in its row, in the column of the team it beat. For example, the entry:

(Team A, Team B) = 1 if Team A beat Team B; else 0

Each game is recorded in this Win Matrix, and then a Loss Matrix is formed by taking the transpose of the Win Matrix:

Loss = Win^T

However, since each team doesn't play the same number of games (FCS games are not taken into account), both matrices have to be adjusted based on each team's number of games played. This results in an Augmented Win Matrix and an Augmented Loss Matrix, where each entry is formed like so:

Aug Matrix entry = Matrix entry/# of games played
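As a rough illustration of the three steps above, here is a minimal sketch in plain Python. The four teams and their results are made up, and dividing each row by that row's team's games played is an assumption (the formula above does not say which team's game count is used).

```python
# Hypothetical four-team example: each tuple is (winner, loser).
teams = ["A", "B", "C", "D"]
games = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "A")]

index = {name: i for i, name in enumerate(teams)}
n = len(teams)

# Win Matrix: entry (i, j) is 1 if team i beat team j, else 0.
win = [[0] * n for _ in range(n)]
for winner, loser in games:
    win[index[winner]][index[loser]] = 1

# Loss Matrix: the transpose of the Win Matrix.
loss = [list(column) for column in zip(*win)]

# Games played by each team = its wins plus its losses.
games_played = [sum(win[i]) + sum(loss[i]) for i in range(n)]

# Augmented matrices: each row divided by that team's games played
# (assumed interpretation of "Matrix entry/# of games played").
aug_win = [[win[i][j] / games_played[i] for j in range(n)] for i in range(n)]
aug_loss = [[loss[i][j] / games_played[i] for j in range(n)] for i in range(n)]
```

In this toy data, team A has played three games, so each of its two wins shows up as 1/3 in its row of the augmented win matrix.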

In order to rate each team relative to other teams they haven't directly played, each matrix is then raised to a high power (I use 20). This achieves the following:

• It connects all the teams in a graph network

• It effectively simulates the season as if it has been played x additional times just as it has already been played (in this case, x = 20)

Each of the resulting matrices is then multiplied by a 125 by 1 matrix of ones (which sums each team's row). These row sums are then themselves summed, and each individual row sum is divided by this overall sum. This new rating is then multiplied by the number of teams (in this case 125) and a constant that the ratings will be centered around. I use .15 so that all ratings will be between 0 and 1 (we will see why later).

Team Rating = Sum(Row of Team's Aug Matrix)/Sum(All Row Sums) * (125*C)

C = .15

0 ≤ Team Rating ≤ 1

This results in a representation of a more "correct" win percentage. The loss percentage is equal to 1 - Loss Matrix Rating.
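The power-and-normalize step can be sketched as follows. This is an illustration only, not the Matlab code used for the real ratings: the helper names and the small 4 by 4 augmented matrix are hypothetical, and the example matrix was chosen to contain cycles (A beat B, B beat D, D beat A, and so on) so that its 20th power is not all zeros.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, power):
    """Raise a square matrix to a positive integer power."""
    result = m
    for _ in range(power - 1):
        result = mat_mul(result, m)
    return result

def matrix_ratings(aug, power=20, center=0.15):
    """Row sums of aug**power, rescaled so the ratings average to `center`."""
    n = len(aug)
    powered = mat_pow(aug, power)
    # Summing each row is the same as multiplying by a column of ones.
    row_sums = [sum(row) for row in powered]
    total = sum(row_sums)
    return [s / total * n * center for s in row_sums]

# Hypothetical augmented win matrix: four teams, three games each.
third = 1 / 3
aug_win = [
    [0, third, third, 0],   # A beat B and C
    [0, 0, third, third],   # B beat C and D
    [0, 0, 0, third],       # C beat D
    [third, 0, 0, 0],       # D beat A
]
ratings = matrix_ratings(aug_win)
```

By construction the ratings sum to the number of teams times C, so their average is C.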

For the overall Matrix ratings:

Matrix Rating = Weighted Average(1*W Matrix Rating, 2*L Matrix Rating)

This is designed to be analogous to the BCS computer ratings, in which undefeated teams are favored over teams with one or more losses.
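Written out, the weighted average looks like the following. This assumes the loss-side input is the "loss percentage" described above (1 minus the Loss Matrix Rating); the numbers in the example are made up.

```latex
\text{Matrix Rating} = \frac{1 \cdot W + 2 \cdot L}{1 + 2},
\qquad L = 1 - \text{Loss Matrix Rating}

% Hypothetical example: W = 0.60, L = 0.90
\frac{1(0.60) + 2(0.90)}{3} = 0.80
```

Because the loss side carries twice the weight, a team's rating is pulled more by how few (and how forgivable) its losses are than by its wins.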

Pyth Model:

Each game's score is then adjusted based on the opposing team's Matrix rating:

A Adj Score = A Score * Opponent Rating

B Adj Score = B Score * .5

Team A's points are adjusted based on its opponent's rating, while the points of Team B (the opponent) are adjusted under the assumption that Team A is an "average" team with a rating of .500. To "smooth out" errors in the original Matrix ratings, the opponent ratings (as well as the default .5 rating) are raised to the 1/2 power, which diminishes the effect of strength of schedule but still takes it into account. The higher this power, the more weight is given to the strength of your opponent.

These adjusted scores are then used to create the Pyth ratings (for NCAAF I use the exponent of 2.15):

Pyth = (Adj Points For)^2.15/((Adj Points For)^2.15+(Adj Points Against)^2.15)
0 ≤ Pyth ≤ 1
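A minimal sketch of the adjustment and the Pyth formula is below. The function names are hypothetical, and summing the adjusted points over all of a team's games before applying the formula is an assumption; the post does not spell out how games are combined.

```python
def adjusted_points(games, smoothing=0.5, average_rating=0.5):
    """Return (adjusted points for, adjusted points against).

    games: list of (points_for, points_against, opponent_matrix_rating).
    Points scored are scaled by the opponent's rating, points allowed by
    the default "average" rating, both raised to the smoothing power.
    """
    adj_for = sum(pf * opp ** smoothing for pf, _, opp in games)
    adj_against = sum(pa * average_rating ** smoothing for _, pa, _ in games)
    return adj_for, adj_against

def pyth(adj_for, adj_against, exponent=2.15):
    """Pythagorean expectation with the NCAAF exponent of 2.15."""
    return adj_for ** exponent / (adj_for ** exponent + adj_against ** exponent)

# Hypothetical team: a 28-14 win over an average (.500) opponent.
rating = pyth(*adjusted_points([(28, 14, 0.5)]))
```

Against an exactly average opponent both scale factors are equal and cancel, so the rating reduces to the unadjusted Pythagorean expectation for a 28-14 score.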

The overall Composite rating is "a synthesis of the two diametrical opposites", and is formed as follows:

Composite = Weighted Average(1*Matrix, φ*Pyth)

φ = (1+√5)/2, i.e. the Golden Ratio

0 ≤ Composite ≤ 1
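The Composite formula above translates directly into a few lines of Python. The function name is hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the Golden Ratio, about 1.618

def composite(matrix_rating, pyth_rating):
    """Weighted average with weight 1 on Matrix and weight PHI on Pyth."""
    return (1 * matrix_rating + PHI * pyth_rating) / (1 + PHI)
```

With these weights the Pyth rating contributes about 61.8% of the Composite and the Matrix rating about 38.2%.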

Predictions:

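All ratings can be used to make predictions since they are between 0 and 1, although the only model designed to be predictive is the Pyth Model. To make probabilistic predictions I use the Log5 formula, which takes the two teams' ratings and returns the probability that one beats the other.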
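For reference, the standard form of Log5 (due to Bill James) is sketched below. The post's own formula is not reproduced here, so treat this as the textbook version rather than the exact spreadsheet implementation; the function name is hypothetical.

```python
def log5(p_a, p_b):
    """Probability that team A beats team B, given each team's rating
    (interpreted as a winning percentage against an average opponent).

    Undefined when both ratings are 0 or both are 1.
    """
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)
```

Two sanity checks: evenly rated teams come out at 50%, and a team facing a .500 opponent gets its own rating back.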

In order to predict spreads, I then take this calculated win probability, apply the inverse of the standard normal cumulative distribution function, and multiply the result by 13.5 (which I have found is the best fit for NCAAF):

P Spread = NormDistInv(Win %)*13.5

For home field advantage (HFA), I use 3.5 for NCAAF. Thus, from each team's Pyth rating, I can calculate the probability one team beats another and by how much, given that the game is played at either a neutral site or at one team's home field.
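The spread calculation can be sketched with Python's standard library, where `statistics.NormalDist().inv_cdf` plays the role of NormDistInv. Adding the home field advantage directly to the spread in points is an assumption about how it is applied; the function name is hypothetical.

```python
from statistics import NormalDist

def predicted_spread(win_prob, scale=13.5, home_edge=0.0):
    """Predicted margin in points for the team with the given win probability.

    win_prob must be strictly between 0 and 1.
    home_edge: points added for home field (3.5 at home, 0 at a neutral site).
    """
    return NormalDist().inv_cdf(win_prob) * scale + home_edge

neutral = predicted_spread(0.75)
at_home = predicted_spread(0.75, home_edge=3.5)
```

A 50% win probability maps to a spread of zero on a neutral field, and the spread grows without bound as the probability approaches 1.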

Tools Used

The matrices, augmented matrices, and all teams' schedules/results are updated automatically through code implemented in Google Docs, which pulls data from ESPN. Thus, the model is inherently Bayesian, as it updates when new information is acquired. I also use Matlab for the necessary matrix calculations (taking the matrices to a high power).

Limitations

The MDS Model is by no means perfect, as no model is (it's also my first crack at building my own). Thus, it has some drawbacks:

• Players missing playing time cannot be accounted for. Since the model is entirely mathematical and based on a team's holistic performance, there is no way to account for injuries, suspensions, etc. to a specific player.

• The "Predicted Spread" breaks down at the extremes past ~28 points. For example, Idaho (ranked 120th out of 125 FBS teams) played at FSU (ranked 1st) and in the MDS Model, FSU was favored by 38.97 (including the HFA of 3.5). By contrast, the Vegas spread was set at 57 (and FSU easily covered).

• FCS teams are (currently) not included.

• The Matrix Model tends to favor teams from "average" conferences with lots of teams close to .500 (such as the ACC), since a winning percentage of .5 maximizes the following expression:

p(1-p); 0 ≤ p ≤ 1

Ex: .5(1-.5) = .25, .6(1-.6) = .24, .7(1-.7) = .21, etc
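The claim that p(1-p) peaks at .5 follows from a one-line calculus check:

```latex
f(p) = p(1-p) = p - p^2, \qquad
f'(p) = 1 - 2p = 0 \;\Rightarrow\; p = \tfrac{1}{2}, \qquad
f''(p) = -2 < 0, \qquad
f\!\left(\tfrac{1}{2}\right) = \tfrac{1}{4}
```

Since the second derivative is negative everywhere, p = .5 is the only maximum on the interval from 0 to 1.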

Performance (Results)

The MDS Model has only been in action for one full week of college football for this season, and in that week performed as follows:

• SU (Straight Up): 73.58%, 35th/71 models tracked by Prediction Tracker (this model being the 71st)

• ATS (Against the Spread): 52.94%, 56th/71 models tracked by Prediction Tracker

For the 2012 Bowl Season:

• SU: 68.57%, 9th/72 models tracked by Prediction Tracker (this model being the 72nd)

• ATS: 51.52%, 21st/72 models tracked by Prediction Tracker

In the BCS formula, the top and bottom computer rankings for each team are dropped. The Matrix Model is designed to be analogous to these BCS computer models, and since there are 6 models (and mine would make 7), on average my model should be dropped 7.14 times. Mine would have been dropped 9 times at the end of last season (after the conference championship games but before the bowls).
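The arithmetic behind the 7.14 figure is not shown in the post. It is consistent with assuming 25 ranked teams, each of which has 2 of its 7 computer rankings (the top and the bottom) dropped, with every model equally likely to be one of the two:

```latex
25 \times \frac{2}{7} \approx 7.14
```

Under that assumption, being dropped 9 times is somewhat more often than a typical model would be.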

Conclusion

One notable feature of the MDS Model is that it can be applied to any sport; all that needs to be changed are certain parameters (although some additional work would be needed for soccer). I plan to continue testing its predictions throughout this NCAAF season, and hope to soon build it for NCAAB!

Current Rankings/Simulator

Ratings Archive

Monday, September 30, 2013

Post-ECU season outlook, and why it's worse than you think

There are a number of reasons why this past Saturday's loss to ECU was very, very bad and has a HUGE impact on our outlook going forward:

• ECU isn't very good
• It was at home
• It was an absolute blowout
• The Bayesian network of teams is now complete

The last point is the biggest. What I mean by that is that after Week 5, there are no (or very few) pairs of teams in FBS that are more than 5 degrees of separation apart in a graph network containing all FBS teams. For example, UNC is 3 degrees of separation away from Florida. UNC plays NC State (1), NC State plays FSU (2), and FSU plays Florida (3).
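Degrees of separation in a schedule graph can be computed with a breadth-first search. The sketch below uses only the three games named in the example above; the function name is hypothetical, and with the full FBS schedule the shortest path between two teams could be shorter than the one shown.

```python
from collections import deque

def degrees_of_separation(schedule, start, goal):
    """Fewest games linking two teams, or None if they are not connected.

    schedule: list of (team, opponent) pairs; games are treated as undirected.
    """
    graph = {}
    for a, b in schedule:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        team, distance = queue.popleft()
        if team == goal:
            return distance
        for opponent in graph[team]:
            if opponent not in seen:
                seen.add(opponent)
                queue.append((opponent, distance + 1))
    return None

schedule = [("UNC", "NC State"), ("NC State", "FSU"), ("FSU", "Florida")]
print(degrees_of_separation(schedule, "UNC", "Florida"))  # prints 3
```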

Thus, the rankings given by Massey, Sagarin, and others should now be fairly accurate (before Week 5 they were heavily dependent upon data from last season).

After the ECU game, we shed over 10 points in Sagarin's Predictor rating, and fell 30 spots in Massey's overall rankings. The repercussions are as follows:

• 65.877% chance (per Massey) we win 5 or fewer games and don't make a bowl game
• Expected number of wins is now at 4.93 (had been around 7 to 7.5 thus far)
• Only favored now vs Boston College, vs Virginia, vs Old Dominion, and vs Duke (previously also favored vs Miami, at NC State, and at Pitt)
• We're staring directly in the face of a 1-5 start. To put that quantitatively, 60.37% chance we're 1-5 after the Miami game

It may be just one game, but it matters A LOT for projecting the rest of the season.

Monday, September 2, 2013

UNC's Season Outlook

Using Jeff Sagarin's USA Today ratings and Kenneth Massey's ratings, I've projected out the rest of Carolina's season. Using the preseason rankings from both, things currently stand as follows:

Expected record: between 7-5 and 8-4

Games in which UNC will not be favored: @ Ga Tech (-2.5), @ Va Tech (-6), vs Miami (-0.5), @ NC St (-1.5)

Thursday, August 29, 2013

UNC vs South Carolina

Our chances tonight, per my models:

UNC victory: 16.80%
South Car victory: 83.20%

Average margin of defeat: 14.58

Saturday, July 27, 2013

New blog launched devoted to sports picks

I've decided to keep this blog entirely devoted to sports and statistics, and have now launched Probabilis Sports Picks where I'll post my daily picks.

Tuesday, July 23, 2013

The NL West is mediocre: and yet is actually overperforming

The NL West is currently the weakest division in baseball, with the Dodgers leading the division with a record of 51-47, only 4 games above .500. The worst part about the current state of things in the West is that 4 of its 5 teams are actually overperforming their expected records. I've reached this conclusion based on the Pythagorean Expectation (originally derived by Bill James) for each team, calculated from its runs scored and runs allowed. If each team were playing to its expected winning percentage, the standings would currently look like this:

1. Arizona Diamondbacks 50-49 .505 (+1)
T2. Colorado Rockies 50-50 .500 (-2)
T2. Los Angeles Dodgers 49-49 .500 (+2)
4. San Francisco Giants 44-54 .449 (+1)
5. San Diego Padres 43-57 .430 (+1)

As shown above, the Rockies have underperformed by 2 wins, while each of the other 4 teams has slightly overperformed its expected record. As a whole, the division has won 3 more games than it should have, and has still only reached its current level of mediocrity. If things had played out strictly by the numbers to date, the Diamondbacks would be in first place and primed for the lone playoff spot out of the division with a record 1 game over .500.
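For readers unfamiliar with it, Bill James's Pythagorean Expectation can be sketched as below. The exponent of 2 is his original choice (other exponents are also in use), and the run totals in the example are made up, since the post does not list each team's runs.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def expected_wins(runs_scored, runs_allowed, games_played, exponent=2):
    """Expected wins so far, rounded to the nearest whole game."""
    return round(pythagorean_pct(runs_scored, runs_allowed, exponent) * games_played)

# Hypothetical team: 400 runs scored, 420 allowed, 98 games played.
wins = expected_wins(400, 420, 98)
luck = 51 - wins  # actual wins minus expected wins, for a made-up 51-win team
```

The number in parentheses in the standings above is this kind of difference: actual wins minus expected wins.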

Monday, July 15, 2013

Adjusted conference rankings for the 2013 college football season

The latest round of conference realignment is in full effect, and so my roommate at UNC used Jeff Sagarin's NCAAF ratings from 2012 to calculate the new conference standings (using the "central mean" technique) for this upcoming season. Here is his ESPN page for citation (literally his only presence on the Internet). Let's take a look (2012 rank in parentheses):

2013 NCAAF Conference Rankings

1. SEC (1)

2. Big 12 (2)

3. Pac 12 (3)

4. Big 10 (4)

5. ACC (6)

6. AAC (formerly Big East) (5)

7. MWC (11)

8. MAC (9)

9. Sun Belt (8)

10. C-USA (10)

Keep in mind that the WAC no longer exists; they were ranked 7th at the end of last season.

Sunday, July 14, 2013

How crowd sourcing on STFC picks college basketball spreads well enough to turn a profit

When I analyzed the 2013 picks on Streak for the Cash, I not only categorized picks by sport and league, but also took game picks from those categories and split them again into the two most common gambling options: straight up (SU) and against the spread (ATS). I was mostly interested in picks against the spread, since any strategy that picks correctly more than 52.38% of the time will beat the typical Vegas line of -110. All categories against the spread actually did worse than 50%, except one: college basketball. The overall record for these picks was 98-81 (54.75%), well above the expected result of 50% and also above the critical 52.38% threshold. As before, the more confident the pick, the better the results: sides with 75-100% of picks went 72-30, winning a whopping 70.59% of the time.

It should be noted that the data I collected only runs from January 1 on, so it only includes half of the season. However, had you placed a $110 bet on every consensus NCAAB ATS pick on STFC from January 1 to the end of the season, you would have made an $890 profit, a 4.52% return. Had you chosen to limit your bets to the more confident 75-100% consensus picks, you would have made a $3900 profit, a massive 34.76% return.
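The break-even threshold and the two returns quoted above can be reproduced with a few lines of Python. The function names are hypothetical; the sketch assumes every bet risks $110 to win $100 (a -110 line) and ignores pushes.

```python
def breakeven_pct(risk=110, to_win=100):
    """Win rate needed to break even when risking `risk` to win `to_win`."""
    return risk / (risk + to_win)

def flat_bet_result(wins, losses, risk=110, to_win=100):
    """Return (profit in dollars, return on total amount wagered)."""
    profit = wins * to_win - losses * risk
    wagered = (wins + losses) * risk
    return profit, profit / wagered

all_picks = flat_bet_result(98, 81)   # every consensus NCAAB ATS pick
confident = flat_bet_result(72, 30)   # 75-100% consensus picks only

print(f"break-even: {breakeven_pct():.2%}")
print(f"all picks: ${all_picks[0]}, {all_picks[1]:.2%}")
print(f"confident picks: ${confident[0]}, {confident[1]:.2%}")
```

The return is measured against the total amount wagered (179 bets of $110 in the first case), which is why a 54.75% win rate turns into only a 4.52% return.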

STFC and "sheep" pickers: why going chalk is actually the best strategy for most wins in a month

On Streak for the Cash, many players consider following the "sheep" pickers (those blindly following the favorite with the majority of the picks) a bad strategy. But are favorites really overfavored by STFC players? This premise depends upon the reasoning behind what percentage should back each side of a prop: should the percentages accurately reflect the probability of each respective side winning, or should the larger percentage simply take the favorite? In other words, if the favorite wins 70% of the time, should 70% of picks back the favorite? Since the goal of STFC is to simply pick winners (and you don't have to consider value or losing money), the best strategy is to maximize your expected number of wins. With this in mind, the "sheep" are actually playing the ideal strategy: even if one side's chances are only slightly above 50%, they're the better pick.

This theory is backed by the numbers: I analyzed 6,799 STFC picks between 2010 and 2011, and the favorite won 54.54% of the time. When I broke down these picks by sport/league, only in one instance did the favorite have a losing record: 9-13 (40.91%) in Auto Racing. In every other category, the favorite "sheep" pick won more often than the underdog. I also analyzed 1,649 picks from 2013, and this trend continued: 52.82% of the 2013 favorites have been winners. And the more confident the pick, the better the result: for all 2013 props with 75-100% of the picks backing one side, 56.22% of these favorites have won, compared to sides with 60-75% of the picks (46.29% correct) and 50-60% (47.09% correct).

Keep in mind that this strategy is ideal for picking most wins in a month, but not necessarily for getting a "streak". The premise behind Streak for the Cash is that each prop is close to 50/50, and you need 27 wins in a row to win the grand prize "stash". Even if you are able to find a favorite that has a 55% chance of winning every time, the chances of getting 27 correct picks in a row are .0000097%. If each prop is truly 50%, this falls to .00000075%. Basically, you have to get extremely lucky no matter what "strategy" you employ. However, when the goal is to get the most wins in a month, you want to maximize your expected number of wins: and picking the favorite every time is the way to go.
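The streak odds are just the single-pick probability raised to the 27th power, assuming every pick is independent and has the same chance of winning. A short check (the function name is hypothetical):

```python
def streak_probability(p_win, streak_length=27):
    """Chance of winning `streak_length` independent picks in a row."""
    return p_win ** streak_length

with_edge = streak_probability(0.55)   # every pick is a 55% favorite
coin_flip = streak_probability(0.50)   # every pick is a true 50/50

print(f"55% picks: {with_edge * 100:.7f}%")
print(f"50% picks: {coin_flip * 100:.8f}%")
```

Even with a 55% edge on every pick, a 27-game streak is roughly a 1 in 10 million event for any given run of 27 picks.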

Numbers Numbers Numbers!!!

I watch sports. A lot. My roommate at UNC and I have determined that, when we combine the screen time of our two TVs, we watch a total of 60 hours of sports in a typical week. So, I figured I should be generating some sort of output from this: and thus this blog was born.

I intend to write about statistics and probability, primarily concerning sports and perhaps the stock market and other topics that interest me (Carolina basketball will definitely be a focus when the season starts in November). I'm also currently refining a "betting system" I've been working on for any sport I have a model for: from MLB to NFL to NBA to even WNBA, and I'll hopefully start posting daily picks if my methods are successful.

My main influences have been Nate Silver, Ken Pomeroy, and Jeff Sagarin; their work is fantastic. Finally, regarding probabilities and picks, as a great mathematician at UNC once told me, "You can never be 100% certain!"
