MLB Line Projections – 19 Apr 21

[Disclaimer] Each team has played only a handful of games, which limits the amount and quality of the stats (and these are stat-based projections). This makes the first month of projections WILD. Best of luck!

These are statistical projections and shouldn’t be the only thing that factors into betting on a team; the stats only tell part of the story. Keep an eye on injuries, COVID issues, and players returning from injury.

Like what you see? Please subscribe or follow me on Twitter (@AnalyticsB2) for the latest news and post info. If you want to support the statistical data or have suggestions for improvement, feel free to send me an email (B2SportsStats@gmail.com), or you can donate via the website, Venmo (@B2stats) or Cashapp ($B2stats).

Next Project: Projections for MLB DFS. Look for periodic updates via Twitter. I’m shooting for a mid-May release.


*After my return from vacation on 22 Apr, I will begin posting more and more data, similar to what I post for the NBA, and hopefully come May I will have DFS projections for baseball.

NOTE: The code that pulls Vegas game lines may not be able to distinguish the two games of a doubleheader (it pulls the data based on team name), so the two listed lines may be identical and therefore wrong for one of the games.

All doubleheader games are treated as 7-inning games.


How it Works

All training data is pulled from every game played this season. The model is built around the stats allowed per 9 innings by the starting pitcher and the bullpen. For starters, the training data for a given game pairs that starter’s stats with the opposing team’s batting stats averaged over the last 10 games, and all stats are converted to a per-9-inning basis. In plain English: if a starting pitcher pitches 3 innings and gives up 3 hits and 1 run to a great batting team, then extrapolating over an entire game, the starter would allow 9 hits and 3 runs against a great hitting team (the same way ERA is calculated). The same process is used for the bullpen stats. Using per-9-inning stats makes it easy to adjust for the different number of innings pitched by each starter.
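As a rough illustration of the per-9-inning math described above (a sketch only, not the model’s actual code), the extrapolation is a single scaling step. The function names `per_nine` and `scale_to_innings` are made up for this example, and innings are assumed to be true fractions (5 1/3 innings is 5.333, not the box-score shorthand 5.1).

```python
def per_nine(stat_total: float, innings_pitched: float) -> float:
    """Scale a raw pitching stat (hits, runs, ...) to a per-9-inning rate."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return stat_total * 9.0 / innings_pitched


def scale_to_innings(rate_per_nine: float, innings: float) -> float:
    """Convert a per-9 rate back to an expected count over `innings` innings.

    Hypothetical helper: one way a per-9 rate could be rescaled, for example
    to a 7-inning game.
    """
    return rate_per_nine * innings / 9.0


# The example from the text: 3 innings, 3 hits, 1 run.
hits_per_nine = per_nine(3, 3)   # 9.0
runs_per_nine = per_nine(1, 3)   # 3.0
```

Because every stat is on the same per-9 scale, a 3-inning outing and a 7-inning outing can be compared directly.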

Some things to consider for MLB Projections: The batting stats are based on team stats from the last 10 games, so, as with the NBA, they don’t account for recent injuries or returns from injury. Bullpen stats are based on all non-starting pitchers for a given game (i.e. if Kershaw comes out of the bullpen along with 2 other relievers for the Dodgers, the stats of all 3 will be averaged and listed as the Dodgers’ bullpen stats for that game). Starters who have never pitched before get the team-average pitching stats by default (with no games pitched, it is hard to tell how well or poorly they will do). Starters’ stats used to project today’s scores are averaged over the last 5 starts, batters are averaged over the last 10 games, and the bullpen is averaged over the last 10 games.
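To make the averaging windows concrete, here is a minimal sketch of a "last N games" average with a fallback for starters who have no history. The function name `recent_average`, the `fallback` parameter, and the sample numbers are all invented for illustration; the real pipeline may be organized differently.

```python
from typing import Optional, Sequence


def recent_average(values: Sequence[float], window: int,
                   fallback: Optional[float] = None) -> float:
    """Average the most recent `window` values (oldest first, newest last).

    If there is no history at all, return `fallback` (for example the
    team-average pitching stat for a starter making a first appearance).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = list(values[-window:])
    if not recent:
        if fallback is None:
            raise ValueError("no history and no fallback provided")
        return fallback
    return sum(recent) / len(recent)


# Made-up per-9 run rates for a starter's last six starts, oldest first.
starter_runs_per_nine = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
last_five = recent_average(starter_runs_per_nine, window=5)   # uses 2.0 .. 6.0

# A debuting starter falls back to a (made-up) team average.
debut = recent_average([], window=5, fallback=4.5)
```

The same helper would cover the 10-game windows for batting and bullpen stats by passing `window=10`.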

Stats used: Starting pitching vs opposing team batting, bullpen vs opposing team batting, ground balls vs fly balls vs line drives (both allowed by the pitcher and on average by the batting team), home vs away, day vs night games, left- vs right-handed pitching (starters only), and Statcast pitch & batting stats (currently broken, but close to fixed).


Summary of Projections

Model Record

Model Rank

Consensus Record

Consensus Profitability

Model Consensus


Model Projections

1. SVM

Model Projections


How to read the projections: The model type is at the very top. Matchups are denoted by the alternating white/green pairings; away teams are on top, home teams on the bottom. The model analyzes the matchup and projects the home and away teams’ scores (Proj Score). The difference between the home team’s score and the away team’s score gives us what the model says the line should be (Proj Line). The “Line” column is the Vegas line at the time I run the model. The “Line Diff” is the difference between the projected line and the Vegas line. A positive line diff means the projected line is in the away team’s favor compared to the Vegas line; a negative one means the projected line is in the home team’s favor. The “Cover Prob” uses a normal distribution and the teams’ variance to project each team’s probability of covering the listed Vegas line. The same applies to totals: “Proj Total” is the sum of the projected scores, “Line Total” is the listed Vegas total for the game, followed by the difference between the two and the probability of going over or staying under, denoted by “O” and “U”.
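For readers curious how a normal-distribution cover probability can work, here is one possible sketch. It is not the model’s actual code: it assumes the two teams’ scores are independent, so their variances simply add, and it assumes the spread is quoted from the home team’s side (home -1.5 means the home team must win by more than 1.5). The function names and all numbers are invented.

```python
from math import sqrt
from statistics import NormalDist


def cover_probability(proj_home: float, proj_away: float,
                      var_home: float, var_away: float,
                      home_spread: float) -> tuple[float, float]:
    """Return (P(home covers), P(away covers)) for a home-side spread."""
    # Projected margin (home minus away) modeled as a normal variable.
    margin = NormalDist(mu=proj_home - proj_away,
                        sigma=sqrt(var_home + var_away))
    # Home covers when the actual margin exceeds the negated spread.
    p_home = 1.0 - margin.cdf(-home_spread)
    return p_home, 1.0 - p_home


def over_probability(proj_home: float, proj_away: float,
                     var_home: float, var_away: float,
                     line_total: float) -> float:
    """Return P(total runs > line_total) under the same assumptions."""
    total = NormalDist(mu=proj_home + proj_away,
                       sigma=sqrt(var_home + var_away))
    return 1.0 - total.cdf(line_total)


# Made-up example: home projected 5.0, away 4.0, each with variance 4.5.
p_home, p_away = cover_probability(5.0, 4.0, 4.5, 4.5, home_spread=-1.5)
p_over = over_probability(5.0, 4.0, 4.5, 4.5, line_total=8.5)
```

When the projected margin sits exactly on the line, both sides come out at 50%; the further the projection is from the line relative to the spread of outcomes, the more lopsided the probabilities become.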

*Added first-5-inning projections.

Betting Edge

If you don’t understand what you are looking at, I recommend reading my post about betting tips. The percentages show the betting Edge, which is the cover prob (from above) minus the implied probability (at -110 odds, the implied probability is 52.4%). If a team has a 92.4% cover probability and a 52.4% implied probability (-110 odds), the Edge is 40%. The Edge alone doesn’t mean you should blindly bet it. To quantify: >35% is great value, 20-35% is really good, 10-20% is decent, <10% is OK value, and blanks mean negative value. ML is moneyline.
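The Edge arithmetic above can be reproduced in a few lines. This is a sketch with made-up function names; the implied-probability conversion is the standard one for American odds, and how the tier boundaries (exactly 10%, 20%, 35%) are assigned is my assumption, since the text does not say.

```python
def implied_probability(american_odds: float) -> float:
    """Convert American odds (e.g. -110 or +150) to an implied probability."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)


def edge(cover_prob: float, american_odds: float = -110) -> float:
    """Edge = cover probability minus the implied probability of the odds."""
    return cover_prob - implied_probability(american_odds)


def value_label(edge_value: float) -> str:
    """Map an edge (as a fraction) to the tiers described in the text."""
    if edge_value < 0:
        return ""            # blank: negative value
    if edge_value < 0.10:
        return "ok"
    if edge_value < 0.20:
        return "decent"
    if edge_value <= 0.35:
        return "really good"
    return "great"


print(f"{implied_probability(-110):.1%}")   # 52.4%
print(f"{edge(0.924):.1%}")                 # 40.0%
print(value_label(edge(0.924)))             # great
```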


2. Adaptive Boosting (Ada Boost)

New Model Projections


Betting Edge


2 comments

  1. The consensus projections for CHW/BOS aren’t matching the excel spreads. For instance, Ada excel model predicts the under but Ada is marked for the over in the consensus O/U chart. Will check others now.
