MLB Line Projections – 21 Jul 22


These are statistical projections and shouldn’t be the only factor in betting on a team; the stats only tell part of the story. Keep an eye on injuries, COVID issues, and players returning from injury.

Like what you see? Please subscribe or follow me on Twitter (@AnalyticsB2) for the latest news and post info. If you want to support the statistical data or have suggestions for improvement, feel free to send me an email (B2SportsStats@gmail.com), or you can donate via the website, Venmo (@B2stats) or Cashapp ($B2stats).


Note: Some of the different websites I pull data from may show conflicting starting pitchers for a few games (e.g. one table may show “undecided” while another table has a projected starter listed).

Notice: There will not be any MLB projections 14-22 August 2022.

Summary of Projections

Model Record

Model Rank

Consensus Record

Consensus Profitability

Model Consensus

Team Variance

Updated every Sunday.

This chart shows each team’s Variance and Standard Deviation (Std Dev). Both are calculated from the season stats (they need a larger sample size than 8 games). The list is ordered from lowest team variance to highest. Variance measures how widely spread the data is: a team that scores between 1 and 2 runs every night will have very low variance, whereas a team that scores anywhere from 0 to 10 runs on a given night will have very high variance. The Std Dev is the square root of the variance and is a good measure of how consistent a team is… NOT how good or bad a team is, but how consistent they are. Std Dev shows how much a team’s score typically deviates from its average on a given night. The Pirates’ score will typically deviate about 2.49 runs from their average on a given night, whereas the Yankees’ score will typically deviate about 3.89 runs from theirs. The lower the Std Dev, the easier it is for my models to project the score and provide higher probabilities.
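To make the variance vs. Std Dev idea concrete, here is a minimal sketch in Python. The two run lists are made-up numbers (not real team data), and using the population formulas (`pvariance`/`pstdev`) rather than the sample versions is an assumption; the chart may be computed differently.

```python
from statistics import mean, pvariance, pstdev

# Hypothetical runs scored per game for two made-up teams (not real data).
steady_team = [1, 2, 1, 2, 2, 1, 2, 1]
streaky_team = [0, 10, 3, 7, 0, 9, 1, 6]


def spread(runs):
    """Return (average, variance, standard deviation) of runs per game."""
    # Assumption: population formulas; sample formulas would divide by n - 1.
    return mean(runs), pvariance(runs), pstdev(runs)


for name, runs in [("steady", steady_team), ("streaky", streaky_team)]:
    avg, var, sd = spread(runs)
    print(f"{name}: avg={avg:.2f} variance={var:.2f} std dev={sd:.2f}")
```

The steady team lands on a far smaller Std Dev than the streaky one, which says nothing about which team is better, only which one is easier to project.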

Team Trends

Home/Away Records

[MOV = Margin of Victory] This chart shows a team’s record at home vs. on the road, for both the Moneyline and the Over/Under. This info is critical for identifying teams to play or fade at home vs. on the road. O/U record is displayed as “Overs-Unders-Ties.” Each split also produces an expected win % based on the team’s home vs. away stats. Expected win % is a good way to adjust for the few lucky bounces that skew a team’s record. W% Diff is the difference between win % and expected win %: a positive W% Diff means the team is actually better than its record indicates, and a negative W% Diff means the team has actually been worse than its record indicates.
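One common way to get an expected win % from runs scored and allowed is the Pythagorean expectation. The sketch below assumes that formula (with exponent 2) purely for illustration; the chart’s exact method isn’t stated here. It also assumes W% Diff is expected minus actual, which matches the “positive means better than the record” reading. The record and run totals are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected win % from runs scored/allowed (assumed formula, exponent 2)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def win_pct_diff(wins, losses, runs_scored, runs_allowed):
    """Assumed sign convention: expected win % minus actual win %."""
    actual = wins / (wins + losses)
    expected = pythagorean_win_pct(runs_scored, runs_allowed)
    return expected - actual


# Hypothetical home split: a 20-25 record while outscoring opponents 210-200.
diff = win_pct_diff(wins=20, losses=25, runs_scored=210, runs_allowed=200)
print(f"W% Diff: {diff:+.1%}")  # positive: better than the record suggests
```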

Records After a Day Off

[MOV = Margin of Victory] Another good trend to take advantage of is a team coming off a day of rest. Some teams are unstoppable with a day of rest, while others must have spent that day partying or something because they simply can’t win after a day off.

Run in the 1st Inning

Note: An error in my code today is preventing updates to the ERA numbers.

A popular betting option for some is a team to score a run in the first inning. Here is a breakdown of how often each team scores in the first inning, shown overall and split by home vs. away. Also included is how often a given team allows a run in the first inning. The “Prob of Scoring in 1st” column is calculated using how often the home/away team scores in the 1st vs. how often its opponent has allowed a run in the 1st on the road/at home. Of course this doesn’t tell the entire story, and the starting pitcher has a lot to do with a team scoring in the first.
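As a rough illustration of combining the two rates, the sketch below simply averages them. The averaging is an assumption made for this example; the actual blend behind the “Prob of Scoring in 1st” column isn’t spelled out. The input rates are hypothetical.

```python
def prob_score_first(offense_rate, opponent_allowed_rate):
    """Blend the two first-inning rates with a plain average (assumed method)."""
    return (offense_rate + opponent_allowed_rate) / 2


# Hypothetical: the home team scores in the 1st in 30% of its home games,
# and the visiting team allows a 1st-inning run in 40% of its road games.
p = prob_score_first(0.30, 0.40)
print(f"Prob of Scoring in 1st: {p:.0%}")
```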

Starting Pitcher Expected Regression

This chart shows each of today’s starting pitchers’ ERA along with expected and advanced stats. Basically, a pitcher who has a higher ERA with a lower FIP and Expected ERA (xERA) has performed better than his ERA indicates. Conversely, a pitcher with a low ERA but a higher FIP and xERA has benefitted from a few “Lucky Bounces” and has not been as good as his ERA indicates. ERA+ compares a pitcher’s ERA to that of other pitchers in the league, adjusted for ballpark-related factors.

FIP (Fielding Independent Pitching) is a way to project a pitcher’s ERA taking into account only what the pitcher can control. This is the most common way for stat nerds like me to see if a pitcher is pitching better or worse than traditional stats suggest. If the FIP is lower than the ERA, I expect the pitcher to pitch better than his past performances; if the FIP is higher than the ERA, I expect him to be worse. ERA Value is ERA minus FIP: positive means FIP is better than ERA, negative means FIP is worse than ERA.
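For reference, the standard FIP formula uses only home runs, walks, hit batters, and strikeouts, plus a league constant that puts it on the ERA scale. The sketch below uses that formula; the constant of 3.10 is a placeholder assumption (the real value changes each season), and the pitcher’s stat line is made up.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    `ip` is innings pitched as a true decimal (100 1/3 innings = 100.333),
    and `constant` is an assumed placeholder for the yearly league constant.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def era_value(era, fip_value):
    """ERA minus FIP: positive means FIP is better (lower) than ERA."""
    return era - fip_value


# Hypothetical pitcher: 10 HR, 30 BB, 5 HBP, 100 K over 100 IP with a 4.50 ERA.
pitcher_fip = fip(hr=10, bb=30, hbp=5, k=100, ip=100.0)
print(f"FIP: {pitcher_fip:.2f}  ERA Value: {era_value(4.50, pitcher_fip):+.2f}")
```

A positive ERA Value like this one flags a pitcher whose ERA would be expected to come down.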

From ERA+ we can also get an expected win percentage, which is another good place to look for expected regression. For example, if a pitcher is 4-0 but has an expected win % of 50%, he should probably be 2-2 and may have benefitted from a lot of run support or a bit of luck.

Expected ERA (xERA) is based on expected weighted on-base average (xwOBA) and is converted to ERA form. It is formulated using exit velocity, launch angle and, on certain types of batted balls, Sprint Speed. It accounts for things like how hard batters are hitting a certain pitcher. The idea is that if a pitcher gets hit incredibly hard but right at a fielder for an entire game, he is a bit lucky and his stats will look better than they should. Next game I would expect some of those hard-hit balls to find gaps or go over the fence and the pitcher to perform worse. Conversely, when a pitcher gives up a lot of hits in a game on weak contact like bloop shots, dribblers down the line, etc., the xERA will be lower than the actual ERA, and I would expect that next game the batters don’t get as lucky with weak contact finding holes to fall into.

Pitchers Trends

NOTE: Teams and starters are in a different order since this is from a different data source from the rest of the charts.

L28 stats show the data from the last 28 days, including the number of games over that time (Games L28) and ERA (L28 Day ERA). These are compared to the season average ERA and the “L28 Diff” column shows if the pitcher has performed better or worse over the last 28 days (Green and positive indicates better, Red and negative indicates worse than season average).

The Day/Night ERA automatically determines if a game is a day game or night game and shows the pitcher’s ERA in those games (a start before 4:59 PM local game time is considered a day game). Any blanks indicate the value could not be found or the pitcher has not played in a day/night game so far this season.

Home/Away ERA automatically determines if the team is home or away and pulls the pitcher’s ERA at home or on the road. Some pitchers have a larger difference in their performance at home vs. on the road, and this can provide a good betting edge.

Pitchers vs. Left/Right Hand Batters

The above shows pitchers’ performance vs. left- and right-handed batters. The ERA numbers are estimates from calculations I had to do manually, since I can’t find a way to pull the ERA data (yet). The ERA numbers are very close to the actual Left/Right ERA splits, close enough that I use the data for reference when betting.

Bullpen Projected Runs

The Bullpen Runs per 9 chart shows how many runs each model projects a given team’s bullpen will give up over the course of 9 innings (essentially a projected bullpen ERA). The bullpen stats are based on all non-starting pitchers for a given team and are pulled from the last 12 games (the same number of games the normal model projections use). To project the bullpen stats, I use league average stats for all batting statistics vs. that team’s bullpen. The higher the number, the more runs a bullpen is projected to give up and the worse it is. Because I use the last 12 games, some stats may be inflated (like giving up 20+ runs over 9 innings), but you get the point: they are bad. You may also see negative numbers for some model projections. That’s because a bullpen could have given up 0 runs against the best offenses and is now going up against a league average one, so the model basically thinks it will do even better against a worse offense and doesn’t know the number of runs can’t be less than 0. Fading bad bullpens is a great in-play opportunity: when a team with a bad bullpen has the lead and pulls its starter, making an ML play on the other side can be very profitable.

Model Projections

My Model Choice:

Random Forest

Undecided Pitchers default to league average stats.

How to read the projections: The model type is at the very top. Matchups are denoted by the alternating white/green pairings, with away teams on top and home teams on the bottom. The model analyzes the matchup and projects the home and away teams’ scores (Proj Score). The difference between the home team’s score and the away team’s score gives us what the model says the line should be (Proj Line). The “Line” column is the Vegas line at the time I run the model. The “Line Diff” is the difference between the projected line and the Vegas line. A positive Line Diff means the projected line is in the away team’s favor compared to the Vegas line; negative means it is in the home team’s favor. The “Cover Prob” uses a normal distribution and the team’s variance to project each team’s probability of covering the listed Vegas line. Totals work the same way: “Proj Total” is the sum of the projected scores, “Line Total” is the listed Vegas total for the game, followed by the difference between the two and the probability of going over or staying under, denoted by “O” and “U”.
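The Cover Prob idea can be sketched with a normal distribution as follows. Several things here are assumptions for illustration only: the margin is treated as normal around the projected margin, the two teams’ Std Devs are combined as if the scores were independent, and the spread is expressed from the home team’s side (e.g. -1.5). The function names and inputs are hypothetical, not the models’ actual code.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution with mean mu and std dev sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def combined_sigma(sd_home, sd_away):
    # Assumption: the two scores are independent, so their variances add.
    return sqrt(sd_home ** 2 + sd_away ** 2)


def home_cover_prob(proj_home, proj_away, home_spread, sd_home, sd_away):
    """P(home margin + home_spread > 0) under the assumed normal model."""
    proj_margin = proj_home - proj_away
    sigma = combined_sigma(sd_home, sd_away)
    return 1.0 - normal_cdf(-home_spread, mu=proj_margin, sigma=sigma)


def over_prob(proj_total, line_total, sd_home, sd_away):
    """P(total runs > line_total) under the same assumed normal model."""
    sigma = combined_sigma(sd_home, sd_away)
    return 1.0 - normal_cdf(line_total, mu=proj_total, sigma=sigma)


# Hypothetical game: home projected 5.0, away 4.0, home spread -1.5,
# both teams with a Std Dev of 3.0 runs, and a listed total of 8.5.
cover = home_cover_prob(5.0, 4.0, -1.5, 3.0, 3.0)
over = over_prob(9.0, 8.5, 3.0, 3.0)
print(f"Home cover prob: {cover:.1%}  Over prob: {over:.1%}")
```

The away team’s Cover Prob under this sketch is simply one minus the home team’s, and the under is one minus the over.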

Betting Edge

If you don’t understand what you are looking at, I recommend reading my post about betting tips. The percentages show the betting Edge, which is the Cover Prob (from above) minus the implied probability (the implied probability of -110 odds is 52.4%). If a team has a 92.4% chance to win and a 52.4% implied probability (-110 odds), the Edge is 40%. The Edge alone doesn’t mean you should blindly bet it. To quantify: >35% is great value, 20-35% is really good, 10-20% is decent, <10% is OK value, and blanks are negative value. ML is moneyline.
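The Edge arithmetic can be checked with a short sketch. The conversion from American odds to implied probability is the standard one; the function names are hypothetical, and how the exact tier boundaries (20%, 10%, 0%) are treated is an assumption.

```python
def implied_prob(american_odds):
    """Implied probability from American odds (e.g. -110 or +150)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def edge(cover_prob, american_odds=-110):
    """Betting Edge: cover probability minus the implied probability."""
    return cover_prob - implied_prob(american_odds)


def value_tier(edge_value):
    """Map an Edge to the tiers above (boundary handling is assumed)."""
    if edge_value > 0.35:
        return "great value"
    if edge_value >= 0.20:
        return "really good"
    if edge_value >= 0.10:
        return "decent"
    if edge_value > 0:
        return "OK value"
    return ""  # blank: negative (or zero) value


e = edge(0.924, -110)
print(f"Implied: {implied_prob(-110):.1%}  Edge: {e:.1%}  Tier: {value_tier(e)}")
```

With the 92.4% example from the text, the implied probability at -110 is 110/210, about 52.4%, which leaves an Edge of about 40%.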

Top Performing Models:

1. k-Nearest Neighbor (kNN)

Undecided Pitchers default to league average stats.


Betting Edge

