MLB Line Projections – 18 June 21

These are statistical projections and shouldn’t be the only factor in betting on a team; the stats only tell part of the story. Keep an eye on injuries, COVID issues, and players returning from injury.

Like what you see? Please subscribe or follow me on Twitter (@AnalyticsB2) for the latest news and post info. If you want to support the statistical data or have suggestions for improvement, feel free to send me an email (B2SportsStats@gmail.com), or you can donate via the website, Venmo (@B2stats), or Cashapp ($B2stats).


All doubleheaders are accounted for as 7-inning games.

I will post my top model that I use daily (RF), along with the top 2 performing models.

Pitchers, Spider Tack, & Spin Rate

For those not following along, MLB has faced some controversy recently about pitchers using a sticky substance called Spider Tack to increase the spin rate of their pitches and, as a result, the movement of the ball, making it harder for batters to hit. As this blew up over the last couple of weeks, pitchers have reportedly been reluctant to use the substance, and some of their spin rates (and in some cases their performance) have fallen off. The decrease in performance could be tied to a player having stopped using the substance or could be due to the player handling the attention poorly; either way, it gives a betting edge in my opinion. This is pure speculation on my part, but if a great pitcher is now average because they stopped using the “secret stuff,” then that gives me a betting edge and I will take full advantage of it.

So here is a list of some players that have been tied to a decrease in spin rate since the controversy hit a turning point and some comments on recent performance.

  • Trevor Bauer (LAD) – 5.11 ERA in June (Small sample size of 2 games though)
  • Brandon Woodruff (MIL) – 4.76 ERA in June (3 games).
  • Corbin Burnes (MIL) – Huge dip in spin rate and a dip in stats, but could just be a regression to the mean after a stellar deGrom-esque start to the season.
  • Nick Pivetta (BOS) – Not stellar stats to begin with, but some regression. 5.74 ERA in June (3 games).
  • Gerrit Cole (NYY) – 4.26 ERA in June (3 Games).
  • Shane Bieber (CLE) – No major decrease in performance except the game he got injured.
  • E. Rodriguez (BOS) – 8.36 ERA in June (3 starts).
  • Rumored starters: Peralta (MIL), Eovaldi (BOS), Wainwright (STL), Fried (ATL), Kuhl (PIT), Glasnow (TBR), Rodon (CHW), Kluber (NYY)…

Another thing to keep in mind: this can really affect the Over/Under in games. More players are using it than we know, and if they stop and start giving up more runs, it can be an over bettor’s dream.

Why does it Matter? Check out this SI Article or this TheScore Article on how it could affect a game.

Summary of Projections

Model Record

Model Rank

Consensus Record

Consensus Profitability

Model Consensus

Sharp Report

ML – Some Sharp money has come in on the Yankees, Astros, & D-Backs.

O/U – Some Sharp money has come in on the Blue Jays/Orioles Over 9.5, Cardinals/Braves Over 9, White Sox/Astros Under 9, & Dodgers/D-Backs Over 9.

Team Variance

Updated every Sunday.

This chart shows each team’s Variance & Standard Deviation (Std Dev). Variance and Std Dev are calculated from the season stats (they need a larger sample size than 8 games). The list is ordered from lowest team variance to highest. The variance is how widely spread the data is: a team that scores between 1 and 2 runs every night will have very low variance, whereas a team that scores anywhere from 0 to 10 runs on a given night will have very high variance. The Std Dev is the square root of the variance and is a good measure of how consistent a team is… NOT how good or bad a team is, but how consistent they are. Std Dev shows the amount a team’s score typically deviates from its average on a given night. The Yankees’ score will typically deviate about 2.34 runs from their average on a given night, whereas the Reds’ score will typically deviate 3.72 runs from their average. Obviously, the lower the Std Dev, the easier it is for my models to project the score and provide higher probabilities.
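For anyone who wants to see the math, here is a minimal Python sketch of the variance and Std Dev calculation. The run totals are made up for illustration, and the choice of population variance (rather than sample variance) is an assumption, since the chart does not say which one it uses.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical nightly run totals, made up for illustration only.
steady_team = [1, 2, 1, 2, 2, 1, 2, 1]
streaky_team = [0, 10, 3, 7, 1, 9, 0, 6]


def spread(runs):
    """Return (average, variance, std dev) for a list of runs scored per game.

    Uses population variance; swapping in statistics.variance / stdev
    would give the sample versions instead.
    """
    return mean(runs), pvariance(runs), pstdev(runs)


for name, runs in [("steady", steady_team), ("streaky", streaky_team)]:
    avg, var, sd = spread(runs)
    print(f"{name}: average={avg:.2f} variance={var:.2f} std dev={sd:.2f}")
```

Both lists are just scoring logs; the steady team's numbers sit close to its average, so its Std Dev comes out much smaller than the streaky team's.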

Starting Pitcher Expected Regression

Note: Not all pitchers are in the data source used for xERA resulting in an #Value! error.

This chart shows each of today’s starting pitchers’ ERA along with expected and advanced stats. Basically, a pitcher who has a higher ERA with a lower FIP and Expected ERA (xERA) has pitched better than his ERA indicates. Conversely, a pitcher with a low ERA but a higher FIP and xERA has benefitted from a few “Lucky Bounces” and has not been as good as his ERA indicates. ERA+ compares a pitcher’s ERA to other pitchers in the league, adjusted for ballpark-related factors.

FIP (Fielding Independent Pitching) is a way to project the pitcher’s ERA taking into account only what the pitcher can control. This is the most common way for stat nerds like me to see if a pitcher is pitching better or worse than traditional stats suggest. If the FIP is lower than the ERA, I expect the pitcher to pitch better than past performances; if the FIP is higher than the ERA, I expect the pitcher to be worse than past performances. ERA Value is ERA minus FIP: positive indicates FIP is better than ERA, negative means FIP is worse than ERA.
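As a rough illustration, the standard public FIP formula and the ERA Value comparison can be sketched in Python. The stat line below is hypothetical, and the FIP constant of 3.10 is only a placeholder: the real constant is recalculated every season so that league FIP matches league ERA.

```python
def fip(hr, bb, hbp, k, ip, fip_constant=3.10):
    """Fielding Independent Pitching from home runs, walks, hit batters,
    strikeouts, and innings pitched.

    fip_constant is a placeholder; the actual value changes each season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + fip_constant


def era_value(era, fip_value):
    """ERA minus FIP. Positive means FIP is better (lower) than ERA."""
    return era - fip_value


# Hypothetical pitcher: 10 HR, 20 BB, 2 HBP, 90 K over 80 innings, 4.10 ERA.
pitcher_fip = fip(hr=10, bb=20, hbp=2, k=90, ip=80)
value = era_value(4.10, pitcher_fip)

print(f"FIP: {pitcher_fip:.2f}")
print(f"ERA Value: {value:+.2f}")
if value > 0:
    print("FIP is better than ERA, so expect improvement.")
elif value < 0:
    print("FIP is worse than ERA, so expect regression.")
```

Only home runs, walks, hit batters, and strikeouts go in, which is why FIP ignores anything the fielders did behind the pitcher.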

In short, ERA+ adjusts for the ballpark and compares the pitcher to the league average. An ERA+ of 100 indicates the league average after adjusting for ballpark; above 100 indicates better than average, and below 100 indicates worse than average. To make sense of what’s displayed, the ERA+ Percentage shows how much better (or worse, if the number is negative) the starter’s ERA is compared to the league average ERA, assuming all ballparks are exactly the same. The percentages may seem odd at first when you see a guy is 200% better or worse than the league average, so if you need some additional help understanding, here is a short, quick, easy-to-understand explanation. This is a good way to factor in something like a Rockies pitcher who typically pitches in Home Run City (Denver, Colorado) but today is on the road in a more average ballpark. From ERA+ we can also get expected win percentage, which is another good place to look for expected regression. For example, if a pitcher is 4-0 but has an expected win % of 50%, he should probably be 2-2 and may have benefitted from a lot of run support or a bit of luck.
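To make the ERA+ numbers a little more concrete, here is a simplified Python sketch. It assumes the common simplified form ERA+ = 100 × league ERA × park factor / ERA, and it assumes the ERA+ Percentage is simply ERA+ minus 100; the chart's exact calculation may differ, and the inputs are hypothetical.

```python
def era_plus(era, league_era, park_factor=1.0):
    """Simplified ERA+.

    park_factor is 1.0 for a neutral park and above 1.0 for a
    hitter-friendly park (an assumption about how the adjustment is applied).
    """
    return 100 * league_era * park_factor / era


def era_plus_pct(era_plus_value):
    """Percent better (positive) or worse (negative) than league average.

    Assumes the percentage is just ERA+ minus 100.
    """
    return era_plus_value - 100


# Hypothetical: a 2.00 ERA in a 4.00 ERA league, neutral ballpark.
ace = era_plus(2.00, 4.00)
print(f"ERA+ {ace:.0f}, {era_plus_pct(ace):+.0f}% vs league average")

# Hypothetical: a 4.50 ERA looks below average until the hitter-friendly
# home park (factor 1.125) is taken into account.
altitude_arm = era_plus(4.50, 4.00, park_factor=1.125)
print(f"ERA+ {altitude_arm:.0f}, {era_plus_pct(altitude_arm):+.0f}% vs league average")
```

The second pitcher is the Rockies-style case: the raw ERA is worse than league average, but the park adjustment pulls the ERA+ back up.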

Expected ERA (xERA) is based on expected weighted on-base average (xwOBA) and is converted to ERA form. It is formulated using exit velocity, launch angle and, on certain types of batted balls, Sprint Speed. It accounts for things like how hard batters are hitting a certain pitcher. The idea is that if a pitcher gets hit incredibly hard but right at a fielder for an entire game, he is a bit lucky and his stats will look better than they should. Next game, I would expect some of those hard-hit balls to find gaps or go over the fence and the pitcher to perform worse. Conversely, for a pitcher who gives up a lot of hits in a game on weak contact like bloop shots, dribblers down the line, etc., the xERA will be lower than the actual ERA, and I would expect that next game the batters won’t get as lucky with weak contact finding holes to fall into.

Bullpen Projected Runs

The Bullpen runs per 9 chart shows how many runs each model projects a given team’s bullpen will give up over the course of 9 innings (essentially projected bullpen ERA). The bullpen stats are based on all non-starting pitchers for a given team and pull from the last 12 games (the same number of games the normal model projections use). To project the bullpen stats, I use league average stats for all batting statistics vs. that team’s bullpen. The higher the number, the more runs a bullpen is projected to give up and the worse it is. Because I use the last 12 games, some stats may be inflated (like giving up 20+ runs over 9 innings), but you get the point: they are bad. You may also see negative numbers for some model projections. That’s because a bullpen could have given up 0 runs against the best offenses and is now going up against a league average offense, so the model basically thinks it will do even better against a worse offense and doesn’t know the number of runs can’t be less than 0. Fading bad bullpens is a great in-play opportunity; when a team with a bad bullpen has the lead and pulls its starter, making an ML play on the other side can be very profitable.
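A small Python sketch of the runs-per-9 scaling, with hypothetical numbers, shows why a short sample can look inflated. The `floor_at_zero` helper is not something the models do; it is just one way a reader could treat a negative projection when reading the chart.

```python
def runs_per_nine(runs_allowed, innings_pitched):
    """Scale runs allowed to a 9-inning rate."""
    return 9 * runs_allowed / innings_pitched


def floor_at_zero(projected_runs):
    """Treat a negative projection as 0, since runs can't go below 0.

    This is a reader-side adjustment, not part of the model output.
    """
    return max(0.0, projected_runs)


# Hypothetical bullpen: 30 runs allowed in 36 relief innings over 12 games.
print(f"Typical rough stretch: {runs_per_nine(30, 36):.2f} runs per 9")

# Hypothetical tiny sample: 5 runs in 2 innings scales up to a huge rate.
print(f"Tiny sample: {runs_per_nine(5, 2):.2f} runs per 9")

# A hypothetical negative model projection, read as 0.
print(f"Negative projection read as: {floor_at_zero(-0.4):.2f}")
```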

Model Projections

My Model Choice:

Random Forest

Undecided Pitchers default to league average stats.

For pitchers with inflated stats (e.g. the other team is projected to score something like 20 runs), I use the expected stats table to adjust the inflated projections.

No line is up yet for the BOS/KCR game, and no total is up for the MIA/CHC game.

How to read the projections: The model type is at the very top. Matchups are denoted by the alternating white/green pairings; away teams are on top, home teams on the bottom. The model analyzes the matchup and projects the home and away teams’ scores (Proj Score). The difference between the home team’s score and the away team’s score gives us what the model says the line should be (Proj Line). The “Line” column is the Vegas line at the time I run the model. The “Line Diff” is the difference between the projected line and the Vegas line. A positive line diff means the projected line is in the away team’s favor compared to the Vegas line; negative means the projected line is in the home team’s favor. The “Cover Prob” uses a normal distribution and the teams’ variance to project each team’s probability to cover the listed Vegas line. The same goes for totals: “Proj Total” is the sum of the projected scores, “Line Total” is the listed Vegas total for the game, followed by the difference between the two and the probability to go over or stay under, denoted by “O” and “U”.
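Here is a minimal Python sketch of how a cover probability can be pulled from a normal distribution. It is an illustration under stated assumptions, not the exact calculation behind the charts: the projected scores and lines are hypothetical, the two Std Dev values simply reuse the figures quoted in the Team Variance section, and combining them by adding variances assumes the two teams' scores are independent.

```python
from math import sqrt
from statistics import NormalDist


def prob_above(mean, threshold, sd):
    """P(X > threshold) for X ~ Normal(mean, sd)."""
    return 1 - NormalDist(mu=mean, sigma=sd).cdf(threshold)


# Hypothetical projected scores and Vegas numbers.
home_proj, away_proj = 5.1, 3.9
home_spread_to_beat = 1.5   # home team favored by 1.5 runs
line_total = 8.5

# Per-team std devs (illustrative), combined assuming independent scores.
home_sd, away_sd = 2.34, 3.72
game_sd = sqrt(home_sd ** 2 + away_sd ** 2)

proj_margin = home_proj - away_proj   # home score minus away score
proj_total = home_proj + away_proj

home_cover = prob_above(proj_margin, home_spread_to_beat, game_sd)
away_cover = 1 - home_cover
over = prob_above(proj_total, line_total, game_sd)
under = 1 - over

print(f"Home cover: {home_cover:.1%}  Away cover: {away_cover:.1%}")
print(f"Over: {over:.1%}  Under: {under:.1%}")
```

The wider the combined Std Dev, the closer every probability gets pulled toward 50%, which is why high-variance teams are harder to project with confidence.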

Betting Edge

If you don’t understand what you are looking at, I recommend reading my post about betting tips. The percentages show the betting Edge, which is the cover prob (from above) minus the implied probability (the implied prob of -110 odds is 52.4%). If a team has a 92.4% chance to win and a 52.4% implied probability (or -110 odds), the Edge is 40%. The Edge alone doesn’t mean you should blindly bet it. To quantify: >35% is great value, 20-35% is really good, 10-20% is decent, <10% is ok value, and blanks are negative value. ML is moneyline.
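The Edge arithmetic can be sketched in a few lines of Python. The function names are made up for illustration, and how the tiers treat a value that lands exactly on a boundary (say, exactly 20%) is an assumption, since the ranges above don't specify it.

```python
def implied_prob(american_odds):
    """Implied probability from American odds (e.g. -110 or +150)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def edge(cover_prob, american_odds):
    """Betting Edge: cover probability minus implied probability."""
    return cover_prob - implied_prob(american_odds)


def value_tier(edge_value):
    """Map an Edge to the value tiers described above.

    Boundary handling is an assumption; negative Edge returns an
    empty string to mirror the blank cells in the chart.
    """
    if edge_value > 0.35:
        return "great"
    if edge_value >= 0.20:
        return "really good"
    if edge_value >= 0.10:
        return "decent"
    if edge_value > 0:
        return "ok"
    return ""


# The example from the text: 92.4% cover prob at -110 odds.
e = edge(0.924, -110)
print(f"Implied prob at -110: {implied_prob(-110):.1%}")
print(f"Edge: {e:.1%} ({value_tier(e)} value)")
```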

Top Performing Models:

1. Neural Net

Betting edge

2. Support Vector Machine (SVM)

Betting edge
