
FAQ and Forum on Advanced Stats


witesoxfan

I would like to start a discussion about working baserunning statistics into other more commonly used stats, rather than a conversion to Runs and ultimately WAR.

 

Specifically, I'd like to see baserunning applied to slugging percentage, a stat that resonates across generations. The idea is this: a single plus a stolen base is worth a certain % of a double, a double plus a stolen base is worth a certain % of a triple, etc. Neglecting the effects of pitchers' pickoff throws and the distraction to pitchers and hitters that a basestealing threat causes, the factors I've thought of so far that would determine the percentages are as follows:

 

1. A single plus a stolen base is less valuable than a double because a single advances baserunners less than a double does, therefore yielding fewer RBI. How much value does this remove?

2. A single plus a stolen base is also less valuable than a double because the stolen base always happens later in the inning than the single did. This brings in a different type of basestealing efficiency: not only does the success % matter, but also how many pitches to home plate were thrown before the SB took place. Just as a leadoff double is more likely to come around to score than a two-out double, the smaller the portion of the inning that elapses before the SB, the more valuable the SB is. This factor could be estimated by treating a single pitch as a typical portion of an inning. A single pitch is probably something like 1/18th or 1/20th, about 5.5% or 5%.

3. Similarly, how many outs are left in the inning when the SB or CS takes place affects the value of each.

4. How to account for SBs added to a walk within this framework.

 

And there are countless other factors that I undoubtedly haven't considered.

 

Like many of you, I'll ballpark ratios in my head, such as K/BB for pitchers, just by eyeballing the raw numbers. Hypothetically, let's say Eaton steals 40 bases this year and gets caught 10 times for a clean 80%. 20 successful steals against 10 times caught would have put him at the 66% threshold (roughly 2/3), leaving 20 SBs to the positive. Multiplying that 20 by the percentage determined from the factors described above yields a number that can be added to total bases, giving an easy modification of slugging.
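
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names, the 0.5 weight, and the total-base and at-bat figures are hypothetical placeholders rather than real numbers for Eaton. Only the 40 SB, 10 CS, and two-thirds threshold come from the example.

```python
from fractions import Fraction


def surplus_steals(sb, cs, threshold=Fraction(2, 3)):
    """Steals beyond what would leave the runner exactly at the threshold."""
    # At success rate t, each caught stealing is offset by t / (1 - t) steals.
    breakeven_sb = cs * threshold / (1 - threshold)
    return sb - breakeven_sb


def adjusted_slg(total_bases, at_bats, sb, cs, weight):
    """Slugging with weighted surplus steals added to total bases."""
    # `weight` stands in for the still-undetermined percentage (assumption).
    bonus = weight * surplus_steals(sb, cs)
    return float(total_bases + bonus) / at_bats


if __name__ == "__main__":
    # 40 SB and 10 CS are from the example; 200 TB, 500 AB, 0.5 are made up.
    print(surplus_steals(40, 10))
    print(adjusted_slg(200, 500, 40, 10, weight=0.5))
```

With the two-thirds threshold, each caught stealing cancels two steals, which is where the 20 surplus steals come from. Everything after that depends on the weight, which is the open question.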

 

No solutions here, just spitballin'. And if all this exists already, well, I'm a goof :P , but I want the saber movement to evolve toward the most accessible stats whenever possible.

Edited by Stan Bahnsen

QUOTE (Stan Bahnsen @ Mar 13, 2014 -> 09:29 AM)

Very interesting ideas. Regarding #1 and #3, this is usually handled with linear weights: the value of the extra base is the average run value of the event in all possible situations, weighted by frequency of occurrence. I would guess that the difference in baserunner advancement between a single and a double is negligible. Runners in scoring position will score anyway, except in a few instances of a short or infield single, and scoring from first on a double is probably similarly rare. A further mitigating factor would be the amount of "stress" added to the pitcher (your factor #2) for having to worry about the runner on first before he steals. No idea how to quantify this, but I'd guess that the combination would make the gap between the single/SB and the double much smaller, maybe even negligible.
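
As a rough sketch of the linear-weights idea described above, the value of an event is a frequency-weighted average of its run value across situations. The function name and every number below are invented for illustration and are not actual run values.

```python
def linear_weight(situations):
    """Frequency-weighted average run value of one event.

    `situations` is a list of (frequency, run_value) pairs, one pair per
    base/out state in which the event was observed.
    """
    total_freq = sum(freq for freq, _ in situations)
    weighted_sum = sum(freq * value for freq, value in situations)
    return weighted_sum / total_freq


if __name__ == "__main__":
    # Hypothetical data: (times observed, runs added in that situation).
    stolen_base_states = [(50, 0.15), (30, 0.20), (20, 0.25)]
    print(round(linear_weight(stolen_base_states), 3))
```

Because the average already spans every base/out state, factors such as how many outs remain are folded into a single number per event.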

 

I'm wondering if a simple "start" to something like this would be to add SB to a player's total bases, and then calculate SLG as you normally would. This new Stan Bahnsen Slugging Percentage (sbSLG) wouldn't totally account for the #1 factor, but if I'm right that the difference is small, it might be a good approximation. As good, at least, as OPS as a proxy for wOBA, I'd think.
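
A minimal sketch of that sbSLG calculation, next to ordinary SLG for comparison. The function names and the sample stat line are hypothetical. Caught stealing is not penalized here, since the proposal only adds SB to total bases.

```python
def slg(singles, doubles, triples, home_runs, at_bats):
    """Standard slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def sb_slg(singles, doubles, triples, home_runs, at_bats, stolen_bases):
    """sbSLG as proposed: each stolen base counts as one extra total base."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return (total_bases + stolen_bases) / at_bats


if __name__ == "__main__":
    # Made-up season: 100 1B, 30 2B, 5 3B, 10 HR in 500 AB, with 40 SB.
    print(f"SLG:   {slg(100, 30, 5, 10, 500):.3f}")
    print(f"sbSLG: {sb_slg(100, 30, 5, 10, 500, 40):.3f}")
```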

 


QUOTE (Eminor3rd @ Mar 13, 2014 -> 10:20 AM)

Pitchers' throws to 1B for "keeping runners close" vs. "real pickoff attempts" also intrigue me. Should the hard throw required for a "real pickoff attempt" be added to the pitch count, or at least part of it? Also, the "real pickoff attempt" seems to be a high-error-% play, with the 1B trying to catch an errant throw while the baserunner dives back toward the base. And it's often a two-base error as the ball heads into right field. But perhaps the 2006 WS is still too fresh in my mind.

 

The more you drill down, the more you realize how complex the game is. Which is neat for us geeks.

Edited by Stan Bahnsen

I've thought about this kind of thing before too. You'd have to take the time on base - whether a hit, walk, or HBP - out of the equation and remove it from OBP if the player is thrown out. It makes some sense in that you don't want to steal bases on a home run hitting team - scoring from 1st counts the same as scoring from 3rd - but you will on teams more dependent upon singles and doubles.



I was thinking about Flowers' season while driving and wanted to ask you guys...how is line drive percentage constituted? Is it something like UZR where eyes are used to come up with the stat or is there actually a way it's calculated?

 

Thanks in advance.


QUOTE (Rowand44 @ Apr 30, 2014 -> 03:00 AM)

It's based on people physically recording the data based on their interpretation. As a result, it is susceptible to some inaccuracies, particularly in the classification of some fly balls versus line drives. FanGraphs' batted ball data comes from Baseball Info Solutions, who use video feeds for their data collection. MLB uses stringers at the ballparks, as does STATS. As Colin Wyers notes, both BIS and STATS are professional data providers, so they should be more accurate than MLB Gameday.


QUOTE (Ozzie Ball @ Apr 29, 2014 -> 09:36 PM)

That's what I kind of figured. Thanks!

QUOTE (Eminor3rd @ May 12, 2014 -> 12:35 PM)
Alex Rios and improved plate discipline: http://msn.foxsports.com/mlb/story/batters...et-older-051214

 

Excellent example of why plate discipline helps hitting, not just walking.

Does plate discipline have anything to do with swinging at a curveball a foot outside and in the dirt for strike 3? It would be one thing if it happened once in a while, but there is more than one hitter who can't lay off those pitches. Can anyone tell them how to recognize those pitches?

 


QUOTE (sammy esposito @ May 13, 2014 -> 04:53 AM)

With 2 strikes, that's an incredibly difficult pitch to lay off. Good ones look like fastballs coming in, and all you can try to do is foul them off.

 

My dad says this same kind of stuff. All I can tell him is to step into a major league batter's box and try it for himself. It's not easy.


QUOTE (sammy esposito @ May 13, 2014 -> 04:53 AM)

Indeed, a good two-strike approach is difficult, especially against sliders. I think the idea is to be smart about pitch recognition early in the count and thus avoid being at the mercy of the pitcher's whim later.


Saw this the other day and I think it exemplifies exactly why advanced statistics are a worthwhile venture. Quote is attributed to Bill James - "a muddy truth is better than a tidy lie"


I might just post this article in like 3 different spots. I just started reading it and it's so relevant to so much that we've talked about on here that it feels like someone on here asked Dave Cameron about Alex Gordon and he wrote this article.

 

http://www.fangraphs.com/blogs/so-lets-tal...ut-alex-gordon/

 

Specifically:

 

Passan, it should be noted, is arguing against a strawman, since I haven’t seen a single person argue that Alex Gordon is “the best player in baseball this year.” For one, even if you used WAR as the sole basis for determining “best player in baseball” — and you shouldn’t do that — then the answer would be Felix Hernandez (+6.2 WAR, a half-win ahead of Gordon), so the most aggressive argument you could make is that WAR has Gordon as the best position player so far.

 

But really, even that is a far too aggressive interpretation, since no one has ever rationally argued that WAR is precise to the decimal point. The reality is that WAR has always been best used for grouping players of similar levels of contribution, not for arguing that a 0.1 WAR difference means that Player X is having a better year than Player Y. No one actually argues for using WAR as a precise tool to measure minuscule differences. I’d suggest that what WAR is actually saying is that Alex Gordon, so far, is having one of the best seasons of any position player in baseball this year, and I don’t think that statement is at all absurd.

 

...

 

And the reality is that one of the primary reasons why offensive statistics are more reliable is simply because the samples are larger. Over the course of a season, an everyday player will bat 600 to 700 times, allowing much of the small sample variance to wash out in the end. On the other hand, even a very good left fielder like Gordon averages about 300 putouts per year, and most of those are routine plays that any ambulatory Major Leaguer could have made, so they have no real effect on his defensive rating.

 

...

 

There is absolutely an argument to be made that Gordon’s UZR may be incorrect — though interestingly, people only ever seem to assume that numbers are too extreme, ignoring the possibility that the measurement error could also mean that his defensive rating might be too low — and if you were trying to answer the question of who “the best player in baseball” is, you’d definitely want to use multi-season regressed defensive numbers. But even using those kinds of calculations, there’s no way to get Alex Gordon out of the top 5-10 position players in MLB this year. The only “absurd” argument would be that Gordon hasn’t been one of the best players in baseball this year.

 

 

QUOTE (venom4789 @ Aug 30, 2014 -> 02:44 PM)
thats what i get for using espn. but why would they have different scores.

 

Because different sites use different statistics to calculate WAR. ESPN's is typically considered to be the worst calculation. FanGraphs.com is the best and Baseball-Reference.com is second best.
