This is part four of my preseason player valuation series, and it brings us to one of the more important decisions of the preseason: deciding which projection system(s) to use. Evaluating projection systems is well-trodden ground, as documented by Will Larson over at the Baseball Projection Project. So far this year, the only analysis of the 2015 projections I've seen is over at Beyond the Box Score, and it was not specifically fantasy-focused. Each projection system revises its methodology from year to year, so we can always stand to learn more by analyzing the most recent year's results.
One thing I realized in the course of completing this study: the Big Board needed more customization options! As of v1.09 this week, you can directly apply the findings of this study in the Big Board, with the ability to use a different projection system for hitters and for pitchers, as well as choose any projection system as your source for playing time.
In this study, I'll focus on the most commonly used, freely available projections, the same ones that appear in the Big Board: Steamer, ZiPS, Fangraphs Depth Charts, Fangraphs Fans, the Fantasy411 Composite, and Clay Davenport projections. The categories of interest are the typical 5x5 categories, except for saves, which are not projected by every system. I'll look at each system's ability to project these categories in total as well as on a per-PA/per-IP basis. Since for fantasy purposes we only care about the relative projections made by each system (i.e., we only need to know Kershaw is the best SP in baseball, not exactly what his ERA will be), I'll use R2 to evaluate how well the projections correlated with actual results rather than any measure of absolute error. The most common fantasy leagues draft about 300 players, broken out into 180 hitters, 90 SP, and 30 RP, so I've gone through each system to find the consensus top 300 players as projected in the 2015 preseason, and will only evaluate the systems on their projections of those 300 players.
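To make the choice of metric concrete, here is a minimal sketch of R2 computed as the squared Pearson correlation between projected and actual values. This is an assumption about how the number could be calculated, not necessarily the exact procedure behind the tables in this study; the function name `r_squared` and the sample numbers are made up for illustration.

```python
def r_squared(projected, actual):
    """Squared Pearson correlation between projected and actual values."""
    n = len(projected)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(projected) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        # Convention chosen for this sketch: a constant series explains nothing.
        return 0.0
    return cov * cov / (var_p * var_a)


# Made-up home run totals: every projection is 5 HR too low,
# but the ordering and spacing of the players is exactly right.
projected_hr = [10, 20, 30]
actual_hr = [15, 25, 35]
score = r_squared(projected_hr, actual_hr)
```

In this toy case the projections miss every player by five homers, yet `score` is 1 (up to floating-point rounding), which is the property that matters for ranking players against each other.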
First are the hitter projections, more specifically the analysis of the projected top-180 hitters for 2015. In addition to each of the standard projections, I've included a custom mix which I'll call 'Zeamer' - 67% Steamer, 33% ZiPS. You'll also note the Fantasy411 composite is abbreviated here as 'Comp'. Rate stats like AVG were also evaluated as part of the 'total' projections by using a playing-time weighted value indicated by an 'n' (e.g. "nAVG").
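For readers who want to build a similar mix, the sketch below shows one way to blend two projection lines at 67/33 and one possible way to playing-time weight a rate stat. The weighting formula (rate above a baseline, multiplied by PA) is an assumption for illustration; the exact definition of the 'n' stats is not spelled out here. The function names, the baseline, and the sample stat lines are all hypothetical.

```python
def blend(steamer_line, zips_line, w_steamer=0.67):
    """Weighted average of two projection lines (dicts of stat -> value)."""
    w_zips = 1.0 - w_steamer
    return {
        stat: w_steamer * steamer_line[stat] + w_zips * zips_line[stat]
        for stat in steamer_line
    }


def weighted_rate(rate, playing_time, baseline):
    """Assumed form of an 'n' stat: rate above a baseline, scaled by playing time."""
    return (rate - baseline) * playing_time


# Hypothetical stat lines for a single hitter.
steamer_line = {"HR": 30.0, "AVG": 0.290}
zips_line = {"HR": 24.0, "AVG": 0.302}
zeamer_line = blend(steamer_line, zips_line)

# Hypothetical baseline AVG of .270 and 600 PA of playing time.
n_avg = weighted_rate(zeamer_line["AVG"], 600, 0.270)
```

The point of the weighting is that a .300 average over 600 PA helps a fantasy team more than the same average over 300 PA, so the rate is converted into something that scales with playing time before it is compared with actual results.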
R2 values, hitters
Ranked R2 values, hitters
Starting with playing time, nearly every projection system struggled. Between injuries, lineup spots, and role changes, playing time is just plain difficult to peg. However, the hand-curated playing time over at Fangraphs clearly helped the Steamer and FGDepth systems achieve notably better playing time projections. As far as I know, these systems use the same basis for playing time, with slight adjustments made by Steamer that apparently lead to a noticeable improvement. Steamer wins for hitter playing time. I was able to eke out a slight edge for 'Zeamer' in playing time by using the Clay projection for the top-20 hitters. Clay appeared to have less of a playing time penalty for injury risks, and therefore did a better job of projecting Harper's 654 PA (Clay: 627, Steamer: 560), or EE's 624 PA (Clay: 587, Steamer: 548).
Homers and steals were the easiest offensive categories to project, with no system getting much of an edge on a per-PA basis. Steals are obviously assisted by the low/no-speed guys who are accurately projected for nearly no steals. Runs were very hard to project, with the Comp projections coming out significantly ahead of the others, while RBI were projected relatively well by all systems. Average is understandably difficult to project given the year-to-year fluctuation in that category, and all systems had R2 between .23 and .27. The net result (represented in the 'Total' column to the right of both the regular stats and per-PA stats) is evaluated as the average R2 value across the five categories, and is plotted below.
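The 'Total' column is simple enough to sketch in a few lines: average the per-category R2 values for each system, then sort. The system names and R2 numbers below are placeholders, not values from the tables in this study, and the function names are hypothetical.

```python
def total_r2(category_r2):
    """Average R2 across the scored categories for one system."""
    return sum(category_r2.values()) / len(category_r2)


def rank_systems(r2_by_system):
    """Return (system, total R2) pairs sorted from best to worst."""
    totals = {name: total_r2(cats) for name, cats in r2_by_system.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Placeholder numbers only, chosen to make the arithmetic easy to follow.
example = {
    "SystemA": {"R": 0.20, "HR": 0.40, "RBI": 0.30, "SB": 0.50, "AVG": 0.25},
    "SystemB": {"R": 0.10, "HR": 0.40, "RBI": 0.30, "SB": 0.45, "AVG": 0.25},
}
ranking = rank_systems(example)
```

Because this is an unweighted mean, every category counts equally toward the total, which matches how a standard roto league scores the five categories.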
It's good to see that some of the common fantasy baseball adages are true - in the case of hitters, the composite projection beats all the others on a per-PA basis. However, it's not by much, and the relative simplicity of using the Steamer/ZiPS combo is attractive. Zeamer achieves a slightly better R2 than FGDepth, but it's unsurprising that they're similar since FGDepth is just a 50-50 Steamer/ZiPS combo (vs. Zeamer's 67-33). When looking at the total stats in yellow, you'll notice that ZiPS does worse, essentially because Dan's system unapologetically puts little effort into projecting playing time accurately. Zeamer comes away with the best overall results, but for the relatively meager gain in accuracy, it seems you might as well simply use Steamer.
Verdict: Zeamer wins out, a 67-33 combo of Steamer-ZiPS, with Clay playing time for the top-20 and Steamer playing time for the rest. But if you're in a hurry, the straight FGDepth projections will be 99% as good.
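The playing-time part of that verdict can be sketched as follows: take playing time from one source for the top-ranked players and from another source for everyone else, then multiply per-PA rates by the chosen PA. The Harper and EE figures are the Clay and Steamer PA projections quoted earlier; "Hitter C", the ranking order, and the function names are hypothetical, and the cutoff is set to 2 here only so the tiny example exercises both branches.

```python
def hybrid_playing_time(ranked_players, pt_top, pt_rest, cutoff=20):
    """Use pt_top for the first `cutoff` players in the ranking, pt_rest for the others."""
    hybrid = {}
    for rank, player in enumerate(ranked_players):
        source = pt_top if rank < cutoff else pt_rest
        hybrid[player] = source[player]
    return hybrid


def counting_stat(rate_per_pa, pa):
    """Convert a per-PA rate into a season total for the given playing time."""
    return rate_per_pa * pa


clay_pa = {"Harper": 627, "EE": 587, "Hitter C": 600}     # "Hitter C" is made up
steamer_pa = {"Harper": 560, "EE": 548, "Hitter C": 575}  # "Hitter C" is made up
ranking = ["Harper", "EE", "Hitter C"]                    # assumed consensus order

pa = hybrid_playing_time(ranking, clay_pa, steamer_pa, cutoff=2)
# Hypothetical blended rate of 0.05 HR per PA applied to the hybrid playing time.
harper_hr = counting_stat(0.05, pa["Harper"])
```

With the default `cutoff=20`, the same function reproduces the "Clay for the top-20, Steamer for the rest" rule, provided the ranking list is sorted by whatever consensus ordering you trust.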
Mythbusting: One of the most common armchair analysis comments I see is 'Use ZiPS for the top-20', due to a perception that Steamer does less well projecting extremes or somehow over-regresses guys. Looking only at the top-50, playing time actually becomes nearly impossible to predict, and Steamer rises up to be the best predictor of per-PA stats. Narrowing down to the top-20, Steamer pulls further ahead! So no, myth *busted*, do not use ZiPS for the upper tier of hitters.
R2 values, top-50 hitters
Because SPs and RPs are projected so differently, I'll analyze them completely separately. Here we have the analysis of the top-90 projected SPs in 2015. As with the hitter projections, weighted rate stats will be indicated by an 'n' (e.g. "nERA"). I'll also include a custom mix, in this case called 'Steamay' for a combination of Steamer rate-stats and Clay's playing time.
R2 values, starting pitchers
Ranked R2 values, starting pitchers
Projecting playing time for pitchers is even more of a struggle than it is for hitters, understandably. Somehow, Clay Davenport's playing time projections blew the others out of the water. The edge appears to come from projecting a) fewer innings in general for non-elite pitchers, and b) less of a spread in innings between mid-range pitchers. Again I found a slight edge for Steamay by using Steamer for the top-10 SPs, and Clay for the rest.
Steamer really was the unquestioned champion of pitcher projections. Strikeouts were relatively easy for all systems to project, and wins were very hard, but Steamer managed to beat the others by significant margins in wins, ERA, and WHIP. Again, the net results are plotted below:
This time the composite projections fall short, and I was surprised to see that Steamer really beat each of the others by a fairly wide margin. Various combos of Steamer/ZiPS were attempted, but in this case I found basically no advantage to be gained from including ZiPS in the projection. I have to declare Steamay the champion, since the advantage gained from using Clay's playing time is significant, but the Steamer rate-stats should certainly be the starting point for any projection you come up with.
Verdict: Steamay is the champion by a wide margin, using Steamer rate-stats, with Steamer playing time for the top-10 and Clay playing time for the rest.
Mythbusting, pt 2: Again - no, ZiPS was not better at projecting the top-20, or top-40. The rate stats of the top-40 were generally a bit harder to project, with Steamer only barely beating out the Fans projection. However, playing time was easier to project, and so Steamer remained in the top spot by a good margin (Steamay excepted).
R2 values, top-40 starting pitchers
Onward to relievers, here we have the analysis of the top-30 projected RPs in 2015.
R2 values, relief pitchers
Ranked R2 values, relief pitchers
Clay's playing time advantage disappears for relievers. Interestingly enough, Steamer's "eff it" projection of just giving 65 IP to every closer has the same R2 as Clay, who actually tried to predict different numbers for every RP. This time there's no edge to be gained from mixing systems, so far as I can tell.
Steamer remains the champion of rate-stats here, but it's not a huge victory. Strikeouts were even easier to project than they were for starters, but W, ERA, and WHIP were basically completely impossible to predict. The basic conclusion would seem to be that you should pay for K's when drafting relievers, and nothing else (of course, Saves are also a good thing). Again, the net results are plotted below:
Verdict: Stick with Steamay or Steamer, but it basically won't make much difference which system you use. The most important thing might be finding a good source for Saves projections.