Introducing CHL statistics

It’s hard to do any kind of stat-based analysis of CHL players, including NHL draft prospects, since the only official statistics available are goals, assists, points, and penalty minutes. However, the three leagues do provide raw game-by-game data which means we can calculate more stats and make some assumptions to estimate more still. And that’s what I’ve done: Extra Skater now has statistics for CHL players for 2013-14. Check out the stats here or read on for details.

Stats available

Update 2014-06-23: I’ve removed all-situations on-ice stats because non-EV on-ice information is recorded too inconsistently across leagues to be useful.

2014 NHL draft prospects

Choose ‘17’ from the ‘age’ drop-down filter to show players eligible for the 2014 NHL Entry Draft. (Draft eligibility is a little more complicated than that, but all the top CHL prospects are here.)

Interesting reports

Estimating time on ice

The CHL doesn’t track time on ice, so to use this metric we have to estimate it from the data the leagues do make available. Luckily, many people have done this sort of thing, including Iain Fyffe, Scott Reynolds, Eric T., and Rhys J/Josh Weissbock. The underlying assumption is that a player’s share of his team’s ice time is about the same as the share of goals he’s on the ice for. In other words, if a player is on the ice for 40% of the goals scored by or against his team, we assume he was on the ice for about 40% of the minutes (i.e. 24 minutes/60). This turns out to work fairly well.
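The underlying assumption can be written as a one-line calculation. The sketch below only illustrates that proportionality; the function and parameter names are made up for illustration, this is not the code that runs on the site, and it ignores strength and penalties entirely.

```python
def estimate_toi_per_game(player_goal_events, team_goal_events, game_minutes=60.0):
    """Estimate a skater's minutes per game from his share of on-ice goals.

    player_goal_events: goals for + against with the player on the ice
    team_goal_events:   all goals for + against involving the player's team
    (Both names are hypothetical, chosen for this example.)
    """
    if team_goal_events <= 0:
        raise ValueError("need at least one team goal event to form a share")
    # Share of goals is assumed to equal share of ice time.
    return game_minutes * player_goal_events / team_goal_events


# The example from the text: on the ice for 40% of goals, so about 24 minutes.
print(estimate_toi_per_game(40, 100))
```

The guard against zero team goals matters mostly for small samples, where a share cannot be formed at all.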

I’ve refined this approach by separating the strengths (even strength, power play, and shorthanded), tracking how much time each team plays at each strength, and adjusting for time spent serving penalties. It’s by no means a perfect approach but it produces reasonable estimates (see the next section).
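As a rough illustration of the per-strength idea, the sketch below applies the goal-share assumption separately to each strength and scales by how many minutes the team plays at that strength. The `StrengthLine` structure, the field names, and the sample numbers are all hypothetical, and the adjustment for time spent serving penalties is deliberately left out because its exact form isn't described here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StrengthLine:
    """Hypothetical per-strength inputs for one player (illustrative only)."""
    player_on_ice_goals: int   # goals for + against with the player on the ice
    team_goals: int            # all team goals for + against at this strength
    team_minutes: float        # team minutes per game played at this strength


def estimate_toi_by_strength(lines: Dict[str, StrengthLine]) -> Dict[str, float]:
    """Apply the goal-share assumption one strength at a time."""
    estimates = {}
    for strength, line in lines.items():
        if line.team_goals <= 0:
            # No goals at this strength means no share can be formed; skip it.
            continue
        estimates[strength] = (
            line.team_minutes * line.player_on_ice_goals / line.team_goals
        )
    return estimates


# Made-up numbers, not real CHL data.
example = {
    "ev": StrengthLine(player_on_ice_goals=30, team_goals=100, team_minutes=48.0),
    "pp": StrengthLine(player_on_ice_goals=10, team_goals=20, team_minutes=6.0),
    "sh": StrengthLine(player_on_ice_goals=2, team_goals=8, team_minutes=6.0),
}
by_strength = estimate_toi_by_strength(example)
print(by_strength, sum(by_strength.values()))
```

Splitting by strength keeps a power play specialist's goal share from being spread across even-strength minutes he never played.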

One important caveat about estimating time on ice for special teams: in the WHL, no arena records which players are on the ice for power play goals, and only 14 OHL arenas do (all QMJHL arenas do). In the cases where we don’t have on-ice skater information, I’ve estimated time on ice based on players involved in goals as a scorer/assister. This method is very rough, especially for estimating shorthanded time on ice since there are so few shorthanded goals, and likely assigns more ice time than warranted to players involved in goals rather than just on the ice for them.
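To show what that fallback looks like in practice, here is a minimal sketch that credits a goal to everyone on the ice when skaters were recorded, and only to the scorer and assisters when they weren't. The `Goal` record and the function names are assumptions made for this example, not the actual data format the leagues publish.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Goal:
    """Hypothetical goal record; on_ice is None when skaters weren't recorded."""
    scorer: str
    assists: List[str] = field(default_factory=list)
    on_ice: Optional[List[str]] = None


def credited_players(goal: Goal) -> Set[str]:
    """Players who get credit for being 'on the ice' for this goal."""
    if goal.on_ice is not None:
        return set(goal.on_ice)
    # Fallback: only the scorer and assisters are known to have been out there.
    return {goal.scorer, *goal.assists}


def goal_share(player: str, goals: List[Goal]) -> float:
    """Fraction of the given goals the player is credited with."""
    if not goals:
        return 0.0
    credited = sum(1 for g in goals if player in credited_players(g))
    return credited / len(goals)


# Made-up power play goals: two without on-ice data, one with it.
pp_goals = [
    Goal("A", ["B", "C"]),
    Goal("B", ["D"]),
    Goal("E", [], on_ice=["A", "B", "E", "F", "G"]),
]
print(goal_share("B", pp_goals), goal_share("F", pp_goals))
```

In this toy data, player F only gets credit for the one goal with recorded skaters, which is the bias described above: without on-ice data, credit goes only to players who got a point.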

Accuracy of estimated metrics

To check the accuracy of time on ice estimates I used the same estimation method on NHL data. I chose 25 skaters with a range of ice time, from Colton Orr to Ryan Suter, and estimated their ice time using only data that would be available in the CHL (the QMJHL, to be specific, with all goals having on-ice players recorded). The results:

[Graph of NHL TOI estimates]

Overall, not too bad. If the estimates were 100% perfect, all data points would lie on the black line where estimated TOI = actual TOI. While this method is certainly not perfect, it does a pretty good job ranking players by TOI, if not getting their TOI exactly correct. What is somewhat concerning is that TOI is underestimated for players with less than 15-17 minutes actual TOI/60 and overestimated for players above that mark. The effect is clearer in this graph of percentage error in estimates:

[Graph of errors in NHL player TOI estimates]

Intuitively, it makes sense for this to be the case. Good players, those who play more, generally have a higher on-ice goals rate (for + against), meaning that if they’re on the ice for 40% of their team’s goals for/against (est. 24 mins/60), their share of ice time is probably more like 33-35% (actual 20-22 mins/60). We could try to account for this effect by applying a curve to estimates but this isn’t something I’ve explored.
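For the 40% example in the previous paragraph, the size of the overestimate can be worked out directly. The snippet below assumes percentage error means (estimated - actual) / actual, which may not be exactly how the graph was computed, and the function name is just for illustration.

```python
def percent_error(estimated, actual):
    """Signed error of an estimate, as a percentage of the actual value."""
    if actual <= 0:
        raise ValueError("actual TOI must be positive")
    return 100.0 * (estimated - actual) / actual


# Estimated 24 minutes versus an actual 20 to 22 minutes, as in the text.
estimated = 24.0
for actual in (20.0, 22.0):
    print(f"actual {actual:.0f} min: {percent_error(estimated, actual):+.1f}%")
```

So a high-minute player estimated at 24 minutes would be overestimated by somewhere around 9% to 20% under these numbers.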

To check the accuracy of the estimated QoC/QoT metrics, we could do a similar study to see how NHL players’ actual and estimated numbers differ, but I haven’t done this yet. When Eric T. did something similar, the results lined up well, particularly for defencemen. I’ve also run these estimates by a few people who watch the CHL closely, and they’ve said the numbers are generally in line with what they see and/or measure, again especially for defencemen.

To summarize: players’ ranks in TOI are probably fairly accurate, but high-TOI players are likely overestimated by up to a few minutes and low-TOI players likely underestimated by a similar margin. QoC and QoT are rough metrics by definition but generally reflective of reality and perhaps more accurate for defencemen than for forwards.

Acknowledgements and feedback

I’d like to thank those who took a look at this project as it developed and gave invaluable input and feedback on the data, estimates, functionality, presentation, and more.

If you have feedback or ideas about Extra Skater’s new CHL stats, I’d love to hear them. Leave a comment below or contact me.

2 thoughts on “Introducing CHL statistics”

  1. “…if a player is on the ice for 40% of the goals scored by or against his team, we assume he was on the ice for about 40% of the minutes (i.e. 24 minutes/60)”

    Isn’t this a crude assumption? How did you check the accuracy of this? Are you also assuming the converse is true: if he was on the ice 40% of the minutes, he was on the ice for 40% of goals scored/against…

    Thank you.

    p.s. I’m a big fan of your extraskater. I’m a recent Mathematics grad from CMU who loves hockey.
