Thursday, January 2, 2014

#PS2 - Fun with Fenwick, of-Corsi-did

In this second instalment of Hardly a Stat Holiday for the Patrik Stefans, I felt I had no choice but to examine what has emerged as the preeminent statistic in advanced hockey analysis: Fenwick. Named after the Calgary Flames blogger Matt Fenwick, the statistic looks to improve on its predecessor, Corsi. For those who are not sufficiently acquainted by now, you should be, so here's the simplest explanation. Corsi initially arose as a team differential: the sum of a team's own shots on goal, shots missed, and shots that were blocked, minus the same sum for its opponent. Let's take this fun example from a game between the Toronto Maple Leafs and the Minnesota Wild on Oct 15, 2013. The Event Summary from the game gives each team's totals for shots on goal, shots missed, and shots that were blocked (not to be confused with the shots that its own players blocked), and the Corsi differentials can be easily calculated accordingly:

| | TOR | MIN |
|---|---|---|
| Shots on Goal | 14 | 37 |
| Shots Missed | 11 | 14 |
| Shots Blocked | 5 | 17 |
| Corsi | -38 | 38 |
| Fenwick | -26 | 26 |

Corsi has come to be better expressed as a percentage which takes one team's total "shot events" and divides it by the grand total of shot events for both teams combined.  In the above example, Toronto's "Corsi For Percentage" (CF%) is (14+11+5)/(14+11+5+37+14+17) = 30/98 = 30.61%.  Conversely, Minnesota's CF% is 69.39%.  If we look at this season's NHL rankings by CF%, you'll see just how abysmally low that above figure is for the Leafs from that isolated game. Not so coincidentally, the Leafs are bottom feeding at 44% at the rough-halfway mark of this season.   
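For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation using the Event Summary totals from the game above. The function and variable names are my own placeholders, not anything from an official stats package, and the `include_blocked` switch simply reproduces the Fenwick row of the table by leaving blocked shots out.

```python
def shot_attempts(on_goal, missed, blocked, include_blocked=True):
    """Total shot events for one team.

    Corsi counts shots on goal, missed shots, and blocked shots.
    Fenwick (include_blocked=False) leaves the blocked shots out.
    """
    return on_goal + missed + (blocked if include_blocked else 0)


def for_percentage(team_events, opponent_events):
    """One team's share of all shot events in the game, as a percentage."""
    return 100 * team_events / (team_events + opponent_events)


# Event Summary totals from the TOR-MIN game on Oct 15, 2013
tor = {"on_goal": 14, "missed": 11, "blocked": 5}
mnw = {"on_goal": 37, "missed": 14, "blocked": 17}

tor_corsi = shot_attempts(**tor)                             # 30
mnw_corsi = shot_attempts(**mnw)                             # 68
tor_fenwick = shot_attempts(**tor, include_blocked=False)    # 25
mnw_fenwick = shot_attempts(**mnw, include_blocked=False)    # 51

print("Corsi differential (TOR):  ", tor_corsi - mnw_corsi)       # -38
print("Fenwick differential (TOR):", tor_fenwick - mnw_fenwick)   # -26
print("TOR CF%:", round(for_percentage(tor_corsi, mnw_corsi), 2))      # 30.61
print("TOR FF%:", round(for_percentage(tor_fenwick, mnw_fenwick), 2))  # 32.89
```

The same `for_percentage` helper applies to season totals: plug in a team's shot events for and against and you get its CF% or FF%.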



| # | Team | FF | FA | FF/60 | FA/60 | FF% | CF | CA | CF/60 | CA/60 | CF% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Los Angeles | 1639 | 1306 | 43.1 | 34.3 | 55.6 | 2175 | 1686 | 57.2 | 44.3 | 56.3 |
| 2 | New Jersey | 1417 | 1129 | 37.8 | 30.1 | 55.7 | 1897 | 1497 | 50.6 | 39.9 | 55.9 |
| 3 | Boston | 1751 | 1489 | 45.2 | 38.4 | 54.0 | 2391 | 2005 | 61.7 | 51.8 | 54.4 |
| 4 | Chicago | 1565 | 1286 | 40.8 | 33.5 | 54.9 | 2103 | 1783 | 54.8 | 46.5 | 54.1 |
| 5 | Ottawa | 1683 | 1509 | 45.0 | 40.3 | 52.7 | 2305 | 1994 | 61.6 | 53.3 | 53.6 |
| 6 | Detroit | 1563 | 1368 | 41.6 | 36.4 | 53.3 | 2069 | 1793 | 55.1 | 47.8 | 53.6 |
| 7 | Montreal | 1534 | 1332 | 42.1 | 36.5 | 53.5 | 2114 | 1886 | 58.0 | 51.7 | 52.8 |
| 8 | St. Louis | 1455 | 1267 | 38.5 | 33.5 | 53.4 | 1949 | 1784 | 51.5 | 47.2 | 52.2 |
| 9 | NY Rangers | 1674 | 1456 | 43.1 | 37.5 | 53.5 | 2253 | 2081 | 58.0 | 53.6 | 52.0 |
| 10 | Carolina | 1712 | 1671 | 45.3 | 44.2 | 50.6 | 2336 | 2190 | 61.8 | 57.9 | 51.6 |
| 11 | Vancouver | 1457 | 1438 | 38.7 | 38.2 | 50.3 | 1994 | 1879 | 53.0 | 49.9 | 51.5 |
| 12 | San Jose | 1646 | 1520 | 42.8 | 39.5 | 52.0 | 2258 | 2144 | 58.7 | 55.8 | 51.3 |
| 13 | Phoenix | 1646 | 1597 | 43.6 | 42.3 | 50.8 | 2173 | 2117 | 57.6 | 56.1 | 50.6 |
| 14 | NY Islanders | 1591 | 1539 | 40.6 | 39.3 | 50.8 | 2172 | 2174 | 55.4 | 55.5 | 50.0 |
| 15 | Winnipeg | 1597 | 1644 | 40.7 | 41.9 | 49.3 | 2195 | 2241 | 55.9 | 57.1 | 49.5 |
| 16 | Minnesota | 1400 | 1437 | 37.0 | 38.0 | 49.4 | 1883 | 1925 | 49.8 | 50.9 | 49.5 |
| 17 | Florida | 1529 | 1628 | 39.2 | 41.8 | 48.4 | 2049 | 2121 | 52.6 | 54.4 | 49.1 |
| 18 | Pittsburgh | 1528 | 1533 | 40.5 | 40.6 | 49.9 | 2040 | 2120 | 54.1 | 56.2 | 49.0 |
| 19 | Colorado | 1526 | 1580 | 40.0 | 41.5 | 49.1 | 2073 | 2170 | 54.4 | 56.9 | 48.9 |
| 20 | Dallas | 1409 | 1579 | 37.8 | 42.4 | 47.2 | 1990 | 2084 | 53.5 | 56.0 | 48.9 |
| 21 | Washington | 1479 | 1563 | 38.8 | 41.0 | 48.6 | 2043 | 2150 | 53.5 | 56.3 | 48.7 |
| 22 | Anaheim | 1444 | 1530 | 37.3 | 39.6 | 48.5 | 1970 | 2140 | 51.0 | 55.3 | 47.9 |
| 23 | Philadelphia | 1417 | 1547 | 38.6 | 42.2 | 47.8 | 1943 | 2140 | 52.9 | 58.3 | 47.6 |
| 24 | Calgary | 1448 | 1599 | 37.7 | 41.6 | 47.5 | 1951 | 2164 | 50.8 | 56.3 | 47.4 |
| 25 | Tampa Bay | 1451 | 1625 | 38.2 | 42.7 | 47.2 | 1921 | 2151 | 50.5 | 56.6 | 47.2 |
| 26 | Columbus | 1287 | 1530 | 33.6 | 40.0 | 45.7 | 1811 | 2031 | 47.4 | 53.1 | 47.1 |
| 27 | Nashville | 1344 | 1491 | 34.6 | 38.4 | 47.4 | 1800 | 2053 | 46.3 | 52.8 | 46.7 |
| 28 | Buffalo | 1348 | 1663 | 36.2 | 44.6 | 44.8 | 1811 | 2203 | 48.6 | 59.1 | 45.1 |
| 29 | Edmonton | 1379 | 1680 | 36.9 | 44.9 | 45.1 | 1816 | 2262 | 48.6 | 60.5 | 44.5 |
| 30 | Toronto | 1396 | 1779 | 36.9 | 47.0 | 44.0 | 1928 | 2445 | 50.9 | 64.6 | 44.1 |


What is Corsi essentially trying to measure?

Corsi has commonly been described as a proxy for scoring chances. That's it. It's a metric which carries information about discrete scoring opportunities created by a team against scoring opportunities it gives up to its opponents. We should be careful not to conflate the metric with scoring chances, as that is a statistic which is subjectively recorded separately. In any case, the proxy for scoring chances is ultimately supposed to give an objective understanding of how proficient a team is at scoring goals vs. giving up goals, by way of generating scoring opportunities vs. giving up scoring opportunities; boiled down to its most extreme claim, it is a predictor of winning games.

The aforementioned blogger Matt Fenwick was keen to point out, as hopefully most of you have come close to concluding likewise, that a shot from the point which gets blocked should hardly count towards a proxy value for scoring chances. Moreover, a shot blocked by one's own team may in fact be more representative of some defensive skill which presumably contributes to the team's success. Why on earth should it be factored into inflating the opponent's Corsi? Accordingly, Mr. Fenwick modified the metric by simply getting rid of the blocked shot category.


Problem Solved?

If anybody is convinced that this newer Fenwick metric is some magic window into success in the National Hockey League, reconsider that conclusion. I know as well as the next stat hack not to use a single counterexample to disprove a general relationship, but I can't help but point out that the Maple Leafs actually won the fucking game above against the Wild 4-1, a game in which the respective Fenwick For Percentages (FF%s) were 32.89% and 67.11%! To strengthen the case against Fenwick with more conviction, I can offer the correlation coefficient for team points vs. CF% so far this season: a tepid .4511, meaning that only a modest relationship exists. Surely Brian Burke, a self-proclaimed advanced-stat hater, would be pleased to know this, and that truculence and pugnacity still play an immeasurably vital role in winning.


Individual Fenwick

The Fenwick statistic has been converted, in the same way that Corsi has, to measure the individual player. It's very simple. "iFenwick" simply gives a player's total shots on goal plus shots missed. That's all it is. Again, it is commonly viewed as a proxy for a player's ability to generate scoring chances. But again, and as I'm hopeful most of you will have already realized, it does not credit a player for making a great outlet, a clever drop pass, providing a screen in front, making a hit to free up a loose puck in the offensive zone, etc. All of those latter events often translate into points, yet none are captured by Fenwick.


Fenwick and the KL

In any case, you may believe that it is as good a metric as any out there for predicting player points, even if it misses those key goal-generating plays, and despite the underwhelming correlation between its older cousin, team Fenwick, and team points. Before writing this post and performing the analysis below, I predicted that iFenwick would in fact be an even worse predictor of player points than team Fenwick is of team points. That is because it carries the very same pitfalls as team Fenwick and also fails to capture any information about assist-related events. It turns out that my hunch was correct. Below is a table which shows KL teams in decreasing order of their average iFenwick per 60 minutes as of the Xmas break. Avg iFenwick/60 is barely correlated with KL rank at all: the coefficient is -0.1152!! (Since rank 1 is the best, the negative sign means that higher-shooting rosters sit very slightly higher in the standings, but the relationship is negligible.)


| TEAM | KL RANK | Avg iFenwick/60 |
|---|---|---|
| Vanrooser Canicks | 15 | 10.04511 |
| Milan Micahleks | 2 | 9.66286 |
| Moilers | 8 | 9.61868 |
| Dicklas Lidstroms | 11 | 9.55842 |
| G-Phil's Flyers | 1 | 9.40037 |
| Fylanders | 3 | 9.37320 |
| Teeyotes | 7 | 9.20821 |
| Quebec Rordiques | 14 | 8.98260 |
| W-Benham/Scranton Parkers | 16 | 8.93172 |
| Patrik Stefans | 12 | 8.81368 |
| Winter Claassics | 6 | 8.72440 |
| Powder Rangers | 4 | 8.69511 |
| Mackhawks | 5 | 8.64050 |
| Joshfrey Krupuls | 10 | 8.53160 |
| Schizzarks | 13 | 8.13147 |
| Los Samjawors Kings | 9 | 7.72316 |
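If you'd like to reproduce the coefficient yourself, here is a rough Python sketch. It assumes the figure is a plain Pearson correlation between the KL RANK and Avg iFenwick/60 columns of the table above (the method isn't spelled out in this post, so treat that as my assumption), and the `pearson` helper is written out by hand rather than pulled from a stats library.

```python
from math import sqrt

# (team, KL rank, average iFenwick/60), copied from the table above
kl_table = [
    ("Vanrooser Canicks", 15, 10.04511),
    ("Milan Micahleks", 2, 9.66286),
    ("Moilers", 8, 9.61868),
    ("Dicklas Lidstroms", 11, 9.55842),
    ("G-Phil's Flyers", 1, 9.40037),
    ("Fylanders", 3, 9.37320),
    ("Teeyotes", 7, 9.20821),
    ("Quebec Rordiques", 14, 8.98260),
    ("W-Benham/Scranton Parkers", 16, 8.93172),
    ("Patrik Stefans", 12, 8.81368),
    ("Winter Claassics", 6, 8.72440),
    ("Powder Rangers", 4, 8.69511),
    ("Mackhawks", 5, 8.64050),
    ("Joshfrey Krupuls", 10, 8.53160),
    ("Schizzarks", 13, 8.13147),
    ("Los Samjawors Kings", 9, 7.72316),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


ranks = [rank for _, rank, _ in kl_table]
rates = [rate for _, _, rate in kl_table]

r = pearson(ranks, rates)
print(round(r, 4))  # should land on the -0.1152 quoted above
```

Keep in mind when reading the sign that a KL rank of 1 is the best, so a negative coefficient means higher iFenwick/60 goes with a (very slightly) better place in the standings. At this size, though, it's close to no relationship at all.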


Discussion and Shortcomings of my Own Methodology

Quickly, before this stat holiday is over in all parts of North America, I will draw attention to a few issues. First, I used iFenwick/60 instead of straight iFenwick to adjust for players that haven't played many games. I wanted to give each of you as accurate a value as possible to represent how much your players direct the puck toward the net when they're on the ice, and including, for example, Steven Stamkos' total iFenwick instead of his iFenwick/60 would have been misleading. That being said, the adjustment so that we are comparing apples with apples (okay, trying hard not to conflate the terms of analogy with hockey slang for assists) might go too far, because it's not particularly relevant to the KL how much a player shoots if he's not playing meaningful minutes. It might be indicative of some potential, especially if management and coaches are big Fenwickians and it leads to more ice-time, but that's it, and I would caution against getting too excited if you notice some high iFenwick/60 numbers for some bottom end guys. I almost picked up Colton Sceviour last week, and I'm glad I didn't.
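To make the per-60 adjustment and its pitfall concrete, here is a small Python sketch. The two players and all of their numbers are entirely made up for illustration, and the function names are my own; the only things taken from the post are the definitions (iFenwick is shots on goal plus shots missed, scaled to 60 minutes of ice time).

```python
def ifenwick(shots_on_goal, shots_missed):
    """Individual Fenwick: a player's unblocked shot attempts."""
    return shots_on_goal + shots_missed


def per_60(count, toi_minutes):
    """Scale a raw count to a rate per 60 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("time on ice must be positive")
    return 60 * count / toi_minutes


# Hypothetical everyday player: 60 shots on goal and 30 misses in 600 minutes
regular_rate = per_60(ifenwick(60, 30), 600)

# Hypothetical call-up: 6 shots on goal and 3 misses in only 60 minutes
callup_rate = per_60(ifenwick(6, 3), 60)

print(regular_rate, callup_rate)  # both 9.0
```

The two rates are identical even though one rests on ten times as much ice time, which is exactly why a flashy iFenwick/60 from a bottom-end guy deserves some skepticism.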

Another limitation of my analysis is that the numbers only reflect 5-on-5 play.  My suspicion is that this has the greatest impact on defensemen who take many of their shots on the PP.  Not much to say further to that, other than your true iFenwick/60 will differ slightly, and as a rough approximation, you might expect it to go up slightly if you have PP contributing defenders.  The thing is, most of us do anyway.

Lastly, and in addition to reiterating the fact that Fenwick is blind to playmaking, it is also blind to sharp-shooting. I haven't run the numbers, but I would guess that in addition to holding some good playmakers, the rosters above which have a lower iFenwick/60 rank than their overall KL Rank also likely have some really keen snipers. And by that, I mean they have some of the most accurate shooters in the game. Backes, Stamkos, Filppula, Monahan, Foligno, Grabovski, Nielson all fit that description and belong to "underperforming iFenwick/60" teams.

Take from the above what you want. Many of you already look at this stuff, and some more than others. To no surprise, the Michaleks sit squarely in second in both this statistic and the KL standings. However much stock GM Carmody puts into the Fenwick element, it seems to be paying dividends. For others like the Claassics or 'Jawors Kings, it is perhaps best they continue to presumably ignore such metrics.


It's back to real work for most all of us tomorrow, but I assure you I look forward to the next Stat Holiday.  NB: You may have noticed that I missed Christmas Day.  This post was initially earmarked for that day, but shortbread and eggnog took priority.  I have another analysis in the hopper that I will publish this weekend as somewhat of a Stat Holiday carry-over.  And after that, I will disappear until Good Friday and Easter Monday, when I'll hit you guys with two more.  That weekend beautifully follows the conclusion of the regular season, so there will be a lot to look back on I'm sure.  Thanks for reading... I know this is fun for at least one of us.

4 comments:

  1. Is your low correlation coefficient between team points and CorsiFor% based on this year only or previous years as well?

    I think Fenwick Close (Fenwick measured in 5-on-5 situations in a one-goal game in the first or second, or a tie game in the third) is the best measure we have of true possessive ability when it counts because Fenwick and Corsi get skewed in blowout situations. This is one of my favourite blog posts, with some great graphics on the predictive ability of Fenclose: http://www.habseyesontheprize.com/2013/4/4/4178716/why-possession-matters-a-visual-guide-to-fenwick

    It also says that since 2007, a team with a Fenclose above .500 has a 75% chance of making the playoffs.

  2. Micah: yeah, correlation coeff was just this YTD. Another issue I failed to mention regarding the non-existent or even slightly negative correlation between Fenwick and KL standings is that I used the iFenwick's for the entire rosters, not just scoring rosters.

    Moira, as long as you don't shoot the messenger...

  3. great stuff, thanks for posting. You are right to presume that I ignore advanced stats, but until now it was only because I didn't care to take the time to understand them.
