Friday, December 30, 2016

Halfway Correlations

We are nearing the halfway point of the season. With 6.5 seasons under our belts, we have a pretty good data set of past results to work with, so I thought it would be interesting to compare the halfway-point results of previous years with the final results to see what correlation, if any, existed.

My gut told me that the halfway point is likely a good indicator of final results. The effects of hot and cold players tend to have leveled out, and most teams have experienced the effects of injuries. That, compounded with the diminishing effect of trades or add/drops and the impending trade deadline, means what we have at this stage tends to be close to what we are stuck with. Or so I believed -- so I decided to do an analysis.

To do this, I took the ranking of our teams at the approximate halfway point of each of the past seasons (generally speaking, the 1st week of January, with the exception of the lockout-shortened season, for which I used March 10, 2013). I then compared those rankings with the final season rankings.

With two data sets, we are able to do a statistical analysis to determine the coefficient of correlation, or r value. The r value is a number which indicates the strength and direction of correlation, on a scale from -1.0 to +1.0. The following scale helps interpret the results.

  • Exactly –1. A perfect negative relationship (in other words, the rankings are perfectly reversed)
  • –0.70. A strong negative relationship
  • –0.50. A moderate negative relationship
  • –0.30. A weak negative relationship
  • 0. No linear relationship (in other words, the ranking at the end of the season is effectively randomized in comparison to the halfway point).
  • +0.30. A weak positive relationship
  • +0.50. A moderate positive relationship
  • +0.70. A strong positive relationship
  • Exactly +1. A perfect positive relationship (in other words, the ranking at the halfway point is identical to the ranking at the end of the year).
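For anyone who wants to reproduce this kind of calculation, here is a minimal Python sketch. It assumes the standard Pearson formula applied directly to the two lists of rankings (with no ties, that is the same as Spearman's rank correlation); the post does not say which tool or formula was actually used. The six-team rankings are made up for illustration and are not Keeper League data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical 6-team league (made-up ranks, not the Keeper League's).
halfway = [1, 2, 3, 4, 5, 6]
final = [2, 1, 3, 4, 6, 5]  # two pairs of neighbours swap places

r_same = pearson_r(halfway, halfway)            # identical order
r_reversed = pearson_r(halfway, halfway[::-1])  # perfectly reversed order
r_example = pearson_r(halfway, final)
print(round(r_same, 3), round(r_reversed, 3), round(r_example, 3))
# prints: 1.0 -1.0 0.886
```

The first two calls reproduce the endpoints of the scale above; the third shows that a couple of one-spot swaps still leave r close to +1.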

Overall Results
This is the tl;dr section. Over the entirety of the Keeper League, the r value between the halfway ranking and our final positions is 0.862, indicating a very strong positive relationship. It is even stronger if we exclude our first season (and I would argue that may be appropriate, since it was in effect a single-season league without the moderating effect of keepers). The r excluding that season is 0.892.

Of the 94 total results, 75 teams have moved down 1 or 2 positions, stayed the same, or moved up 1 or 2 positions. There are, of course, outliers. But they are rare. Of the 94 sets of data, teams moved up three or more positions only nine times in the history of the league. Similarly, teams have only moved down three or more positions 10 times.

In other words, the vast majority (80%) of teams finish very close to (within 2 positions of) their position at the halfway point. The distribution is roughly bell-shaped, with many of these teams not changing position at all.
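The arithmetic behind that 80% figure can be checked in a few lines of Python. The totals are the ones quoted above; the helper `bucket_moves` is a hypothetical name introduced here to show how individual results could be sorted into the same three groups, and the four sample moves are ones mentioned elsewhere in this post.

```python
from collections import Counter

def bucket_moves(pairs):
    """Count (halfway, final) results by size of move.

    A positive change means the team moved up the standings
    (for example, 8th to 1st is +7).
    """
    counts = Counter()
    for halfway, final in pairs:
        change = halfway - final
        if change >= 3:
            counts["up 3+"] += 1
        elif change <= -3:
            counts["down 3+"] += 1
        else:
            counts["within 2"] += 1
    return counts

# Totals quoted in the post.
total_results = 94
up_3_plus = 9
down_3_plus = 10

big_moves = up_3_plus + down_3_plus      # 19 moves of three or more positions
within_two = total_results - big_moves   # 75 results within two positions
share_within_two = within_two / total_results
print(f"{within_two} of {total_results} = {share_within_two:.1%}")
# prints: 75 of 94 = 79.8%

# Four individual moves mentioned in the post, as (halfway, final) pairs.
examples = [(8, 1), (1, 10), (9, 3), (3, 6)]
example_counts = bucket_moves(examples)
```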

Season-By-Season Overview

2010-11

Overall r value of 0.662.

I suspected that tanking might have a stabilizing effect (i.e., the decision to tank would be made earlier in the season, plus tanking is easier than picking up new points). So I did an analysis of the top 8 finishers as well, returning an r of 0.402.

This season featured the two biggest moves in league history. The then-Victoria Krugars moved from 8th to 1st place over the course of the season, while the Milan Micahleks entered the second half of the season in 1st place but collapsed to 10th.

Those two moves made this first season our weakest in terms of correlation, but excluding those two remarkable outliers, the r value becomes very close to 1.00.

2011-12

Overall r value of 0.816
Top 8 r value of 0.651

Again, strong correlation amongst all teams, with a slightly weaker (but still strong) correlation amongst the top 8. The biggest contributor to the weaker top 8 correlation was the Moilers, who moved from 11th to 6th place.

2012-13

Overall r value of 0.948
Top 8 r value of 0.845

Our most consistent season ever, with not a single team moving more than 2 positions. Keep in mind that it was a lockout-shortened season, so there were 24-ish games to move position, rather than 41.

2013-14

Overall r value of 0.918
Top 8 r value of 0.807

Another remarkably consistent season. Only the Mackhawks (+4), Los Amjawors Kings (-4) and Quebec Rordiques (-3) moved three or more positions - the rest of us moved very little. Of note, the top 3 at the halfway point were the top 3 at the end of the season. Had the Mackhawks stayed in 8th place, the top 8 would have finished exactly where they were at the halfway point.

2014-15

Overall r value of 0.918
Top 8 r value of 0.661

Interestingly, the overall r value of this season was identical to that of the season before. That's a remarkable coincidence, but it also highlights how consistent the data sets are.

2015-16

Overall r value of 0.864
Top 8 r value of 0.412

If there's to be any hope for those outside a money or playoff position, this season is it. Overall, the correlation remains strong, but in the top 8, the Los Amjawors Kings dropped from 3rd to 6th and the Fylanders leapt from 9th to 3rd (the second greatest increase in KL history).

This season also featured the greatest number of teams moving 3 or more positions, with five (as noted above, teams have moved up or down three or more positions only 19 times since inception).

That said, like most years, the majority of GMs finished within 2 positions of their mid-season markers.

Other Observations

While movement up and down anywhere in the ranking is important for draft position, some moves are more important than others: in particular, moves from outside a playoff spot into one, or moves into a money position.

Unfortunately, the news here is not terribly good either. 

Only 6 teams have ever moved into a money spot from a position outside it at the halfway point (out of a total of 24 top 4 finishes):
  • 2015-16 Fylanders (9th to 3rd)
  • 2014-15 Patrik Stefans (5th to 2nd)
  • 2013-14 Mackhawks (8th to 4th)
  • 2012-13 Moilers (5th to 3rd)
  • 2010-11 Joshfrey Krupuls (8th to 1st)
  • 2010-11 Patrik Stefans (5th to 4th)
You will note that four of these moves are of three or more positions, and an upward move of that size, as we've noted, has only ever happened nine times. Two of them (the Fylanders, with 6 positions, and the Krupuls, with 7 positions) are the two biggest jumps in the history of the league. To put it bluntly, if you are outside the money at the halfway point, history suggests that you will need a truly exceptional second half to break in.

The story is similar for moves into a playoff position. Since the introduction of the playoff format, we have had 32 GMs finish in the top 8. Of those, only four found their way into playoff contention over the last 41 games:
  • 2015-16 Fylanders (9th to 3rd)
  • 2014-15 Winter Claassics (10th to 8th)
  • 2013-14 Joshfrey Krupuls (9th to 8th)
  • 2012-13 Mackhawks (10th to 7th)
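Both lists apply the same test: outside the cutoff at the halfway point, inside it at the end. A short Python sketch of that test follows, using only the moves listed above. The `Move` type and the `crossed_into` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple

class Move(NamedTuple):
    season: str
    team: str
    halfway: int
    final: int

def crossed_into(move: Move, cutoff: int) -> bool:
    """True if the team was outside the top `cutoff` at halfway but finished inside it."""
    return move.halfway > cutoff and move.final <= cutoff

# Moves listed in the post.
money_moves = [
    Move("2015-16", "Fylanders", 9, 3),
    Move("2014-15", "Patrik Stefans", 5, 2),
    Move("2013-14", "Mackhawks", 8, 4),
    Move("2012-13", "Moilers", 5, 3),
    Move("2010-11", "Joshfrey Krupuls", 8, 1),
    Move("2010-11", "Patrik Stefans", 5, 4),
]
playoff_moves = [
    Move("2015-16", "Fylanders", 9, 3),
    Move("2014-15", "Winter Claassics", 10, 8),
    Move("2013-14", "Joshfrey Krupuls", 9, 8),
    Move("2012-13", "Mackhawks", 10, 7),
]

MONEY_CUTOFF = 4    # top 4 finish in the money
PLAYOFF_CUTOFF = 8  # top 8 make the playoffs

into_money = [m for m in money_moves if crossed_into(m, MONEY_CUTOFF)]
into_playoffs = [m for m in playoff_moves if crossed_into(m, PLAYOFF_CUTOFF)]
# Money-spot moves that were jumps of three or more positions.
big_money_jumps = [m for m in into_money if m.halfway - m.final >= 3]

print(len(into_money), len(into_playoffs), len(big_money_jumps))
# prints: 6 4 4
```

Given a full table of halfway and final rankings per season, the same `crossed_into` check could be run over every team rather than over a hand-picked list.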

GM Consistency

Not all GMs are created equal. Many of our GMs are remarkably consistent in terms of the correlation between mid- and end-points of the season. Leading the pack are the Schizzarks, with an individual r of 0.983. The Patrik Stefans are right behind at 0.982. These are shockingly consistent numbers -- recall that 1.00 is perfect correlation.  

Other GMs with r values of greater than 0.9 are the G-Phil's Flyers, Los Amjawors Kings, Teeyotes, Powder Rangers, Vanrooser Canicks, Moilers and the Quebec Rordiques. 

If you are looking at the present standings and want to make bets on final ranking, these teams may be your safest picks -- the GMs who have shown the greatest level of consistency across the final 41 games. This cuts both ways -- if these GMs need to move up a few positions, past performance suggests that isn't likely to happen without a serious deviation from past behaviour.

At the other end of the scale are the Milan Micahleks (0.044) and the Fylanders (-0.252). 

The Micahleks are actually remarkably consistent when you ignore their disastrous first-season collapse (from 1st to 10th). Other than that, they have finished within 2 positions of their mid-season ranking in each of the last 5 seasons, for an r value of over 0.9.

The Fylanders, on the other hand, are the only team with a negative correlation. In other words, their mid-season position tells you little about where they will finish; if anything, a better halfway ranking has tended to go with a worse final one.

That is perhaps unsurprising, given the sheer number of trades the Fylanders make over the course of the season. However, last year was the only year in which the Fylanders finished ahead of their mid-season ranking; in most other years they have lost ground.

The Patrik Stefans also tend to be extremely active on the trade market, but are the second most consistent team in the league. Interestingly, they have only ever lost position once, and it was only by 1 spot. The Stefans have either moved up or stayed the same in every other season.

The comparison is interesting, as it may give some insight into trading behaviour. The Fylanders perhaps make trades with greater upside (the 6 position jump) but strike out more often, while the Patrik Stefans may play it a little safer with their trades in exchange for more modest (but more predictably positive) bumps in position.

Conclusion

A lot can happen between now and the end of the season, but if you are a betting person, it is very likely that the top 8 you are looking at today will contain most of the same GMs at the end of the season. Generally speaking, we only lose 1 team out of the top 8 by year's end, and it is typically two bubble teams swapping position.

Four of the most consistent teams in the league sit in the top 4, and 6 of the top 8 have r values of > 0.9. With 80% of ranking moves being of 2 or fewer positions, you are likely looking at most of the names you will see at the end of the year, and in more or less the order they will appear.

There will likely be one or two outliers who move three or more positions up or down, but more significant moves (e.g., Fy's 6 positions last year, or JKru's 7 positions in the first year) are very, very rare.

Perhaps the most useful part of this exercise is that we can determine at the halfway point who should be a buyer and who should be a seller. We don't need to wait for the trade deadline to determine that.