Hockey and Geography: Analyzing where North American NHL players were born

Greg Feliu
May 15, 2021

Professional sports are a huge source of local pride for a lot of people. In some countries, simply wearing the wrong jersey at the wrong time and place is practically a death sentence. While North American professional sports are not quite that extreme, without a doubt there are millions of people who get immense local pride from watching their favorite team play.

Hockey fans (potentially) about to fight for their team.

The problem with this is that professional athletes rarely have quite the same local pride as their fans do. This, of course, is because they rarely have any connection to the home city for the team they play for — at least at first. Heck, they only even play their games there half of the time!

This realization got me thinking: which cities actually do have local players to choose from? If, in some far off scenario, athletes had to play for their local team in an Olympics-style tournament, which teams would succeed? Which would sit on the sidelines? Being a hockey fan, I wondered which teams would win gold in this fictional tournament.

While we’re on the subject, hockey is a very regional sport. Weather and culture play a huge role in determining which sport a child chooses to play at 5 or 6 years old. In Hawaii, for example, there is only one ice rink in the entire state (and by extension, only one rink within a 2,500-mile radius!). This is not surprising. What is surprising is that Arizona has 4 NHL players, 3 more than Alaska does! So while weather and culture play a huge part in determining where hockey takes hold, they sure aren’t everything…

Therefore, in order to answer this question, I analyzed where North American NHL players were born. I examined the highest-producing states/provinces, the “hockey clusters” of North America, which teams are the least and most likely to have local players, and, most importantly, which teams would stand the best chance of winning this intra-NHL Olympics-style tournament. I find that Toronto is by far the best hockey city in North America, in terms of how many NHL players were born in or near the city.

Data

To answer my question, I queried the NHL API to obtain data on each player in the NHL. I analyzed where the players were born and saved the North American players’ info in order to examine which players are local to an NHL team.

Once the city names, state/province and country were found for each player, I found the coordinates of each of these places. With this information, I was able to plot, analyze and play with the data.
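The filtering and coordinate-lookup steps might look something like the sketch below. The field names (`birthCity`, `birthStateProvince`, `birthCountry`), the sample records and the `COORDS` lookup table are all assumptions made for illustration. They stand in for the real API response and a geocoding service, and this is not the code from the project.

```python
NORTH_AMERICA = {"CAN", "USA"}

# Hypothetical player records; the field names are assumed for this sketch.
players = [
    {"name": "Player A", "birthCity": "Toronto",
     "birthStateProvince": "ON", "birthCountry": "CAN"},
    {"name": "Player B", "birthCity": "Saint Paul",
     "birthStateProvince": "MN", "birthCountry": "USA"},
    {"name": "Player C", "birthCity": "Stockholm",
     "birthStateProvince": None, "birthCountry": "SWE"},
]

# Stand-in for a geocoder: (city, state/province, country) -> approximate (lat, lon).
COORDS = {
    ("Toronto", "ON", "CAN"): (43.65, -79.38),
    ("Saint Paul", "MN", "USA"): (44.95, -93.09),
}


def north_american_with_coords(records, coords):
    """Keep North American players and attach birthplace coordinates."""
    located = []
    for rec in records:
        if rec["birthCountry"] not in NORTH_AMERICA:
            continue
        key = (rec["birthCity"], rec["birthStateProvince"], rec["birthCountry"])
        latlon = coords.get(key)
        if latlon is None:
            continue  # birthplace could not be geocoded, so skip it
        located.append({**rec, "lat": latlon[0], "lon": latlon[1]})
    return located


located_players = north_american_with_coords(players, COORDS)
```

Keying the lookup on city, state/province and country together avoids mixing up cities that share a name.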

Results

Firstly, I found that the average Canadian is much more likely to play in the NHL than the average American (full country breakdown here). Despite the United States having 8.8 times as many people as Canada, there are more Canadians in the NHL than Americans! The average Canadian is 13.7 times more likely to play in the NHL than the average American! It just goes to show how much more popular hockey is in Canada than in the U.S.
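To see how those two figures fit together, here is a small sketch of the arithmetic. The function name is my own for illustration; the only inputs are the 8.8 and 13.7 ratios quoted above, and the player ratio is simply what those two numbers imply.

```python
def relative_likelihood(players_a, pop_a, players_b, pop_b):
    """How many times more likely a person from A is to be in the league
    than a person from B, comparing players per inhabitant."""
    return (players_a / pop_a) / (players_b / pop_b)


# Ratios quoted in the article
population_ratio_us_to_ca = 8.8   # U.S. population / Canadian population
likelihood_ratio_ca_to_us = 13.7  # Canadian per-capita rate / U.S. per-capita rate

# likelihood ratio = (players_CA / players_US) * (pop_US / pop_CA),
# so the implied ratio of Canadian to American players is:
implied_player_ratio = likelihood_ratio_ca_to_us / population_ratio_us_to_ca
print(round(implied_player_ratio, 2))  # 1.56
```

In other words, these figures imply roughly 1.56 Canadian NHL players for every American one.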

All player birthplaces (blue dots) and NHL arena locations (red circles).

These figures don’t represent all parts of Canada and the U.S. equally, of course. After all, it’s not surprising that hockey is more popular in, say, Montreal than it is in Honolulu. Therefore, I analyzed the data at a more granular level: the state/province level. Here, I map the states/provinces with NHL players and the total number of players from each currently playing in the NHL.

Choropleth map of all U.S. states and Canadian provinces where an NHL player was born.

As you can see above, Ontario has a ton of NHL players. If we exclude Minnesota and Michigan, more NHL players were born in Ontario than in the rest of the United States combined! The total number of players doesn’t tell the whole story, however. Ontario is easily the most populous province in Canada. Therefore, it’s not surprising more players are from there. A better measure of how hockey crazy a state/province is would be the number of NHL players per capita. In the following chart, the size of each state/province is scaled by its number of NHL players per capita, relative to the most represented province (Saskatchewan).

Cartogram showing the number of NHL players per inhabitant for each state/province.
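The scaling behind a chart like this can be sketched in a few lines. The region names and numbers below are made-up placeholders, not the real counts or populations, and the function is only an illustration of the idea rather than the project's code.

```python
def relative_per_capita(players, population):
    """Players per inhabitant for each region, divided by the highest rate,
    so the top region scores 1.0 and every other region falls below it."""
    rates = {region: players[region] / population[region] for region in players}
    top_rate = max(rates.values())
    return {region: rate / top_rate for region, rate in rates.items()}


# Placeholder inputs for illustration only (NOT real data)
players = {"Region A": 30, "Region B": 120, "Region C": 10}
population = {"Region A": 1_000_000, "Region B": 12_000_000, "Region C": 5_000_000}

scale = relative_per_capita(players, population)
```

Here Region B has four times as many players as Region A, but twelve times the population, so it ends up at a third of Region A's size on the chart.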

Saskatchewan is sure punching way above its weight! (Prince Edward Island is similarly impressive, albeit much smaller.) The only Canadian province not represented is Newfoundland and Labrador, which has zero NHL players. Below, the number of players per capita is shown for the top 10 states/provinces. Nine of the top ten are located in Canada. Turns out Minnesota is more hockey crazy than certain Canadian provinces!

Scatterplot of the NHL players per capita in U.S. states and Canadian provinces.

Even in the most hockey crazy province, only 2 in every 100,000 people are currently in the NHL. I hope this puts the odds of being an NHL player in perspective for young hockey players out there!

Hockey Clusters in North America

We narrowed down the most hockey obsessed area from the country level down to the state/province level. Can we narrow it down even more? Yes we can!

Clustering is a typical data science task in which we identify groups of similar points within a dataset. In this case, that means using the location of each NHL player’s birth to group players into clusters. We will find the most sensible number of clusters using a variety of machine learning techniques and then identify where those clusters are. I take a bottom-up approach: I identify high-density areas and keep bringing data together to form clusters. To do this, I employ the DBSCAN algorithm to identify which birthplaces most readily form a cluster. Once this is done, I use the Clustergram Python library and various statistical tests to identify how many clusters make the most sense, and then group the birthplaces using the KMeans algorithm. OK, enough talking, what did I find?
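For the curious, here is a minimal sketch of that two-step idea using scikit-learn. Everything specific in it is an assumption for illustration: the birthplaces are synthetic, the 50 km radius and `min_samples=5` are arbitrary, and the number of clusters is hard-coded rather than chosen with Clustergram and statistical tests as in the actual analysis.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

EARTH_RADIUS_KM = 6371.0
rng = np.random.default_rng(0)

# Synthetic birthplaces as (lat, lon) in degrees: two dense groups plus
# a handful of isolated points. These are NOT real player data.
group_a = rng.normal(loc=(43.65, -79.38), scale=0.05, size=(30, 2))
group_b = rng.normal(loc=(45.50, -73.57), scale=0.05, size=(30, 2))
isolated = np.array([(21.31, -157.86), (61.22, -149.90), (25.76, -80.19),
                     (33.45, -112.07), (44.65, -63.57)])
points = np.vstack([group_a, group_b, isolated])

# Step 1: DBSCAN with the haversine metric (inputs in radians) labels
# points in low-density areas as noise (-1). eps is 50 km as an angle.
dbscan = DBSCAN(eps=50 / EARTH_RADIUS_KM, min_samples=5,
                metric="haversine", algorithm="ball_tree")
density_labels = dbscan.fit_predict(np.radians(points))
dense_points = points[density_labels != -1]

# Step 2: KMeans groups the remaining points. k is hard-coded here.
# KMeans treats lat/lon as flat coordinates, which is only an approximation.
k = 2
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(dense_points)
cluster_sizes = np.bincount(kmeans.labels_, minlength=k)
```

Dropping the noise points first keeps a lone birthplace in a far-off town from dragging a cluster center away from the dense area it is meant to describe.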

Clustering the majority of NHL players’ birthplaces into 12 clusters.

I found that after filtering the data, 12 clusters emerged: 4 in the U.S., 6 in Canada, and 2 cross-border clusters. Together, these clusters account for 65% of all North American players, and 45% of all NHL players! As we can see below, far and away the most important cluster of the bunch is Toronto. As a matter of fact, 1 in 5 NHL players is from the Detroit/Toronto/Buffalo triangle! Very impressive!

How many current NHL players are from hockey clusters in North America

NHL Teams and Local Players

Finally, we return to the question of which NHL teams have the most local players in the NHL. Here, I define a player as local to a team if they were born within 60 miles of that team’s arena and no other NHL team is closer to their birthplace.
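That definition can be sketched as follows. The great-circle (haversine) distance is my assumption about how "within 60 miles" might be measured, the coordinates are approximate city centers standing in for the actual arenas, and only three teams are included to keep the example short.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8

# Approximate city-center coordinates used as stand-ins for arena locations.
ARENAS = {
    "Toronto Maple Leafs": (43.65, -79.38),
    "Buffalo Sabres": (42.89, -78.88),
    "Detroit Red Wings": (42.33, -83.05),
}


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def local_team(lat, lon, arenas=ARENAS, max_miles=60):
    """Return the nearest team if it is within max_miles, otherwise None."""
    distances = {team: haversine_miles(lat, lon, *coords)
                 for team, coords in arenas.items()}
    nearest = min(distances, key=distances.get)
    return nearest if distances[nearest] <= max_miles else None


hamilton_team = local_team(43.26, -79.87)     # Hamilton, Ontario (approximate)
thunder_bay_team = local_team(48.38, -89.25)  # Thunder Bay, Ontario (approximate)
```

Picking the single nearest team first means a birthplace that sits near two arenas, such as Hamilton between Toronto and Buffalo, is counted only once.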

In total, 73% of all NHL players fit our definition of being a local player. Unfortunately, not every team can have 103 players, as the Toronto Maple Leafs would. Therefore, if we cap each team at a maximum number of players (a 20-player game roster, the definition used here), 49% of NHL players can play for their local team. That’s a lot! So what percentage of NHL players actually play for their local team? Among players born near an NHL team, about 9%. Considering all NHL players, not just players born near an NHL team, 4.5% of players play for their local team. It certainly goes to show the importance that skill and salary have in determining where someone plays in the NHL!

If each team could only play with local players, which teams could field a team? Here are the teams and how many players could play on that team:

  • Toronto Maple Leafs: 103
  • Detroit Red Wings: 36
  • Vancouver Canucks: 31
  • Montréal Canadiens: 30
  • Minnesota Wild: 26
  • Boston Bruins: 23

All in all, only 6 teams could field a team. What’s interesting is that four of the Original Six teams are in this list! I would love to see a new Original Six tournament with these teams, that’s for sure!
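The roster cap and the "can they field a team?" check can be sketched like this. The counts are the six listed above; treating "fielding a team" as having at least 20 local players is my reading of the cap described earlier, and the function names are just for illustration.

```python
ROSTER_SIZE = 20  # the per-team cap used in this article

# Local-player counts for the six teams listed above
local_counts = {
    "Toronto Maple Leafs": 103,
    "Detroit Red Wings": 36,
    "Vancouver Canucks": 31,
    "Montréal Canadiens": 30,
    "Minnesota Wild": 26,
    "Boston Bruins": 23,
}


def can_field(counts, roster_size=ROSTER_SIZE):
    """Teams with enough local players to fill a full roster."""
    return [team for team, n in counts.items() if n >= roster_size]


def capped_slots(counts, roster_size=ROSTER_SIZE):
    """Local players who actually get a spot once each roster is capped."""
    return sum(min(n, roster_size) for n in counts.values())


fielding_teams = can_field(local_counts)
slots_filled = capped_slots(local_counts)
# Local players left without a roster spot on each team
surplus = {team: n - ROSTER_SIZE for team, n in local_counts.items()}
```

With these numbers the six teams fill 120 roster spots between them, and Toronto alone would have 83 local players left over.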

So which teams have the most local players currently playing on their team? Unsurprisingly, Toronto is leading in this category with 7. The next closest teams (Boston and Minnesota) have 3. The fact that Toronto has local players on their team isn’t completely surprising. It would be more of a surprise if they had no local players on their team, quite frankly. Therefore, if we instead look at which teams have the highest percentage of their local players on their roster, Columbus and Los Angeles come out on top! Each has only 4 players who were born within 60 miles of its arena and have it as their nearest team, and each currently has one of them on its roster. That’s pretty dang rare if you ask me!

Birthplace Isn’t Everything…

While this analysis is certainly interesting, it doesn’t tell the whole story. Sometimes where a player was born is a very inaccurate way of describing where they grew up. If we only looked at birthplaces, we would be talking about how Recife, Brazil is an unlikely hockey stronghold in South America. In fact, at least 3 players born in Arizona and Mississippi were only born there because of their professional hockey-playing fathers! In one case, the player only lived in the U.S. for 3 months before moving to Quebec! Where the player grew up is clearly more important than the location of their birth, since that is where they became a pro hockey player.

On the flip side, someone born in a hockey cluster is not guaranteed to be a great hockey player either. Players quit playing hockey at every level, no matter the location.

Location isn’t everything but it sure does explain a lot!

Conclusion

Professional sports are deeply tied to their home markets. Although it rarely happens, a franchise changing locations causes a lot of anger among fans. People sometimes even support a franchise decades after it moved away! The players, however, aren’t often so tied to their home markets. Therefore, it was great fun looking through the data in order to identify how often players play for their home teams, and which teams have the most local players on their teams.

Through this analysis we learned a lot about where hockey is played, and which states/provinces are more hockey crazy than others. Turns out, Saskatchewan and Prince Edward Island produce a lot of NHL players, almost 4 times as many per capita as Quebec! Further, we see that Minnesota is just as hockey crazy as Canada in that it produces just as many NHL players per capita as some Canadian provinces.

We also learned how much people like hockey at a local level. Through the use of cluster analysis, we were able to identify 12 metro areas that produced 45% of all NHL players in 2020. Further, we saw that Toronto is an absolute powerhouse for producing NHL players. The Toronto/Detroit/Buffalo area produces 1 in 5 NHL players! This is almost certainly the most concentrated area in the world for finding future NHL players!

We can also look at the teams that have local players currently playing on their roster. While we aren’t surprised that multiple players local to Toronto play for the Leafs (7), we should be surprised that Columbus and Los Angeles each have a local player on their team. In both cases, the teams have one of the four local players that could play on their team!

Finally, returning to our original question: which teams would stand the best chance of winning this intra-NHL Olympics-style tournament? The following teams could field a team to play in this theoretical tournament: Toronto, Detroit, Vancouver, Montreal, Minnesota and Boston. Once again, we see the absolute dominance Toronto has: the Leafs could choose from almost 3 times as many players as the next closest team. My money’s on them if there were an Olympics-style tournament for NHL teams!

If you would like to look at the code that produced this report, I encourage you to check out the GitHub page for the project.

If you want more NHL analysis, I wrote a blog post examining the possible effect that having more periods with the long change would have on NHL gameplay.

Greg Feliu

Data Analyst | Data Engineer — Interests in language, sports, marketing and geographic visualizations