Graph all the things
analyzing all the things you forgot to wonder about
2019-02-17
interests: rock climbing, basketball, statistics, interactive visualizations
Professional basketball players tend to be tall - very tall. The average NBA player's height is 201cm, or 6'7". In contrast, professional rock climbers in the International Federation of Sport Climbing (IFSC) have very ordinary heights compared with the general population:
I wanted to get a rough feel for how much of a role height plays in these sports, so I made a simple model to describe it. The model could be applied to any measurable, immutable talent - not just height.
I also went to the trouble of scraping the professional athletes' heights that you see in the above histograms. I obtained a list of climbing athletes from the top bouldering and lead competitors for 2018 on the IFSC site. For basketball athletes, I used the full NBA and WNBA rosters during the 2018-2019 season. I obtained heights either from the respective association's website if available, or from Wikipedia.
The model I made describes the probability distribution of heights, given two factors: the sport's selectivity and the proportion of athletic ability that is due to height. If a prior on these factors is chosen, the model can also take height data to give tighter credible intervals for their values.
My results:
It surprised me a little that the distribution of climbers' heights was so perfectly ordinary (my initial hunch was that they tended to be a little on the shorter side). I know less about basketball, but found it interesting that height plays a significantly larger role in the WNBA than in the NBA. I discuss this more below.
To keep this model as simple as possible, let's make some assumptions:
You might be skeptical at this point - in sports where height is a strict advantage, wouldn't even the tallest people still need to be particularly talented and hard-working to become professionals? Apparently not, since roughly 17% of men over 7 feet tall (213cm) aged 20 to 40 are in the NBA.
Using our assumptions, an athlete's ability is
$$A = \alpha H + \beta X$$
for some $\alpha$ and $\beta$, where $H$ is height and $X$ is the influence of other factors. We have also assumed that $H$ and $X$ are independent and normally distributed. To simplify this, we can shift and rescale $A$, $H$, and $X$ into $a$, $h$, and $x$ such that
$$a = r h + \sqrt{1 - r^2}\, x$$
where both $h$ and $x$ are standard normal random variables.
I devised this model so that $r^2$ is the proportion of ability explained by height, and the sign of $r$ indicates whether height helps or hurts.
By our assumption, professional athletes have $a$ in the top $1/s$ proportion of the pool of athletes. Let's call $s$ the sport's selectivity.
I have some rough estimates for it:
Since both $h$ and $x$ are normally distributed, ability $a$ is distributed like a random normal with variance $r^2 + (1 - r^2) = 1$. This means the cutoff to being in the top $1/s$ of potential athletes is having an $a$ of at least
$$c = \Phi^{-1}\left(1 - \frac{1}{s}\right)$$
where $\Phi$ is the normal cumulative distribution function. By Bayes' theorem, the probability distribution for normalized height $h$, given that someone is a professional athlete, is
$$P(h \mid \text{pro}) = \frac{P(\text{pro} \mid h)\, P(h)}{P(\text{pro})}$$
By definition, $P(\text{pro}) = 1/s$. $P(h)$ is simply the unit Normal distribution function. And $P(\text{pro} \mid h)$ is the probability that all other aspects of the athlete's ability exceed $c - r h$, or the probability that a random unit Normal exceeds $(c - r h)/\sqrt{1 - r^2}$.
This model isn't perfect, but it gives a simple, interpretable, and plausible estimate of how much a professional athlete (or any sort of outlier) owes to one of their characteristics. You can play around with the values of $r^2$ and $s$ below. Try to choose them such that the distribution resembles one of the histograms of athletes' heights above.
[Interactive plot: modeled height distribution of professionals. Controls: proportion of performance explained by height (default 0.010) and selectivity, i.e., one in this many people can perform at this level (default 10).]
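For readers without the interactive version, the density described above can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the code behind the original plot: it assumes ability is $r h + \sqrt{1 - r^2}\,x$ with $h$ and $x$ independent standard normals, that professionals are the top $1/s$ by ability, and that heights are expressed in standard deviations from the population mean. The function names and the symbols `r`, `s`, and `c` are illustrative choices.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def ability_cutoff(s: float) -> float:
    """Ability needed to be in the top 1/s of the population (requires s > 1)."""
    return STD_NORMAL.inv_cdf(1.0 - 1.0 / s)


def height_density_given_pro(h: float, r: float, s: float) -> float:
    """Density of normalized height h among professionals, for |r| < 1."""
    c = ability_cutoff(s)
    # P(pro | h): the non-height factors must supply the remaining c - r*h.
    p_pro_given_h = 1.0 - STD_NORMAL.cdf((c - r * h) / (1.0 - r * r) ** 0.5)
    # Bayes' theorem: P(pro | h) * P(h) / P(pro), with P(pro) = 1/s.
    return s * p_pro_given_h * STD_NORMAL.pdf(h)


def mean_height_given_pro(r: float, s: float, lo=-8.0, hi=8.0, steps=4000) -> float:
    """Midpoint-rule estimate of the average normalized height of professionals."""
    dh = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        h = lo + (i + 0.5) * dh
        total += h * height_density_given_pro(h, r, s) * dh
    return total


if __name__ == "__main__":
    # Assumed example values: r^2 = 0.36 and one-in-1000 selectivity.
    print(mean_height_given_pro(r=0.6, s=1000))
```

With `r = 0` the density reduces to the ordinary standard normal, which matches the intuition that a sport indifferent to height should have pros who look like everyone else.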
By this point, it should be pretty clear that height plays a tremendous role in basketball and almost no role in climbing. But to give a serious effort at answering how much of a role height plays in each sport, I chose a uniform prior on the height coefficient $r$, a specific value prior on the selectivity $s$, and performed Bayesian inference.
This gave the following 95% credible intervals for $r$ and for $r^2$, the proportion of performance explained by height, in each sport:
Sport | $r$ | $r^2$ |
IFSC (women) | (-0.078, 0.108) | (, 0.012) |
IFSC (men) | (-0.129, 0.040) | (, 0.014) |
WNBA | (0.596, 0.659) | (0.355, 0.434) |
NBA | (0.528, 0.591) | (0.279, 0.350) |
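The inference step could look roughly like the grid approximation below. This is a sketch under assumptions, not the code that produced the table: it assumes the model $a = r h + \sqrt{1 - r^2}\,x$ with standard normal $h$ and $x$, heights already normalized against the general population, a uniform prior on $r$ over a grid inside (-1, 1), and a fixed selectivity $s$. The heights here are SYNTHETIC, simulated from the model itself, and every function name is hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_likelihood(heights, r, s):
    """Log-likelihood of normalized pro heights, up to terms constant in r."""
    c = STD_NORMAL.inv_cdf(1.0 - 1.0 / s)
    scale = math.sqrt(1.0 - r * r)
    total = 0.0
    for h in heights:
        p_pro_given_h = 1.0 - STD_NORMAL.cdf((c - r * h) / scale)
        # Floor avoids log(0) when a height is wildly implausible for this r.
        total += math.log(max(p_pro_given_h, 1e-300))
    return total


def posterior_on_grid(heights, s, grid):
    """Posterior over r on the grid; a uniform prior means it is the normalized likelihood."""
    logs = [log_likelihood(heights, r, s) for r in grid]
    top = max(logs)  # subtract the max before exponentiating, for stability
    weights = [math.exp(v - top) for v in logs]
    norm = sum(weights)
    return [w / norm for w in weights]


def credible_interval(grid, probs, mass=0.95):
    """Equal-tailed interval read off the cumulative posterior."""
    tail = (1.0 - mass) / 2.0
    cum, lo, hi, found_lo = 0.0, grid[0], grid[-1], False
    for r, p in zip(grid, probs):
        cum += p
        if not found_lo and cum >= tail:
            lo, found_lo = r, True
        if cum >= 1.0 - tail:
            hi = r
            break
    return lo, hi


def simulate_pro_heights(r, s, population, rng):
    """SYNTHETIC data: keep the heights of everyone whose ability clears the top-1/s cutoff."""
    c = STD_NORMAL.inv_cdf(1.0 - 1.0 / s)
    scale = math.sqrt(1.0 - r * r)
    pros = []
    for _ in range(population):
        h, x = rng.gauss(0, 1), rng.gauss(0, 1)
        if r * h + scale * x >= c:
            pros.append(h)
    return pros


if __name__ == "__main__":
    rng = random.Random(0)
    heights = simulate_pro_heights(r=0.6, s=100, population=30000, rng=rng)
    grid = [i / 100 for i in range(-99, 100)]
    probs = posterior_on_grid(heights, s=100, grid=grid)
    print(len(heights), credible_interval(grid, probs))
```

The likelihood only needs $P(\text{pro} \mid h)$ because the other factors in the density, $P(h)$ and $1/P(\text{pro}) = s$, do not depend on $r$ and cancel when the posterior is normalized.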
In other words, a whopping 27.9% to 43.4% of a typical basketball professional's success is due to height! Meanwhile, it's incredibly unlikely that any meaningful amount of a professional rock climber's success is due to height: the share is almost certainly less than 1.4%.
I had expected that tall climbers would be slightly disadvantaged, but did not find this to be the case. My rationale was that nature's best climbers are small creatures like ants and geckos. One friend of mine was certain that height would be an advantage, saying the only reason we don't see more extreme height is that climbing isn't selective enough yet. However, even at the modest selectivity estimate here, it's still unclear whether height helps or hurts at all.
One surprise was that height plays a significantly larger role in the WNBA than in the NBA. Perhaps WNBA teams select more strongly for height. Or perhaps the WNBA isn't quite as competitive, so other factors don't play as large a role. I kept the selectivity the same for both the NBA and the WNBA, but wasn't sure about that choice - if the WNBA is actually less selective, then the percent of performance attributable to height would be even greater. In any case, I'm not a basketball expert and don't have full context here.
The advantage of studying height over something partially mutable like weight is that there is one fewer feedback loop: athletes cannot modify their height based on their athletic success. A feedback loop like that would have made this analysis totally invalid. Since height is immutable, a basketball athlete can be 1 in 1 million by having a 1 in 1,000 height and a 1 in 3,000 combination of coordination, work ethic, and musculature. If athletes could alter their heights, the 1 in 1 million people with the best coordination, work ethic, and musculature would simply choose the optimal height and replace such an athlete, regardless of whether height is a major or minor improvement.