
Graph all the things

analyzing all the things you forgot to wonder about

Height and Natural Talent

2019-02-17
interests: rock climbing, basketball, statistics, interactive visualizations

Professional basketball players tend to be tall - very tall. The average NBA player's height is 201cm, or 6'7". In contrast, professional rock climbers in the International Federation of Sport Climbing (IFSC) have very ordinary heights compared with the general population:

[Figure: height histograms. Women: general population (~165cm), IFSC climbers (~165cm), and WNBA players (~185cm). Men: general population (~175cm), IFSC climbers (~175cm), and NBA players (~200cm).]

I wanted to get a rough feel for how much of a role height plays in these sports, so I made a simple model to describe it. The model could be applied to any measurable, immutable talent - not just height.

I also went to the trouble of scraping the professional athletes' heights that you see in the above histograms. I obtained a list of climbing athletes from the top bouldering and lead competitors for 2018 on the IFSC site. For basketball athletes, I used the full rosters of the NBA and WNBA during the 2018-2019 season. I obtained heights from the respective association's website where available, and otherwise from Wikipedia.

The model I made describes the probability distribution of heights, given two factors: the sport's selectivity and the proportion of athletic ability that is due to height. If a prior on these factors is chosen, the model can also take height data to give tighter credible intervals for their values.

My results:

  • It is unclear whether height helps or hurts professional climbers, but almost certainly less than 1.4% of performance is due to a height advantage either way (for both men and women).
  • Height helps NBA players, accounting for 27.9 to 35.0% of their performance.
  • Height helps WNBA players, accounting for 35.5 to 43.4% of their performance.

It surprised me a little that the distribution of climbers' heights was so perfectly ordinary (my initial hunch was that they tended to be a little on the shorter side). I know less about basketball, but found it interesting that height plays a significantly larger role in the WNBA than in the NBA. I discuss this more below.

Modeling Natural Talent

To keep this model as simple as possible, let's make some assumptions:

  • An athlete's quantified ability is the sum of two values: the influence of their height, and the influence of all other independent factors.
  • The influence of height is proportional to height.
  • Height is normally distributed. The influence of all other factors is normally distributed.
  • Professional athletes have the highest ability of anyone who would be professional if they could.

You might be skeptical at this point - in sports where height is a strict advantage, wouldn't even the tallest people still need to be particularly talented and hard-working to become professionals? Apparently not, since roughly 17% of men over 7 feet tall (213cm) aged 20 to 40 are in the NBA.

Using our assumptions, an athlete's ability is

y = ah + bx
for some a and b, where h is height and x is the influence of other factors. We have also assumed that
h \sim N(h_0, \sigma_h^2)
x \sim N(x_0, \sigma_x^2)
To simplify this, we can shift and rescale y, h, x into \hat{y}, \hat{h}, \hat{x} such that
\hat{y} = \alpha\hat{h} + \sqrt{1-\alpha^2}\hat{x}
where both \hat{h} and \hat{x} are standard normal random variables.

I devised this model so that \alpha^2 is the proportion of ability explained by height, and the sign of \alpha indicates whether height helps or hurts.
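The rescaling can be written out explicitly. The following is one way to do it, assuming b > 0 (other factors are scored so that more is better) and writing s for the standard deviation of y, a symbol introduced here for convenience:

```latex
\hat{h} = \frac{h - h_0}{\sigma_h}, \qquad
\hat{x} = \frac{x - x_0}{\sigma_x}, \qquad
s = \sqrt{a^2\sigma_h^2 + b^2\sigma_x^2}

\hat{y} = \frac{y - (a h_0 + b x_0)}{s}
        = \frac{a\sigma_h}{s}\hat{h} + \frac{b\sigma_x}{s}\hat{x}

\alpha = \frac{a\sigma_h}{s}, \qquad
\sqrt{1 - \alpha^2} = \frac{b\sigma_x}{s}
```

Under these assumptions \alpha^2 = a^2\sigma_h^2 / s^2 is exactly the share of the variance of y that comes from height, and \alpha takes the sign of a.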

By our assumption, professional athletes have \hat{y} in the top p proportion of the pool of athletes. Let's call 1/p the sport's selectivity.

I have some rough estimates for it:

  • 1/p \approx 1,000,000 for the NBA, since there are about 200 players and probably about 200 million men who have tried basketball and would play professionally if they could.
  • 1/p \approx 100,000 for competitive rock climbing, since there are about 30 professional competitive athletes in either gender (people who are professional mainly for their non-competitive achievements don't count) out of a pool of about 3 million people who have tried climbing and would be professional if they could.
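The arithmetic behind these estimates is easy to check. Here is a small sketch using only the Python standard library; the function names are made up for illustration, and the second function converts a selectivity into the number of standard deviations above the pool's average ability that marks the top p, assuming ability is standard normal.

```python
from statistics import NormalDist


def selectivity(professionals: float, pool: float) -> float:
    """Return 1/p, where p is the fraction of the pool that turns professional."""
    return pool / professionals


def ability_cutoff(one_over_p: float) -> float:
    """Standard-normal quantile that only the top p of the pool exceeds."""
    return NormalDist().inv_cdf(1.0 - 1.0 / one_over_p)


# Rough pool sizes taken from the estimates above.
nba = selectivity(professionals=200, pool=200e6)
climbing = selectivity(professionals=30, pool=3e6)

print(f"NBA: 1 in {nba:,.0f}, cutoff {ability_cutoff(nba):.2f} sd")
print(f"Climbing: 1 in {climbing:,.0f}, cutoff {ability_cutoff(climbing):.2f} sd")
```

A tenfold difference in selectivity moves the cutoff by only about half a standard deviation, so the results are not very sensitive to these rough guesses.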

Since \hat{h} and \hat{x} are independent standard normals, \hat{y} is also normally distributed, with mean 0 and variance \alpha^2 + (1-\alpha^2) = 1. This means the cutoff for being in the top p of potential athletes is a \hat{y} of at least \Phi^{-1}(1 - p), where \Phi is the standard normal cumulative distribution function. By Bayes' theorem, the probability distribution for normalized height \hat{h}, given that someone is a professional athlete, is

P(\hat{h}|\hat{y}\ge\Phi^{-1}(1-p)) = \frac{P(\hat{y}\ge\Phi^{-1}(1-p)|\hat{h}) * P(\hat{h})}{P(\hat{y}\ge\Phi^{-1}(1-p))}
By definition, P(\hat{y}\ge\Phi^{-1}(1-p)) = p. P(\hat{h}) is simply the standard normal density. And P(\hat{y}\ge\Phi^{-1}(1-p)|\hat{h}) is the probability that the contribution of all other factors, \sqrt{1-\alpha^2}\hat{x}, exceeds \Phi^{-1}(1-p) - \alpha\hat{h}, which is the probability that a standard normal variable exceeds \frac{\Phi^{-1}(1-p) - \alpha\hat{h}}{\sqrt{1-\alpha^2}}. Putting these together:
P(\hat{h}|\hat{y}\ge\Phi^{-1}(1-p)) = \frac{\left(1 - \Phi\left(\frac{\Phi^{-1}(1-p) - \alpha\hat{h}}{\sqrt{1-\alpha^2}}\right)\right) * P(\hat{h})}{p}
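The final formula translates almost directly into code. The sketch below uses only the Python standard library; the names `height_density` and `integrate` are illustrative, and \alpha = 0.56 is simply a value inside the NBA interval reported later in this post. The midpoint-rule integration is there as a sanity check that the density integrates to about 1.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def height_density(h_hat: float, alpha: float, one_over_p: float) -> float:
    """Density of normalized height h_hat among professionals.

    alpha must satisfy -1 < alpha < 1; one_over_p is the selectivity 1/p.
    """
    p = 1.0 / one_over_p
    cutoff = STD_NORMAL.inv_cdf(1.0 - p)
    # Threshold the remaining factors must clear, in standard-normal units.
    z = (cutoff - alpha * h_hat) / sqrt(1.0 - alpha * alpha)
    # 1 - Phi(z) equals Phi(-z); this form stays accurate far out in the tail.
    return STD_NORMAL.cdf(-z) * STD_NORMAL.pdf(h_hat) / p


def integrate(f, lo=-10.0, hi=10.0, steps=4000):
    """Midpoint-rule integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


total = integrate(lambda h: height_density(h, alpha=0.56, one_over_p=1e6))
mean = integrate(lambda h: h * height_density(h, alpha=0.56, one_over_p=1e6))
print(f"integral of density: {total:.4f}")
print(f"mean normalized height of professionals: {mean:.2f} sd")
```

Multiplying the mean by the population's height standard deviation and adding the population mean converts it back to centimeters.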

Implications

This model isn't perfect, but it gives a simple, interpretable, and plausible estimate of how much a professional athlete (or any sort of outlier) owes to one of their characteristics. You can play around with the values of \alpha and 1/p below. Try to choose them such that the distribution resembles one of the histograms of athletes' heights above.

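[Interactive chart: a Women/Men toggle with sliders for \alpha (from -1 to 1) and selectivity (low to high). It displays the resulting height distribution along with the proportion of performance explained by height and the selectivity (i.e., one in this many people can perform at this level).]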

By this point, it should be pretty clear that height plays a tremendous role in basketball players' success, and almost no role in climbers' success. But to give a serious effort at answering how much of a role height plays in each sport, I chose a uniform prior on \alpha, fixed 1/p at a single value (a point prior), and performed Bayesian inference.

This gave the following 95% credible intervals for \alpha in each sport:

Sport          \alpha             \alpha^2
IFSC (women)   (-0.078, 0.108)    (4.2\times10^{-5}, 0.012)
IFSC (men)     (-0.129, 0.040)    (2.3\times10^{-5}, 0.014)
WNBA           (0.596, 0.659)     (0.355, 0.434)
NBA            (0.528, 0.591)     (0.279, 0.350)
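For readers who want to try this kind of inference themselves, here is a minimal sketch of one way to do it: a grid approximation of the posterior for \alpha under a uniform prior with p held fixed. This is not necessarily how the intervals above were computed. The heights are simulated from the model rather than taken from real athletes, the grid stops at ±0.99, a mild selectivity of p = 0.01 keeps the simulation fast, and all function names are hypothetical.

```python
import random
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate_professionals(alpha, p, n, rng):
    """Rejection-sample normalized heights of people whose ability clears the cutoff."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - p)
    scale = sqrt(1.0 - alpha * alpha)
    heights = []
    while len(heights) < n:
        h, x = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if alpha * h + scale * x >= cutoff:
            heights.append(h)
    return heights


def log_likelihood(alpha, heights, p):
    """Log-likelihood of professionals' normalized heights, up to a constant."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - p)
    scale = sqrt(1.0 - alpha * alpha)
    total = 0.0
    for h in heights:
        z = (cutoff - alpha * h) / scale
        # P(h_hat) and p do not depend on alpha, so they are dropped.
        # The floor avoids log(0) when alpha is extreme.
        total += log(max(STD_NORMAL.cdf(-z), 1e-300))
    return total


def credible_interval(heights, p, grid_size=397, mass=0.95):
    """Posterior mean and central credible interval for alpha (uniform prior)."""
    grid = [-0.99 + 1.98 * i / (grid_size - 1) for i in range(grid_size)]
    logs = [log_likelihood(a, heights, p) for a in grid]
    top = max(logs)
    weights = [exp(v - top) for v in logs]  # subtract max for numerical stability
    norm = sum(weights)
    probs = [w / norm for w in weights]
    mean = sum(a * q for a, q in zip(grid, probs))
    tail = (1.0 - mass) / 2.0
    cum, lo, hi = 0.0, None, None
    for a, q in zip(grid, probs):
        cum += q
        if lo is None and cum >= tail:
            lo = a
        if hi is None and cum >= 1.0 - tail:
            hi = a
    return mean, lo, hi


rng = random.Random(0)
sample = simulate_professionals(alpha=0.5, p=0.01, n=200, rng=rng)
mean, lo, hi = credible_interval(sample, p=0.01)
print(f"posterior mean {mean:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

With real data, each athlete's height would first be normalized using the general population's mean and standard deviation for the relevant gender.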

In other words, a whopping 27.9% to 43.4% of a typical basketball professional's success is due to height! Meanwhile, it's incredibly unlikely that any meaningful amount of a professional rock climber's success is due to height: almost certainly less than 1.4%.

I had expected that tall climbers would be slightly disadvantaged, but did not find this to be the case. My rationale was that nature's best climbers are small creatures like ants and geckos. One friend of mine was certain that height would be an advantage, saying the only reason we don't see more extreme height is that climbing isn't selective enough yet. However, even at the modest selectivity estimate here, it's still unclear whether height helps or hurts at all.

One surprise was that height plays a significantly larger role in the WNBA than in the NBA. Perhaps WNBA teams select more strongly for height. Or perhaps the WNBA isn't quite as competitive, so other factors don't play as large a role. I kept the selectivity the same for both the NBA and the WNBA, but wasn't sure this was right. If the WNBA is actually less selective, then the percent of performance attributable to height would be even greater. In any case, I'm not a basketball expert and don't have full context here.

Pedantic Clarifications

In this post, height encompasses the other physical changes that come with it, like increases in wingspan and weight.

The advantage of studying height over something partially mutable like weight is that there is one fewer feedback loop: athletes cannot modify their height based on their athletic success. A feedback loop like that would have made this analysis totally invalid. Since height is immutable, a basketball athlete can be 1 in 1 million by having a 1 in 1,000 height and a 1 in 3,000 combination of coordination, work ethic, and musculature. If athletes could alter their heights, the 1 in 1 million people with the best coordination, work ethic, and musculature would simply choose the optimal height and replace such an athlete, regardless of whether height is a major or minor advantage.