I’ve been trying to stay out of saying anything about the Hugo Awards, mostly because lots of people are saying lots of things already and I haven’t felt like I have anything to add. But then Jim Hines posted today speculating about MATH, and, well, I got nerd-sniped.
Here’s the original (long, long) comment I left on his blog. I finally decided I couldn’t not do a normal distribution and standard dev, so I came back here for it, but the numbers in the original comment might be a bit more intuitive for non-math folk than what I’m going to do here.
The Hugo Awards are SFF awards nominated by popular vote. There is some controversy (understatement) about the nominations this year. I’m not going to get into that here, just going to display some numbers.
It would, however, be disingenuous not to state my own bias, which is that I think institutional discrimination against women and people off the gender binary exists and is a problem. I’ve allowed that bias to affect how I frame my wording (and I’ve editorialized at times), but I’ve performed the math exactly as I believe is correct. Since it’s very possible to make statistics seem skewed toward a particular viewpoint by bad-faith numerical sleight of hand, I want to state up front that I have not done so here — any poor mathematics or misunderstanding of confidence levels is due to (1) my lack of background in stats or (2) genuine error.
What I’m doing, and what it means
The four writing categories for the Hugo Awards have 5 nomination slots each, for a total of 20 nominations for fiction writing. I’m going to make the probability distribution for the likelihood of a particular gender split (e.g., find the probabilities of a 10/10 split, or a 9/11 split, or a 15/5 split, etc). This will approximate a nice normal distribution. If you don’t know what that is, that’s okay — the important part is the next bit.
Once I have the probability distribution, I’m going to take the standard deviation. Standard deviation is a very useful statistical tool that tells us the likelihood something will be in a given range of numbers. For example, it’s not terribly useful to look at the probability of an exact 8/12 split; it’s more useful to look at the probability the gender split will be within a certain range of numbers.
For a normal distribution, 68% of the data will fall within 1 standard deviation of the mean (the mean = the average), 95% will fall within 2 standard deviations of the mean, and almost 100% will fall within 3 standard deviations of the mean (99.7%). Once we get out to three standard deviations from the mean, we’re talking about extreme outliers.
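For anyone who wants to check those three percentages rather than take them on faith, here is a minimal Python sketch. It relies on the standard identity that a normal variable lands within k standard deviations of its mean with probability erf(k/√2); the function name `normal_within` is my own label, not anything from the calculator used in this post.

```python
import math


def normal_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of the mean (on either side)."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_within(k):.1%}")
```

The three printed values round to the familiar 68%, 95%, and 99.7%.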
This will tell us whether a given gender distribution is within what we’d consider an expected year-by-year fluctuation from 50/50, or whether, assuming a 50/50 gender split, it would be…well, an extreme outlier.
- I’m a mathematician but NOT a statistician; I’ve never actually studied stats. I only know enough basics to get me in trouble. If you know more stats than I do, please jump in!
- I’m considering gender to be 50/50 split on a male/female binary because I couldn’t quickly find stats on nonbinary folk. (Sorry!!)
- I’m the type of mathematician who hasn’t worked with numbers in so long that I’m very prone to arithmetic mistakes. If you find any, please shout.
- This calculator, because I’m lazy: http://graphpad.com/quickcalcs/probability2
- This graphing tool, to make pretty pictures: http://www.wolframalpha.com
I’m keeping it easy: 20 nomination slots, 50% probability of a given gender getting a nomination.
I haven’t talked about much specific Hugo data here, but when I have I’ve pulled it from the graph in Jim Hines’ post.
Binomial Probability and the Frequency Distribution
Binomial probability gives us the following distribution — conveniently, the calculator above gave it to me all in one go when I entered n=20 (20 nomination slots) and p=.5 (50% probability of male or female). The following table is copy/pasted verbatim from the results. For non-math people, note that we’re not calling a male person or a female person in a nomination slot a “success” or a “failure” in the semantic sense — here “success” and “failure” are neutral probability terms.
Binomial, Poisson and Gaussian distributions
Number of trials (or subjects) per experiment: 20
Probability of “success” in each trial or subject: 0.500
Cool! This gives us a frequency distribution.
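If you would rather regenerate the numbers than trust an online calculator, the sketch below computes the same binomial probabilities for n=20 and p=.5 using only the Python standard library. The column layout and the names `N_SLOTS`, `P_ONE_GENDER`, and `binomial_pmf` are my own choices for illustration and are not meant to mirror the calculator’s output format.

```python
from math import comb

N_SLOTS = 20          # nomination slots across the four writing categories
P_ONE_GENDER = 0.5    # assumed chance that any one slot goes to a given gender


def binomial_pmf(k: int, n: int = N_SLOTS, p: float = P_ONE_GENDER) -> float:
    """Probability that exactly k of the n slots go to one particular gender."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


print(f"{'one gender':>10}  {'other gender':>12}  {'exact probability':>17}")
for k in range(N_SLOTS + 1):
    prob = binomial_pmf(k)
    print(f"{k:>10}  {N_SLOTS - k:>12}  {prob:>17.4%}")
```

Each row is one possible split (for example 8 and 12), and the last column is the chance of landing on exactly that split under the 50/50 assumption.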
The Normal Distribution
Binomial probability (what we just used to get the frequency distribution in the above table) with p=.5 and a reasonable number of data points is known to approximate a normal distribution, aka a bell curve. Here’s a normal distribution via Wolfram Alpha of these data:
Notice that it’s centered around the mean (average) of 10, as we would expect. We’ve got the number of nominees of a given gender on the x-axis (it doesn’t matter which gender we choose, as it’s symmetric — we could say the x-axis is the number of male nominees or we could say it’s the number of female nominees), and the percent probability we’ll land on that number of nominees on the y-axis.
Whether we look at the table or the graph, we’re hitting about a 17-18% probability of an even 10/10 split, and it drops off quickly on either side, until a 0/20 split in either direction has almost a 0% probability.
(I actually found the standard dev first and used that to graph the normal curve, but shhh! I think it’ll make more sense to non-math people to write it in this order.)
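To see how closely the bell curve tracks the exact numbers, here is a rough Python sketch that puts the binomial probability and the normal curve’s height side by side for every possible count. It assumes a normal curve with mean 10 and standard deviation √5 (about 2.236); the helper names are hypothetical and this is not how Wolfram Alpha draws its plot.

```python
from math import comb, exp, pi, sqrt

n, p = 20, 0.5
mean = n * p                      # 10
sd = sqrt(n * p * (1 - p))        # about 2.236


def binomial_pmf(k: int) -> float:
    """Exact probability of exactly k nominees of one gender."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x: float) -> float:
    """Height of the normal curve with the same mean and standard deviation."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))


largest_gap = 0.0
for k in range(n + 1):
    exact, approx = binomial_pmf(k), normal_pdf(k)
    largest_gap = max(largest_gap, abs(exact - approx))
    print(f"{k:>2}  exact {exact:.4f}  normal curve {approx:.4f}")

print(f"largest gap between the two: {largest_gap:.4f}")
```

The two columns stay within a fraction of a percentage point of each other at every count, which is why the curve is a reasonable picture of the table.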
One reason it’s so lovely to talk about standard deviations in a normal distribution is it gives us very pretty ranges that other people who know basic stats can easily grasp: if you say “more than a standard deviation from the mean,” people who know what standard deviation is will have an idea of how hefty a divergence that is. Here’s a great visualization for standard deviation on a normal distribution:
As you can see, the dark blue is within 1 standard deviation of the mean and takes up 68.2% of the data. The lighter blue shows going out another standard deviation from the mean, and the even lighter blue goes out to a third standard deviation from the mean, where the probability of landing is very close to zero.
Standard deviation has a complicated formula that’s beyond the scope of this post — I just used my calculator. The standard deviation for these data is about 2.236.
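For the curious: in the specific case of a binomial distribution there is a compact shortcut, the square root of n·p·(1−p). The sketch below computes the standard deviation that way and also the long way, straight from the definition (the square root of the probability-weighted squared distance from the mean), so you can see the two agree. Variable names are mine, chosen for illustration.

```python
from math import comb, sqrt

n, p = 20, 0.5


def binomial_pmf(k: int) -> float:
    """Exact probability of exactly k nominees of one gender."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Shortcut that holds for any binomial distribution.
sd_shortcut = sqrt(n * p * (1 - p))

# The long way: weight each count's squared distance from the mean
# by how likely that count is, then take the square root.
mean = sum(k * binomial_pmf(k) for k in range(n + 1))
variance = sum((k - mean) ** 2 * binomial_pmf(k) for k in range(n + 1))
sd_from_definition = sqrt(variance)

print(f"mean:                          {mean:.3f}")
print(f"standard deviation (shortcut): {sd_shortcut:.3f}")
print(f"standard deviation (long way): {sd_from_definition:.3f}")
```

Both routes give about 2.236, matching the calculator.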
For a normal distribution, that means 68% of the data fall within 2.236 of the mean. In other words, 68% of the data fall within a difference of 2.236 from 10, or between 7.764 and 12.236.
It’s easy to check that this is about right: if we go to our table above and add the “exact probability” column for 8, 9, 10, 11, and 12, we get a bit above 70% (about 73.7%). It’s not exact because our frequency distribution is only approximating the normal distribution, but it’s a very good approximation, and it’s generally considered an appropriate model for binomial distributions with non-extreme probabilities and a reasonable number of trials.
One Standard Deviation, Two! Three Standard Deviations, More!
Remember that about 68% of the data will fall within 1 standard deviation of the mean, 95% will fall within 2, and 99.7% will fall within 3. In other words, another advantage of standard deviation is that it gives us some nice arithmetical shortcuts, as follows:
- 1 standard deviation: 7.764 – 12.236
- OR: About 68% of the time, the gender split will be 8/12 or closer.
- 2 standard deviations: 5.528 – 14.472
- OR: About 95% of the time, the gender split will be 6/14 or closer.
- 3 standard deviations: 3.292 – 16.708
- OR: About 99.7% of the time, the gender split will be 4/16 or closer.
- A gender split wider than 4/16 is an extreme outlier.
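The ranges in the list above can be cross-checked against the exact binomial numbers with a short Python sketch. It assumes the same n=20 and p=.5, rounds each range inward to whole nominees, and adds up the exact probabilities inside it; `exact_within` is a hypothetical helper name of my own.

```python
from math import ceil, comb, floor, sqrt

n, p = 20, 0.5
mean = n * p
sd = sqrt(n * p * (1 - p))


def binomial_pmf(k: int) -> float:
    """Exact probability of exactly k nominees of one gender."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_within(num_sd: int):
    """Whole-number range inside num_sd standard deviations of the mean,
    plus the exact binomial probability of landing in that range."""
    low = ceil(mean - num_sd * sd)
    high = floor(mean + num_sd * sd)
    prob = sum(binomial_pmf(k) for k in range(low, high + 1))
    return low, high, prob


for num_sd in (1, 2, 3):
    low, high, prob = exact_within(num_sd)
    print(f"{num_sd} standard deviation(s): {low} to {high} nominees "
          f"of one gender, exact probability {prob:.2%}")
```

The exact figures come out to roughly 73.7%, 95.9%, and 99.7%, so the 68/95/99.7 shortcut is slightly conservative for the narrowest range and very close for the other two.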
Note that though a split wider than 4/16 suggests something very statistically unlikely is going on, it does not say why, and it does not assign intent. My lived experience suggests that intentional sexism should not generally be assumed when systemic bias will suffice, and in a process like writing, publishing, publicity, and awards nominations, there are plenty of stages at which institutional bias can manifest itself. This does not, of course, mean there is not a problem — in fact, it would mean the problem may be one that requires more thought, awareness, and effort to address.
I’ll further note that if you consider the years 2010-2014 (none of which had fewer than 7 nominations for either gender) and compare them to 2015, and this leads you to conclude (along with a preponderance of other data, I am aware) that something untoward happened in 2015, then even one person or one small group of people with a particular subgenre taste having chosen a fantastically statistically unlikely slant of genders still does not imply malicious sexism. What it does imply, in my opinion, is a variety of other extremely upsetting problems, exacerbated by the fact that nonmalicious sexism can be much, much harder to combat.
So. What was the gender split in the writing categories this year?
- Yes, I’m aware there are factors affecting that 50/50 probability, even in years that aren’t this one: potentially factors at every step in the publishing process, not just the nominating-for-awards stage. This post could be, in that vein, viewed like a proof by contradiction: I’m showing the probabilities of expected fluctuations, and if you’re seeing greater extremes, that might indicate the starting assumption of 50/50 gender blindness at all steps is, in fact, incorrect.
- This distribution definitely has a non-extreme p. I tried to figure out if 20 trials is a reasonable number for approximating via a normal distribution and didn’t get anything definitive, although I did compare by hand and the numbers all seemed pretty close. But if you distrust the model, notice that I’m really only using this one to make relatable statements about the exact raw data that you can look at in the table above: if you want to, you can define your own terms to look at probability ranges by adding the numbers in the third column, and you’ll come to the same conclusions. In other words, about 70% of the data fall between 8 and 12 whether we use the vocabulary “within one standard deviation of the mean on a normal distribution” or not.
- You could, again, find the exact percentages by adding the numbers in the table. But this is faster.
- As far as I know “outlier” doesn’t have a specific statistical definition, but I’ve seen it used to mean “three or more standard deviations from the mean,” so that’s what I’m doing here.
- If you do compare, be aware that some of those years had more or fewer than 20 nominations (presumably because of ties or the 5 percent rule) and I’ve not accounted for those sorts of variations here. The ideas should be broadly applicable, however, and if we’re speaking roughly, I’ll note that 4 out of the 5 years from 2010-2014 had at least 8 nominations from both genders, and the other year had a 7/11 split, which is perfectly in line with the numbers above: if 4/5 years fall within the 68% (roughly) and 1/5 falls outside the 68% but within the 95%, that’s about what we’d statistically expect.
- Well, at least one person involved has nonfiction writings that would support such a conclusion, but I will not extend his philosophies to the rest.