Statistics of Gender on the Hugo Writing Nominees: Probabilities and Standard Deviations

I’ve been trying to stay out of saying anything about the Hugo Awards, mostly because lots of people are saying lots of things already and I haven’t felt like I have anything to add.  But then Jim Hines posted today speculating about MATH, and, well, I got nerd-sniped.

Here’s the original (long, long) comment I left on his blog.  I finally decided I couldn’t not do a normal distribution and standard dev, so I came back here for it, but the numbers in the original comment might be a bit more intuitive for non-math folk than what I’m going to do here.


The Hugo Awards are a SFF award nominated by popular vote.  There is some controversy (understatement) about the nominations this year.  I’m not going to get into that here, just going to display some numbers.

It would, however, be disingenuous not to state my own bias, which is that I think institutional discrimination against women and people off the gender binary exists and is a problem.  I’ve allowed that bias to affect how I frame my wording (and I’ve editorialized at times), but I’ve performed the math exactly as I believe is correct.  Since it’s very possible to make statistics seem skewed toward a particular viewpoint by bad-faith numerical sleight of hand, I want to state up front that I have not done so here — any poor mathematics or misunderstanding of confidence levels is due to (1) my lack of background in stats or (2) genuine error.

What I’m doing, and what it means

The four writing categories for the Hugo Awards have 5 nomination slots each, for a total of 20 nominations for fiction writing.  I’m going to make the probability distribution for the likelihood of a particular gender split (e.g., find the probabilities of a 10/10 split, or a 9/11 split, or a 15/5 split, etc).  This will approximate a nice normal distribution.  If you don’t know what that is, that’s okay — the important part is the next bit.

Once I have the probability distribution, I’m going to take the standard deviation.  Standard deviation is a very useful statistical tool that tells us the likelihood something will be in a given range of numbers.  For example, it’s not terribly useful to look at the probability of an exact 8/12 split; it’s more useful to look at the probability the gender split will be within a certain range of numbers.

For a normal distribution, 68% of the data will fall within 1 standard deviation of the mean (the mean = the average), 95% will fall within 2 standard deviations of the mean, and almost 100% will fall within 3 standard deviations of the mean (99.7%).  Once we get out to three standard deviations from the mean, we’re talking about extreme outliers.
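Those three percentages can be checked numerically. The sketch below is only an illustration: it uses the standard identity that a normal variable lands within k standard deviations of its mean with probability erf(k/√2), and the helper name `within_k_sigma` is made up for this example.

```python
import math

def within_k_sigma(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of the mean (hypothetical helper name)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within_k_sigma(k):.1%}")
```

Running it gives roughly 68.3%, 95.4%, and 99.7%, which is where the rounded 68 / 95 / 99.7 figures come from.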

This will tell us whether a given gender distribution is within what we’d consider an expected year-by-year fluctuation from 50/50, or whether, assuming a 50/50 gender split, it would be…well, an extreme outlier.

  • I’m a mathematician but NOT a statistician; I’ve never actually studied stats.  I only know enough basics to get me in trouble.  If you know more stats than I do, please jump in!
  • I’m considering gender to be 50/50 split on a male/female binary because I couldn’t quickly find stats on nonbinary folk.  (Sorry!!)
  • I’m the type of mathematician who hasn’t worked with numbers in so long that I’m very prone to arithmetic mistakes.  If you find any, please shout.

The Data

I’m keeping it easy: 20 nomination slots, 50% probability of a given gender getting a nomination.[1]

I haven’t talked about much specific Hugo data here, but when I have I’ve pulled it from the graph in Jim Hines’ post.

Binomial Probability and the Frequency Distribution

Binomial probability gives us the following distribution — conveniently, the calculator above gave it to me all in one go when I entered n=20 (20 nomination slots) and p=.5 (50% probability of male or female).  The following table is copy/pasted verbatim from the results.  For non-math people, note that we’re not calling a male person or a female person in a nomination slot a “success” or a “failure” in the semantic sense — here “success” and “failure” are neutral probability terms.

Binomial, Poisson and Gaussian distributions

Number of trials (or subjects) per experiment: 20
Probability of “success” in each trial or subject: 0.500

Number of      Number of      Exact          Cumulative
“successes”    “failures”     probability    probability
0              20             0.000%         0.000%
1              19             0.002%         0.002%
2              18             0.018%         0.020%
3              17             0.109%         0.129%
4              16             0.462%         0.591%
5              15             1.479%         2.069%
6              14             3.696%         5.766%
7              13             7.393%         13.159%
8              12             12.013%        25.172%
9              11             16.018%        41.190%
10             10             17.620%        58.810%
11             9              16.018%        74.828%
12             8              12.013%        86.841%
13             7              7.393%         94.234%
14             6              3.696%         97.931%
15             5              1.479%         99.409%
16             4              0.462%         99.871%
17             3              0.109%         99.980%
18             2              0.018%         99.998%
19             1              0.002%         100.000%
20             0              0.000%         100.000%
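If you’d rather reproduce the table than trust an online calculator, here is a minimal sketch using only the Python standard library. It assumes the same inputs as the post (n=20, p=.5); the names `N_SLOTS`, `P`, and `exact_probability` are invented for this example and are not from the calculator.

```python
from math import comb

N_SLOTS = 20  # nomination slots across the four writing categories
P = 0.5       # assumed probability that a given slot goes to one gender

def exact_probability(k: int, n: int = N_SLOTS, p: float = P) -> float:
    """Binomial probability of exactly k 'successes' in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

cumulative = 0.0
for k in range(N_SLOTS + 1):
    prob = exact_probability(k)
    cumulative += prob
    # columns: successes, failures, exact probability, cumulative probability
    print(f"{k:>2} {N_SLOTS - k:>2} {prob:8.3%} {cumulative:8.3%}")
```

Each printed row should line up with the corresponding row of the table, up to rounding in the last digit.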


Cool!  This gives us a frequency distribution.

The Normal Distribution

Binomial probability (what we just used to get the frequency distribution in the above table) with p=.5 and a reasonable number of data points is known to approximate a normal distribution, aka a bell curve.  Here’s a normal distribution via Wolfram Alpha of these data:

Notice that it’s centered around the mean (average) of 10, as we would expect.  We’ve got the number of nominees of a given gender on the x-axis (it doesn’t matter which gender we choose, as it’s symmetric — we could say the x-axis is the number of male nominees or we could say it’s the number of female nominees), and the percent probability we’ll land on that number of nominees on the y-axis.

Whether we look at the table or the graph, we’re hitting about a 17-18% probability of an even 10/10 split, and it drops off quickly on either side, until a 0/20 split in either direction has almost a 0% probability.
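To see how close the bell curve sits to the exact binomial values, the sketch below compares the two at a few split sizes. It assumes the textbook mean and standard deviation of a binomial distribution (n·p and √(n·p·(1−p))) as the parameters of the normal curve; the function names are invented for this example.

```python
from math import comb, exp, pi, sqrt

n, p = 20, 0.5
mean = n * p                  # 10
sd = sqrt(n * p * (1 - p))    # about 2.236

def binomial_pmf(k: int) -> float:
    """Exact probability of k nominees of one gender out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x: float) -> float:
    """Height of the normal curve with the same mean and standard deviation."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

for k in range(0, n + 1, 2):
    print(f"{k:>2}/{n - k:<2}  binomial {binomial_pmf(k):7.3%}   normal {normal_pdf(k):7.3%}")
```

At the 10/10 split the two values are about 17.6% and 17.8%, which matches the "17-18%" read off the table and the graph.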

Standard deviation

(I actually found the standard dev first and used that to graph the normal curve, but shhh!  I think it’ll make more sense to non-math people to write it in this order.)

One reason it’s so lovely to talk about standard deviations in a normal distribution is it gives us very pretty ranges that other people who know basic stats can easily grasp — if you say “more than a standard deviation from the mean,” people who know what standard deviation is will have an idea of how hefty a divergence that is.  Here’s a great visualization for standard deviation on a normal distribution:

Standard deviation diagram

By Mwtoews [CC BY 2.5], via Wikimedia Commons

As you can see, the dark blue is within 1 standard deviation of the mean and takes up 68.2% of the data.  The lighter blue shows going out another standard deviation from the mean, and the even lighter blue goes out to a third standard deviation from the mean, where the probability of landing is very close to zero.

Standard deviation has a complicated formula that’s beyond the scope of this post — I just used my calculator.  The standard deviation for these data is about 2.236.

For a normal distribution, that means 68% of the data fall within 2.236 of the mean.  In other words, 68% of the data fall within a difference of 2.236 from 10, or between 7.764 and 12.236.
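For anyone who wants to check the 2.236 without a calculator: the general formula is fiddly, but for a binomial distribution specifically the standard deviation has a short closed form. As a worked sketch with n = 20 and p = 0.5 (the symbols μ for the mean and σ for the standard deviation are the usual textbook notation, not something from the calculator output):

```latex
\mu = np = 20 \cdot 0.5 = 10
\qquad
\sigma = \sqrt{np(1-p)} = \sqrt{20 \cdot 0.5 \cdot 0.5} = \sqrt{5} \approx 2.236

\mu \pm \sigma = 10 \pm 2.236 \;\Rightarrow\; [\,7.764,\ 12.236\,]
```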

It’s easy to check that this is about right: if we go to our table above and add the “exact probability” column for 8, 9, 10, 11, and 12, we get a bit above 70%.  It’s not exact because our frequency distribution is only approximating the normal distribution, but it’s a very good approximation, and it’s generally considered an appropriate model for binomial distributions with non-extreme probabilities and a reasonable number of trials.[2]

One Standard Deviation, Two!  Three Standard Deviations, More!

Remember that about 68% of the data will fall within 1 standard deviation of the mean, 95% will fall within 2, and 99.7% will fall within 3.  In other words, another advantage of standard deviation is that it gives us some nice arithmetical shortcuts, as follows:[3]

  • 1 standard deviation:  7.764 – 12.236
  • OR: About 68% of the time, the gender split will be 8/12 or closer.
  • 2 standard deviations: 5.528 – 14.472
  • OR: About 95% of the time, the gender split will be 6/14 or closer.
  • 3 standard deviations: 3.292 – 16.708
  • OR: About 99.7% of the time, the gender split will be 4/16 or closer.

And finally:

  • A gender split wider than 4/16 is an extreme outlier.[4]
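The shortcut ranges above can also be checked against the exact binomial numbers. This is a rough sketch under the same assumptions as the rest of the post (20 slots, 50/50 odds); the variable names, including the `bands` dictionary, are made up for the example.

```python
from math import ceil, comb, floor, sqrt

n, p = 20, 0.5
mean = n * p
sd = sqrt(n * p * (1 - p))

def pmf(k: int) -> float:
    """Exact binomial probability of k nominees of one gender."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

bands = {}  # number of standard deviations -> (low, high, exact probability)
for k_sd in (1, 2, 3):
    low = ceil(mean - k_sd * sd)    # smallest whole count inside the range
    high = floor(mean + k_sd * sd)  # largest whole count inside the range
    prob = sum(pmf(k) for k in range(low, high + 1))
    bands[k_sd] = (low, high, prob)
    print(f"{k_sd} SD: {low}/{n - low} through {high}/{n - high}, "
          f"exact probability {prob:.3%}")
```

The exact figures come out to about 73.7%, 95.9%, and 99.7%, so the normal-curve shortcuts are a little conservative at one standard deviation and very close at two and three.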

Note that though a split wider than 4/16 suggests something very statistically unlikely is going on, it does not say why, and it does not assign intent.  My lived experience suggests that intentional sexism should not generally be assumed when systemic bias will suffice, and in a process like writing, publishing, publicity, and awards nominations, there are plenty of stages at which institutional bias can manifest itself.  This does not, of course, mean there is not a problem — in fact, it would mean the problem may be one that requires more thought, awareness, and effort to address.

I’ll further note that if you consider the years 2010-2014 (none of which had fewer than 7 nominations for either gender) and compare them to 2015,[5] and this leads you to conclude (along with a preponderance of other data, I am aware) that something untoward happened in 2015, even one person or one small group of people with a particular subgenre taste having chosen a fantastically statistically unlikely slant of genders still does not imply malicious sexism.[6]  What it does imply, in my opinion, is a variety of other extremely upsetting problems, exacerbated by the fact that nonmalicious sexism can be much, much harder to combat.

So.  What was the gender split in the writing categories this year?

  • 3/17.
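For a sense of scale, here is a small sketch of how often a split of 3/17 or wider would turn up under the 50/50 starting assumption. It is only an illustration of the arithmetic: with p=.5 every one of the 2^20 possible outcomes is equally likely, so counting combinations is enough. The variable names are invented for the example.

```python
from math import comb

n = 20  # nomination slots

# Probability that one particular gender gets 3 or fewer of the 20 slots.
one_sided = sum(comb(n, k) for k in range(0, 4)) / 2**n

# Probability of a 3/17 split or wider in either direction (symmetric at p = .5).
two_sided = 2 * one_sided

print(f"3 or fewer for one given gender: {one_sided:.3%}")
print(f"3/17 or wider, either direction: {two_sided:.3%}")
```

That works out to roughly 0.13% for one given gender, or about 0.26% in either direction, which is somewhere around 1 year in 400 if nothing but chance were at work.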



Footnotes

  1. Yes, I’m aware there are factors affecting that 50/50 probability, even in years that aren’t this one — potentially factors at every step in the publishing process, not just the nominating-for-awards stage.  This post could be, in that vein, viewed like a proof by contradiction — I’m showing the probabilities of expected fluctuations, and if you’re seeing greater extremes, that might indicate the starting assumption of 50/50 gender blindness at all steps is, in fact, incorrect.
  2. This distribution definitely has a non-extreme p.  I tried to figure out if 20 trials is a reasonable number for approximating via a normal distribution and didn’t get anything definitive, although I did compare by hand and the numbers all seemed pretty close.  But if you distrust the model, notice that I’m really only using this one to make relatable statements about the exact raw data that you can look at in the table above: if you want to, you can define your own terms to look at probability ranges by adding the numbers in the third column, and you’ll come to the same conclusions.  In other words, about 70% of the data fall between 8 and 12 whether we use the vocabulary “within one standard deviation of the mean on a normal distribution” or not.
  3. You could, again, find the exact percentages by adding the numbers in the table.  But this is faster.
  4. As far as I know “outlier” doesn’t have a specific statistical definition, but I’ve seen it used to mean “three or more standard deviations from the mean,” so that’s what I’m doing here.
  5. If you do compare, be aware that some of those years had more or fewer than 20 nominations (presumably because of ties or the 5 percent rule), and I’ve not accounted for those sorts of variations here.  The ideas should be broadly applicable, however, and if we’re speaking roughly, I’ll note that 4 out of the 5 years from 2010-2014 had at least 8 nominations from both genders, and the other year had a 7/11 split, which is perfectly in line with the numbers above: if 4/5 years fall within the 68% (roughly) and 1/5 falls outside the 68% but within the 95%, that’s about what we’d statistically expect.
  6. Well, at least one person involved has nonfiction writings that would support such a conclusion, but I will not extend his philosophies to the rest.

About the author

SL Huang (aka MathPencil)

SL Huang justifies an MIT degree by using it to write eccentric mathematical superhero books. Debut novel: Zero Sum Game, a speculative fiction thriller.
Twitter: @sl_huang


Leave a comment
  • Please be kind in the comments. Yes, even if you’re agreeing with me. I have some RL stuff happening right now that makes high emotions difficult to be around, so I respectfully request that you express those in other places. 🙂

    You’re welcome to tell me my math is wrong (in fact, if it is, please do!), but I would ask that the request of kindness extend to myself as well, even if I’ve made a factual error.

  • Hi, another math geek and bookworm here! (I really enjoyed your Zero Sum Game, btw).

    I have some background in stats, but I’m by no means a full on PhD, so I could be mistaken, but I don’t think you need the normal approximation at all and can just use the binomial distribution directly because:
    You already have the cumulative distribution of the Bin(n=20, p=0.5). Mean and S.D calculations are much easier with the binomial than the normal. Mean=E(x)=n*p=20*0.5=10, Variance=n*p*(1-p)=20*0.5*0.5=5, so standard deviation is sqrt(variance)=sqrt(5)=2.236.
    Reading directly from the cumulative table, P(X<=16.7)=99.871%, so three standard deviations is a 16/4 split. A 17/3 split gives P(X<=17)=99.980%, which, yikes.

    • Yup, agreed, the normal distribution isn’t needed at all — I used it mostly because it’s a familiar frame of reference for people (for some values of familiar, hahaha). And also because I’m REALLY LAZY when it comes to adding up columns of numbers….it did not actually occur to me that I could just look at the cumulative probability, even though that’s exactly what I did in a different way in my comment to Jim Hines’ post. *smacks forehead*

      But yeah, you’re absolutely right, and wow, the exact numbers are even MORE dramatic.

  • I found this from Jim Hines’ blog. I want to say well done. This is a very good probability analysis, and you followed up with a generous, and IMO, correct conclusion.

    It’s nice to see rigorous methods applied. I have read other analyses of the Hugo awards that lacked even the basic understanding of the difference between historical average and probability. I prefer it when analyses are based on verifiable assumptions, i.e. gender, as opposed to political leanings. Even the caveat that you assumed a binary gender divide, while not ideal, is a data-based assumption (because of the lack of easily found gender distribution numbers).

    I think we’d all benefit from a closer look at our starting assumptions. Cheers and well done.

    • Thank you very much! Yeah, I tried to be clear about what the contradiction of the starting assumptions does or doesn’t say…it’s terribly irritating to me when people use bad stats to support a conclusion not in evidence, even if the bad math would support my general positions (actually doubly so when it would support my general positions!).

      Thanks again for reading 🙂

    • You mean the one where even though women are only 1/3 of submissions, when selecting for quality the result was a 50:50 split for publication:

      “Of the four authors Bella and I have taken on this year – two of them are women”

      That *is* a fantastic article for supporting the use of 50/50 to derive the standard deviation for this analysis.

      • I’ve seen that article used as a justification that more males should be nominated than females due to the number of published works skewing heavily towards the male gender. The distribution of gender among the nominees is exactly equal to the distribution of gender among the general population of published, eligible works if and only if the nomination process were truly randomized. Since we know that the Hugo process is a popularity contest, we know that the nominees are not chosen at random.

        Or, to compare it to the U.S. presidential nominations, based on general population of people who meet the constitutional requirements to be president, we would expect 50% of the nominees to be women IF the nomination process were truly random. It is not. A quick look at the past few elections shows that the total nominees were not divided equally along gender lines. Even though the constitution does not require the president to be male, the nominees skew heavily towards the male gender. So, there exists some subset of the general population that is made of potential nominees. The same is true for the Hugo awards.

        For the set of potential nominees to be exactly equal to the set of published works, it would mean that every single nominator would have read the entire set of eligible published works in that year. I know this is not true because I nominate and I cannot read every eligible work published in a year. None of my friends who nominate do either. (For the record, neither do the ones who do not nominate.) So, there is some function that maps the set of potential nominees onto the set of published works. It would take a rigorous study to determine the variables and weighting factors of that function. Even then, without a consistent panel of nominators year in and year out, it would be very messy.

        This analysis is acceptable because the host’s assumption that the population to draw from is the gender of the author is valid. An author, as defined in a binary gender, is either male or female, which means that the population to draw from is male or female. She assumes a binary distribution because that is what we have statistical numbers for. When randomly drawing a gender from the set of authors, there is a 50% probability. Despite some claims that certain people are more male or more female than others, there is no objective weighting factor to back up these claims in a binary gender. Therefore, a 50% probability is still valid.

        Her assumption simplifies the population down to one characteristic. Therefore, it can only tell us about that subset. It says nothing about the trends of the population that subset is drawn from. Without knowing the exact gender distribution in the population of potential nominees, we cannot weight one gender more than another.

        What her analysis tells us is that the 2015 Hugo nominations process produced a statistically unlikely result. Potentially, it could have been an entirely random process that produced a highly unlikely result. BUT the slate voting tactic is known to have taken place, which means we know that the process wasn’t entirely random. The analysis makes no moral or ethical judgments about the process, and I think the host made an effort to say that. The only conclusion to be drawn is that the result was statistically unlikely, and paired with other evidence, we can definitively say that the process was not randomized.

      • There’s a depressing bit from a friend’s 2010 discussion of women in British SF:

        “If we ask how many British women are publishing original adult science fiction with a major genre publisher in Britain, the answer is pretty bleak: with neither Liz Williams nor Gwyneth Jones having contracts at the moment, I think the answer may be just one writer, Jaine Fenn. [Edit: As of next year, thanks to a change in publisher, Sophia McDougall will meet these criteria; there is also the mysterious RJ Frith.]”

        • Various British editors are notorious for producing anthologies that somehow turn out to have few, or no, women.

          Martin Wisse, I believe, has made a case that the current malaise in British SF can be explained as the result of more than a decade of all the promising women in British SF being driven, actively or passively, from the field.

        • Maybe you can tell me this then (it’s something I’ve been wondering for a while, but never could figure out who to ask), is the general climate towards women in SF/F in the UK the reason why there have never been more than a very few women writing Doctor Who tie in novels? That always struck me as odd given how much of the fanfic community is female, and how in the heyday of Star Trek Novels they were largely by women.

    • The one that’s for publishing in the UK, and at a single publisher at that?

      Strange Horizons’ records of books received by Locus for review show 1) that UK gender disparity is much greater than US gender disparity 2) that of US books, women are about 45% of authors.

      I certainly understand why the Puppies are so fond of basing their suppositions on an area where the gender disparity is greater, but the Hugo Awards, while having some international character, are at the moment mostly US-driven.

      • I have read through your link.

        “Some of the publications we have included in the main count are US- or UK-specific in their coverage; others cover books published in both countries. We have therefore provided country gender breakdowns as well as the overall count. Of note, this year’s proportion of books by women/non-binary individuals is the lowest recorded in the SF Count to date, both overall (39.9%) and in the US (42.0%) and UK (31.3%)”

        I guess you use numpad for typing numbers. 5 and 2 are just next to each other.

        And as one of those “nefarious” puppies (didn’t nominate, not voting, and frankly not caring one way or another for the actual results), I thank you for the link. The non-puppies have cried invalidity of the data, without really providing anything to back theirs.

        I like data… And with probability of 42%, the low percentage of women does look statistically significant at the 95% confidence level. As it does with the more universal percentage of 39.9%.

        Of course, since only a fraction of books are ever reviewed by science fiction publications, the above data could be criticized on that basis. Or even because it only includes science fiction.

    • I’m also a mathematician but not a statistician but my comment is not about the math, which looks spot on, but the assumption that p=0.5.

      I know we would LIKE p=0.5 because it would demonstrate equal access to SFF publishing based on gender but is that the case? What is the actual data of the gender of all authors who published SFF in 2014?

      For example, if you ran the numbers on gender of mathematicians you would get some pretty skewed data if one made the assumption p=0.5 (even though it SHOULD be) because the fact is that it is not, and in my opinion that’s a bad thing.

  • So far as I can tell, you start by assuming that the “right” answer is to have a 50/50 male/female split. Then you notice that the split is not 50/50, and you posit that discrimination must be the result.

    So far as I can tell, your math is right, but so what? Your math is just recapitulating the starting assumption. It’s easy enough to tell that more men than women have historically been nominated (and are nominated this year) by just looking at the list of names. Which, in most cases, correspond to binary gender identity. Though historically it helps to know that “CJ Cherryh” and “James Tiptree, Jr.” were women.

    The real question is why it is that more men than women write award-winning science fiction stories. It might be discrimination on the part of publishers or fans. Or it might be gender-based differences in aptitude for or interest in writing science fiction or fantasy. If we found out that more women than men write successful romance novels, would we assume that it was the result of anti-male discrimination? Or is it possible that women are just more into romance novels than men?

    I am not suggesting SF is as skewed male as romance is skewed female. And indeed a perusal of the list reveals quite a few award-winning female sf writers — CJ Cherryh, Lois Bujold, and Joan Vinge all come to mind. But some skew could easily account for the differences we observe.

    • But you’re assuming that the answer is NOT 50%? And so far as I can tell, everyone is basing “Way more men write SF/F” on one article about submissions of one genre (novels) from one branch (UK) of one publisher (Tor). Which article also said that publication tended to be more gender balanced than submissions, but provided no data in that regard.

      If you look at the Locus data for submissions for reviews (again for novels), there is a small disparity favoring men, but probably not enough of one to sink our host’s numbers. As I said in the first comment, I’m not terribly mathy, but I’m reasonably confident you could run the numbers on a 55/45 split, and they wouldn’t be wildly different.

      It would still, most likely show, that a slate of, say (since we have data there), five novels ALL of which were by men what the Puppies originally put forward, incidentally, would statistically unlikely if people were just happening to read what came out with no bias whatsoever.

  • Um, my commas and I would like to collectively apologise for that last paragraph.

    Let’s try, “running with a 45/55 split would still likely show, to take the non-random example of novels (since that’s what the Locus data is for), that a 5/0 gender split would be statistically unlikely if people were just happening to read what came out with no bias whatsoever. The Puppies, incidentally, did not nominate any novels by women.”

