In Which I Do Math on Gender, Again

Sometimes I see a “top X” list that’s, shall we say, all male (think lists of top scientists, recommended SFF authors, etc.).  And when people object, others defend against the objection with, “But what if the field’s mostly male???”

There are a whole host of problems with this, but I’m not really going to get into them here.  I’m just going to do some math.

As far as I can figure, my starting assumptions are only that (1) we expect Top-X lists to sample gender randomly — that is, that a male person in the field is not automatically expected to be better than a non-male person already in the field, and (2) there is no institutional sexism going on beyond whatever might cause the gender skew in the first place.

If we see unlikely Top-X lists, one of these assumptions must be wrong.

Let’s look at a Top-10 list.

The Approximate Probability of an All-Male Top-10 List

If a field is 50% male, the likelihood that a Top-10 list will be entirely male is .098%.

If a field is 60% male, the likelihood that a Top-10 list will be entirely male is .60%.

If a field is 70% male, the likelihood that a Top-10 list will be entirely male is 2.8%.

If a field is 75% male, the likelihood that a Top-10 list will be entirely male is 5.6%.

If a field is 80% male, the likelihood that a Top-10 list will be entirely male is 11%.

If a field is 90% male, the likelihood that a Top-10 list will be entirely male is 35%.

I note that even in the most extreme case — 90% male is a VERY extreme gender skew — only about 1/3 of Top 10 lists would be expected to be composed entirely of men.

This math is very easy, by the way, and you can replicate it quickly for any Top-X list and any percentage of men.  If m is the percentage of men written as a decimal, just raise m to the power of X.  So to find the likelihood a Top-25 list is all-male if you suspect a field of being 3/4 male, you would do (.75)^25 (which incidentally equals .075% — in other words, it’s EXTREMELY unlikely for even a field that is 3/4 male to have a Top-25 list that is all-male).
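If you'd rather let a computer do the exponentiation, here is a minimal Python sketch of the m^X rule described above. The function name `prob_all_male` is made up for illustration, and it relies on the same assumption as the post: each slot on the list is an independent draw from a field where a fraction m is male.

```python
def prob_all_male(m: float, x: int) -> float:
    """Chance that all x slots on a Top-x list go to men.

    m is the fraction of the field that is male (0.75 for 75%).
    Assumes every slot is an independent random draw from the field.
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must be a fraction between 0 and 1")
    if x < 0:
        raise ValueError("x must be non-negative")
    return m ** x


if __name__ == "__main__":
    # Reproduce the Top-10 figures for the percentages listed in the post.
    for m in (0.5, 0.6, 0.7, 0.75, 0.8, 0.9):
        p = prob_all_male(m, 10)
        print(f"{m:.0%} male field: {p:.3%} chance of an all-male Top-10")

    # The Top-25 example for a field that is 3/4 male.
    print(f"75% male field: {prob_all_male(0.75, 25):.3%} chance of an all-male Top-25")
```

Swap in your own m and X to check any other list.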

Whether any skewed percentages are a result of other biases in the first place is, of course, another discussion.  But if you find yourself with extremely probabilistically unlikely Top-X lists even given skewed percentages, then maybe it’s worth thinking about why that might be.  And if the “you” in question is a magazine, bookstore display rack, publisher’s promo list, other Official Book Industry Recommendation List, review blog, fanzine, etc. . . . it might be worth urging your staff to be somewhat less unlikely.

Math note: I’ve sampled with replacement here, on the assumption that the field is big enough relative to X that removing up to X people for the list has not changed the gender ratio among the population of people not on the list. If you have a (relatively) small field or a large list, the math becomes more complicated.
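For the small-field case, one standard way to handle it is to sample without replacement, which gives a hypergeometric probability. The sketch below is an illustration under that assumption, not the calculation used in the post; the function name `prob_all_male_exact` and the example head counts are hypothetical.

```python
from math import comb


def prob_all_male_exact(men: int, total: int, x: int) -> float:
    """Chance that x people drawn WITHOUT replacement are all men.

    men   -- number of men in the field
    total -- total number of people in the field
    x     -- size of the Top-x list
    """
    if not 0 <= men <= total:
        raise ValueError("men must be between 0 and total")
    if not 0 <= x <= total:
        raise ValueError("x must be between 0 and total")
    # Ways to pick x men, divided by ways to pick any x people.
    # comb(men, x) is 0 when x > men, so the result is then 0.0.
    return comb(men, x) / comb(total, x)


if __name__ == "__main__":
    # Hypothetical small field: 40 people, 30 of them men (75% male).
    small = prob_all_male_exact(30, 40, 10)
    # Hypothetical large field with the same 75% ratio.
    large = prob_all_male_exact(300_000, 400_000, 10)
    print(f"small field: {small:.3%}")
    print(f"large field: {large:.3%}")
    print(f"m^X shortcut: {0.75 ** 10:.3%}")
```

In a small field the all-male list is even less likely than the m^X shortcut suggests, because every man placed on the list leaves fewer men to draw from. In a large field the two numbers are nearly identical.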

Comments are open, but I may not have time at the moment to respond (I still haven’t caught up on the comments for my LAST gender and math post, argh I am the worst!). Comments will still be moderated if necessary — please be kind to each other.

eta: Even though I triple-checked, I made a copy-paste error — the 50% line initially read .0098% instead of .098%. SORRY!

About the author

SL Huang (aka MathPencil)

SL Huang justifies an MIT degree by using it to write eccentric mathematical superhero books. Debut novel: Zero Sum Game, a speculative fiction thriller.
 
Website: www.slhuang.com
Twitter: @sl_huang

7 Comments

  • If I may quote you here:

    This math is very easy, by the way, and you can replicate it quickly for any Top-X list and any percentage of men. If m is the percentage of men written as a decimal, just raise m to the power of X. So to find the likelihood a Top-25 list is all-male if you suspect a field of being 3/4 male, you would do (.75)^25 (which incidentally equals .075% — in other words,

    OY. Tell you what, pencil, I need them other woid thingies. But I THINK I caught the gist of what you are saying: somethin’s fishy in Denmark der. Something is askew. Math don’t lie. Probabilities and statistics are quantifiable and now kk is talking out her arse. I ain’t a math person, dang it. But I do remember reading something about the Edgar Awards this past year, and how the field for Best Novel was all men, not one woman in the bunch, and I thought, WAIT. Surely there were some great novels written by women who were up to snuff, right? Surely one or two, at least, deserved to be contenders. Which would have, by default, at least given them a shot.

    Maybe it wasn’t the Edgars. Poo.

    Regardless, as I said: math don’t lie. I don’t pretend to understand how you are showing what you are showing, nor even WHAT you are actually showing in your precise and amazing way, pencil. But my spidey senses are pricklin’ and a’cracklin’, and they are telling me, something doesn’t add up, and THAT’s the point.

    Right, pencil?

  • Verrah thought-provoking. In fact, I had a very deep discussion about it with Mike. 😀 (He checked your math just to be sure, cause “she’s only from MIT”.)

  • Okay, so… let’s look at the flip side of this. If we can observe a sample of Top-x lists, determine the proportion that are all male, then we can work out the proportion of males in that field. If I’ve done my maths correctly…

    For a Top-10 list:
    50% of lists being all male = 93% men in industry
    75% of lists being all male = 97% men in industry
    90% of lists being all male = 99% men in industry
    99% of lists being all male = 100% men in industry (actually, it’s 99.8995%)

    It gets worse for a Top-25 list:
    50% of lists = 97% men in industry
    75% of lists = 99% men in industry

    I would be fascinated to see if anyone has a reasonable sample of Top-x lists to analyse (barring Top-10 Women in Business, which is an entirely different issue)

  • Very cool! This kind of mathematical explicitness makes it really clear that even in a skewed field, an all-male list probably reflects a skewed view of that field.

    One thing I’m wondering about is whether the difference between a Top-X list and a random sample of X matters. (This is based on stuff I’ve seen/read about there being more male outliers than female ones in various areas.) Suppose that whatever kind of greatness you’re selecting for has something like a normal distribution (and that the shape of the distribution is the same for both* genders). If we take the “top” 10 individuals (from the high end of the bell curve) in a field consisting of 30% women and 70% men, are we more likely to end up with 10 men than we would have been if we had picked at random? (I think we might be, but I don’t actually know enough math to work it out for myself with any real confidence.)

    On the other hand, once we try to bring into the picture whatever quality “Top-10” lists are trying to reflect, there are a lot of other potential complications that don’t lend themselves as easily to mathematical modelling. If a field is strongly skewed towards men, then perhaps a woman has to be very, very good to break into it at all—in which case the assumption that the distribution is the same between* genders is unwarranted. Perhaps, in fact, we should expect a much higher proportion of the underrepresented gender among the best exemplars of the field than we find in the field as a whole.

    *I’m treating gender as binary for purposes of mathematical simplicity only.

  • “(1) We expect Top-X lists to sample gender randomly — that is, that a male person in the field is not automatically expected to be better than a non-male person already in the field.”

    The part of the above statement after the dash is not quite the same as the part of the above statement before the dash.

    Suppose that both the men and the women in a field form normal distributions, with peaks at the same points. So this meets the requirement of the part after the dash–a randomly chosen woman has a 50% chance of being better or worse than a randomly chosen man.

    But if the SD for the distribution of men is higher than the SD for women, we could still expect top ten lists (and bottom ten lists!) to be composed mostly of men. Etc. Like what Q. said re. outliers.
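The "flip side" calculation from the comments (going from the share of all-male Top-X lists back to the implied share of men in the field) is just an X-th root. A rough sketch under the same independent-draw assumption as the post; `implied_male_fraction` is a made-up name for illustration.

```python
def implied_male_fraction(p_all_male: float, x: int) -> float:
    """Fraction of men a field would need for Top-x lists to be
    all-male with probability p_all_male, assuming independent draws.

    This inverts p = m ** x, so m = p ** (1 / x).
    """
    if not 0.0 <= p_all_male <= 1.0:
        raise ValueError("p_all_male must be between 0 and 1")
    if x <= 0:
        raise ValueError("x must be positive")
    return p_all_male ** (1.0 / x)


if __name__ == "__main__":
    # Same inputs as the comment: share of lists that are all male.
    for x in (10, 25):
        for p in (0.5, 0.75):
            m = implied_male_fraction(p, x)
            print(f"Top-{x}: {p:.0%} of lists all male -> {m:.1%} men in field")
```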


Copyright © 2014. Created by Meks. Powered by WordPress.
