This blog is a review of “The Largest Analysis of Film Dialogue by Gender, Ever,” an article and data visualization from The Pudding—a weekly journal of visual essays. The authors of the study, Hanah Anderson and Matt Daniels, analyzed the gender breakdown in approximately 2,000 screenplays to see if more film dialogue is spoken by male characters.
In short, Anderson and Daniels’s study did find that male characters have more dialogue than female characters. However, rather than take this at face value, let’s explore the methodology and results of the study.
This study analyzed the gender breakdown of dialogue in roughly 2,000 publicly accessible screenplays. Anderson and Daniels write, “We didn’t set out trying to prove anything, but rather compile real data. We framed it as a census rather than a study.”
With each screenplay, Anderson and Daniels then “mapped characters with at least 100 words of dialogue to a person’s IMDB page.” I wasn’t entirely sure what this meant, so I did a little more digging. Luckily, the end of the article links to an FAQ that addresses concerns about the study’s methodology and data. What I found was that Anderson and Daniels only included characters with at least 100 words of attributed dialogue in their analysis. They then tried to identify or confirm each remaining character’s gender from IMDb pages, which list the actress or actor who played the role. If an IMDb page was unavailable, they relied on the pronouns used in the screenplay. It is important to note that the authors eliminated screenplays that were too inconsistent with the film cast listed on IMDb.
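To make the filtering step concrete, here is a minimal sketch of how a per-film gender share could be computed once each character has a gender and a word count. The tuple layout, the function name `gender_share`, and the sample cast are my own assumptions for illustration; the article does not publish its code in this form.

```python
from collections import defaultdict

MIN_WORDS = 100  # threshold described in the article's FAQ


def gender_share(characters, min_words=MIN_WORDS):
    """Return each gender's share of dialogue words among characters
    that meet the word threshold.

    `characters` is a list of (name, gender, word_count) tuples. This
    layout is an assumption for illustration, not the authors' format.
    """
    totals = defaultdict(int)
    for _name, gender, words in characters:
        if words >= min_words:  # characters under the threshold are skipped
            totals[gender] += words
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {gender: words / grand_total for gender, words in totals.items()}


# Hypothetical cast; the numbers are invented for illustration only.
cast = [
    ("Lead", "male", 6000),
    ("Partner", "female", 3000),
    ("Clerk", "female", 50),  # dropped: under 100 words
]
shares = gender_share(cast)
```

In this made-up cast, the clerk's 50 words never enter the totals, so the split is computed over the remaining 9,000 words only.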
LIMITATIONS OF METHODOLOGY
There are some obvious problems with using screenplays to measure the gender divide in film dialogue. For one, films can change significantly from script to screen. Creators rewrite lines, cut lines, add characters, or cast an actor of a different gender than the screenplay indicated.
Additionally, Anderson and Daniels’s analysis only included characters with at least 100 words of dialogue, which cut out minor characters. Schindler’s List, for example, features a few minor female characters who speak fewer than 100 words. Thus, while the film is listed as having “100% of Words are Male,” the measurement would actually be closer to 99.5 percent male dialogue.
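The effect of the 100-word cutoff is easy to see with a little arithmetic. The sketch below uses invented word counts, chosen only so the numbers land on the same 100 percent versus 99.5 percent gap; they are not the real counts for Schindler’s List, and the helper `male_share` is hypothetical.

```python
def male_share(word_counts, min_words=0):
    """Male share of words among characters with at least `min_words` words."""
    kept = [(gender, words) for gender, words in word_counts if words >= min_words]
    total = sum(words for _, words in kept)
    male = sum(words for gender, words in kept if gender == "male")
    return male / total if total else 0.0


# Invented counts, chosen only so the arithmetic is easy to follow.
counts = [("male", 12000), ("male", 7900), ("female", 60), ("female", 40)]

with_threshold = male_share(counts, min_words=100)  # minor characters dropped
without_threshold = male_share(counts)              # every character counted
```

With the cutoff, the two minor female characters vanish and the film reads as 100 percent male; without it, the same film comes out at 99.5 percent. The distortion is real but small.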
Anderson and Daniels confess these limitations freely in their article. In fact, the Schindler’s List example is theirs. However, despite admitting some possible errors with individual films, they said they believe their results are still “directionally accurate.” Can they reasonably make this claim?
Anderson and Daniels also relied on publicly available screenplays, which could skew the results of the study. While unlikely, it is possible that publicly available screenplays overrepresent male-dialogue-driven films. Or they may carry some other bias or confounding variable we are unaware of.
The final results of the study showed that almost 76 percent of screenplays had the majority (60 percent or more) of their dialogue spoken by male characters. That is 1,513 out of 2,000 screenplays.
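For anyone who wants to check the headline figure, here is a short sketch. It treats the total as exactly 2,000 screenplays, as quoted above, and the helper `male_majority_share` along with the per-film shares in `example` are assumptions of mine, invented to show how the 60 percent cutoff would be applied.

```python
def male_majority_share(male_shares, threshold=0.60):
    """Fraction of films whose male dialogue share is at or above `threshold`."""
    if not male_shares:
        return 0.0
    flagged = sum(1 for share in male_shares if share >= threshold)
    return flagged / len(male_shares)


# Check the headline figure using the counts quoted in this post.
headline = 1513 / 2000
print(f"{headline:.2%}")  # prints 75.65%

# Hypothetical per-film male shares, invented to show the classification step.
example = [0.95, 0.72, 0.60, 0.55, 0.31]
example_share = male_majority_share(example)  # 3 of the 5 films meet the cutoff
```

So “almost 76 percent” is a fair rounding of 75.65 percent.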
Anderson and Daniels also looked specifically at Disney and Pixar animated movies, which have been called out before for lacking gender parity, and found that 22 of 30 films have male-majority dialogue.
Shockingly, in the movie Mulan, Mulan’s protector dragon Mushu, voiced by Eddie Murphy, has 50 percent more dialogue than Mulan herself. I assumed that as the lead of the movie, Mulan would have the most lines, but apparently more dialogue is spoken to or about Mulan than by her.
LIMITATIONS OF RESULTS
This study does not prove that Hollywood is sexist. Then again, the authors never claimed that their study “proves” anything. Daniels points out in the FAQ that this study only sheds light on one part of film representation: dialogue. While dialogue is an important piece, we cannot make any definitive claims from this data alone. For one, Anderson and Daniels did not factor screen time or context (the ways in which characters are portrayed) into their analysis. Additionally, the article does not contain statistical analysis to check for significance. In short, while the information is interesting, we cannot extrapolate or make large generalizations from it. We have to take it for what it is: one slice of a larger pie.