The Moment of Lift: How Empowering Women Changes the World
Gender bias in the news media has gotten a lot of attention lately, and the variations in news coverage caused by gender have been studied from a variety of angles. However, quoting, which is common in all forms of news coverage, has received insufficient attention. Quoted material in the news helps us to look at gender bias in the news from a new perspective because quotes are straightforward, direct, and accurate expressions of the speaker's point of view.
As a result, we ask a new set of questions: Is it true that male speakers speak "louder" than female speakers? If so, is this caused by news coverage or by gender itself? Is it also possible to derive a portrait of each gender from quotations?
In short, our data story examines not just the preferences of different news outlets when quoting speakers of various genders, but also the impact of various political parties and countries on gender prejudice. In addition, we investigate gender prejudice in self-defined themes and at the linguistic level so as to better understand why bias appears in quotations.
Our research is based on the Quotebank dataset, which covers the period from January 1, 2015 through April 30, 2020. Following the wrangling of the original dataset, we now have:
Because there were a tiny number of sexual minorities in the sample, we categorized them as "others" and limited our study to binary gender representations.
From January 1, 2015, through April 30, 2020, we show three types of gender percentages. (The total number of quotations that appear in the media is referred to as the Occurrence. The number of different speakers in the dataset is referred to as Speaker. And the number of different quotations is represented by the term Quotation.)
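The three aggregations defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (quote id, speaker, gender, number of occurrences) and the toy values are hypothetical and do not reproduce the exact Quotebank schema or our real counts.

```python
from collections import defaultdict

# Hypothetical toy records: (quote_id, speaker, gender, num_occurrences).
# The layout is illustrative only, not the exact Quotebank schema.
records = [
    ("q1", "Speaker A", "male", 5),
    ("q2", "Speaker A", "male", 1),
    ("q3", "Speaker B", "female", 2),
    ("q4", "Speaker C", "male", 2),
]


def gender_shares(rows):
    """Return the share of each gender under the three aggregations."""
    occurrences = defaultdict(int)  # total times quotations appeared in the media
    quotations = defaultdict(int)   # number of distinct quotations (one row each)
    speakers = defaultdict(set)     # distinct speakers per gender
    for quote_id, speaker, gender, n_occ in rows:
        occurrences[gender] += n_occ
        quotations[gender] += 1
        speakers[gender].add(speaker)
    speaker_counts = {g: len(names) for g, names in speakers.items()}

    def normalise(counts):
        total = sum(counts.values())
        return {g: c / total for g, c in counts.items()}

    return {
        "Occurrence": normalise(occurrences),
        "Speaker": normalise(speaker_counts),
        "Quotation": normalise(quotations),
    }


shares = gender_shares(records)
```

Because a single prolific speaker contributes many quotations and occurrences but only one entry to the speaker count, the three percentages can differ for the same data.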
At first glance, it's easy to notice that over 80% of the speakers quoted by sources are men, which is consistent with past gender bias research. You may be tempted to assume that there are many more males than females on the planet; however, this is incorrect for the world's population, and even for citizens of the United States (for both, the gender ratio is about 1:1). Interestingly, the percentage of women is around 5% greater in the Speaker aggregation than in the other two, implying that although women account for about a quarter of all speakers, news outlets quote them less often. We are still far from gender equality, as evidenced by women's underrepresentation in society and in news articles.
To be more specific, we show the top ten most quoted speakers each year from 2015 to 2019. As we can observe (unsurprisingly), men have dominated the news quotations: men make up the majority of the top ten speakers.
“According to the Interparliamentary Union, 77 per cent of the world’s parliamentarians are male, and only two out of 193 parliaments (in Rwanda and Bolivia) comprise at least 50 per cent of women.” 
The majority of the data in this dataset (Quotebank) comes from the news media, where politics is constantly a hot topic. Women, on the other hand, do not generally hold a dominant position in politics. It is therefore plausible that the majority of the Top 10 speakers are men, and that the number of female quotations is significantly lower than the number of male quotations.
Is the bias still present if we control for such possible causes? Let's take a look at Hillary Clinton and Donald Trump. In 2015, Donald Trump's number of quotations exceeded Hillary Clinton's by nearly two to one. However, because they were both candidates in the 2016 US presidential election, the difference in their social media exposure and influence (or even the actual vote outcomes) cannot by itself explain such a large gap in the number of quotations. The only probable sources of such bias are news organizations and perhaps society.
There needs to be a fundamental shift in the way societies view women in government, one that does not see them as mere seat-fillers or stats on a chart, they must be viewed as a vital contributing factor to the betterment of the world.
When it comes to providing background or analysis for reports, journalists have a lot of leeway, and the people who supply it are overwhelmingly male. This is possibly the most direct driver of gender bias in quotations. To verify our conjecture, we choose seven of the most significant global news agencies and study their gender preferences and evolving trends in quotes to perform a more detailed examination of gender bias in news quotations.
In addition, we look into whether the speaker's nationality and political party affiliation introduce gender bias in quotations. In other words, does the country or party a woman belongs to affect her ability to speak out?
A gender-equal society would be one where the word ‘gender’ does not exist: where everyone can be themselves.
Obviously, the terms used by males and females may have varying levels of exposure under different news topics. It makes sense to look at the gender ratio of quoted speakers under various topics in order to conduct a more detailed analysis. To extract related quotations from the data, we manually select 9 common news topics and define keywords for each of these topics. In addition, we classify the sentiment of statements into three classes (Positive, Negative, and Neutral) to see how it differs between men and women on various issues.
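A minimal sketch of this kind of keyword-based topic tagging is shown below. The three topics and the keywords are taken from examples discussed in this story, but the dictionary is only a small illustrative subset, and the function name `topics_for` is hypothetical.

```python
import re

# Illustrative subset only: a few topics and keywords mentioned in this story.
TOPIC_KEYWORDS = {
    "sports": {"football", "tennis", "swimming"},
    "entertainment": {"art", "music"},
    "gender": {"abortion", "sexual harassment"},
}


def topics_for(quotation):
    """Return the set of topics whose keywords occur in the quotation."""
    text = quotation.lower()
    matched = set()
    for topic, keywords in TOPIC_KEYWORDS.items():
        for keyword in keywords:
            # Word boundaries keep "art" from matching inside "party" or "start".
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                matched.add(topic)
                break
    return matched
```

Matching on whole words rather than substrings matters for short keywords: without the word boundaries, every quotation mentioning a political "party" would be tagged as entertainment.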
We begin by plotting the percentage of women speakers by month for each topic. As shown on the vertical axis of the chart, these themes encompass a wide range of domains that commonly appear in the news. As can be seen, news media treat male and female voices differently depending on the subject.
For women, quotes related to education, health, lifestyle and entertainment are more prevalent than average, whereas in politics, people, business and sports, the percentage of women is below the global average shown in the three pie charts. To our surprise, the percentage of women speaking out is highest under the topic of gender, yet even there their voices only come close to men's without reaching parity.
Looking at the topics where most quotations come from men, it is easy to see that these fields tend to be dominated by men. Take sports, for example: according to the Forbes 2021 ranking of the top 50 highest-paid athletes, the highest-ranked woman, Naomi Osaka, is only 12th, and her annual earnings of $60 million are only one third of those of the premier sports star, Conor McGregor.
To summarize briefly, female quotation sources more often appear in caregiving roles, whereas male quotations are closely related to the sporting and business fields. This reflects the view of women as caregivers and of men as sports enthusiasts and breadwinners.
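The monthly percentages behind such a chart can be computed roughly as follows. This is a sketch under assumptions: the rows are hypothetical, each row stands for one quotation already tagged with a topic and a speaker gender, and the share is counted per quotation (counting distinct speakers instead would require a set per month).

```python
from collections import defaultdict

# Hypothetical rows: (date as "YYYY-MM-DD", topic, speaker gender).
rows = [
    ("2015-01-03", "sports", "male"),
    ("2015-01-17", "sports", "female"),
    ("2015-01-20", "sports", "male"),
    ("2015-01-21", "sports", "male"),
    ("2015-02-02", "sports", "male"),
    ("2015-01-09", "health", "female"),
    ("2015-01-11", "health", "male"),
]


def monthly_female_share(rows):
    """Map (topic, "YYYY-MM") to the fraction of quotations spoken by women."""
    counts = defaultdict(lambda: [0, 0])  # (topic, month) -> [female, total]
    for date, topic, gender in rows:
        key = (topic, date[:7])  # the first seven characters are "YYYY-MM"
        counts[key][1] += 1
        if gender == "female":
            counts[key][0] += 1
    return {key: female / total for key, (female, total) in counts.items()}


share = monthly_female_share(rows)
```

Each (topic, month) pair then becomes one cell of the chart, coloured by the resulting fraction.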
Let’s first look at gender. Notice that for certain painful topics, such as "abortion" and "sexual harassment", both men and women tend to express negative feelings. This may mean that these social problems have not been significantly improved or resolved in society.
How about entertainment? Interestingly, women tend to be positive about "art", while men are not (40.53% are negative). But when it comes to "music", only 9.46% of men are negative. It seems that males prefer "music" more than "art".
The most popular sports keyword for males is “football”, which is far more frequent than the others, especially “swimming”. This is consistent with common perceptions: when we talk about football games, the first image that comes to mind is usually that of male athletes running on the pitch. Women, by contrast, mention “tennis” more often (even more than such a common and popular topic as “football”). Put another way, football is considered more "masculine" than tennis.
In conclusion, for certain specific topics, such as "art" and "tennis", we can indeed see a noticeable difference. However, contrary to what one might expect, men and women show a similar sentiment distribution for the majority of common subjects.
It is difficult for a woman to define her feelings in language which is chiefly made by men to express theirs.
We already know that men and women tend to display dissimilar preferences towards different subjects (from heat map and sentiment analysis). Now we ask ourselves: can we predict gender (i.e. male or female) when we know the quotations?
If the difference in quotation style between males and females is not as significant as we expected, then the model we develop would be no better than a random guess (we cannot infer gender from what people say), i.e. 50% on balanced data. But if the model performs better than 50%, it indicates that quotations by different genders do differ systematically, and that the model has picked up this difference as a discriminative feature.
We choose a logistic regression model because this is a binary classification task. We label females as 0 and males as 1. Hence, a large positive coefficient indicates that a word is male-oriented, and a large negative coefficient indicates that it is female-oriented.
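To make the sign convention explicit, the model can be written as follows. The notation is ours and introduced only for illustration: $y = 1$ denotes a male speaker, $x_j \ge 0$ is the feature value of word $j$ in a quotation, $w_j$ is the coefficient of that word, and $b$ is the intercept.

```latex
P(y = 1 \mid \mathbf{x}) = \sigma\!\left(b + \sum_{j} w_j x_j\right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}

\log \frac{P(y = 1 \mid \mathbf{x})}{P(y = 0 \mid \mathbf{x})} = b + \sum_{j} w_j x_j
```

Since the feature values are non-negative, a word with $w_j > 0$ raises the log-odds that the speaker is male whenever it appears, while a word with $w_j < 0$ lowers them.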
We randomly select 1 million quotations for each gender in each year from 2015 to 2020, yielding a balanced training set of 12 million quotations.
We use the bag-of-words model and TF-IDF to convert the quotations into vectors, and select the 2,000 most frequent words as features to make predictions.
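The pipeline can be illustrated end to end with a small, dependency-free sketch. It is not the code used for this story: the six quotations are invented, the tokenizer is a plain whitespace split, the IDF uses one common smoothed variant, and the classifier is trained with simple batch gradient descent instead of a library solver. All function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def build_vocab(docs, max_features):
    """Keep the max_features most frequent words across all documents."""
    freq = Counter(tok for doc in docs for tok in tokenize(doc))
    # Sort by (-count, word) so that ties are broken deterministically.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:max_features]]


def tfidf_vectors(docs, vocab):
    """Bag-of-words counts reweighted by a smoothed inverse document frequency."""
    n = len(docs)
    bags = [Counter(tokenize(doc)) for doc in docs]
    df = {w: sum(1 for bag in bags if w in bag) for w in vocab}
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    return [[bag[w] * idf[w] for w in vocab] for bag in bags]


def train_logreg(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted P(male) minus label
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


# Invented toy quotations; labels follow the story: female = 0, male = 1.
quotes = [
    "we won the football match",
    "the football team played well",
    "our business profits grew",
    "my family supports me",
    "the children need care",
    "family and children come first",
]
labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocab(quotes, max_features=2000)
X = tfidf_vectors(quotes, vocab)
weights, bias = train_logreg(X, labels)
coef = dict(zip(vocab, weights))

# Words with the largest positive and most negative coefficients.
most_male = sorted(coef, key=coef.get, reverse=True)[:3]
most_female = sorted(coef, key=coef.get)[:3]
```

Ranking the vocabulary by coefficient, as in the last two lines, is the step that yields a list of the most gender-indicative words. On real data the vocabulary would be capped at 2,000 words rather than the handful present here.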
In our task, the R-squared is 59.65%. As discussed above, this means the model does have some interpretive power. Let us look at the most predictive words and see what we get.
The figure below shows the top 50 most predictive words for males and females respectively. These words reflect which topics are more likely to appear in a quotation given the speaker's gender. Here we can see that the representative words for male quotes are dominated by sports and business-related words, which is in line with our previous findings in the heat map. It therefore confirms our observation: men are portrayed as "ambitious" and "active", while women are more closely associated with family.
Top 50 predictive words for males
Top 50 predictive words for females
In fact, these areas (e.g. business, sports, politics) have long been dominated by men; women are already outnumbered in the fields themselves, let alone in the quotations that reflect them. For women, the predictive words relate more to personal characteristics and everyday life, which is also consistent with our earlier analysis. It mirrors the role assigned to women in the social division of labour: to be a good wife, a good mother, and to concentrate on daily life rather than dominate a particular industry.
I raise up my voice — not so that I can shout, but so that those without a voice can be heard... We cannot all succeed when half of us are held back.
A lot of people may think that women's social status is nowadays on par with men's, or at most only slightly lower. However, our analysis of quotations provides evidence that gender bias exists not only in society, but also in how each gender is quoted. In general, gender bias in quotations can be summarized in the following three areas:
➤ Overall, women's voices are much weaker than men's, although small progress does exist.
➤ Neither news outlets nor countries contribute to gender bias as much as we might expect; problems shared across the whole world (rather than occurring in just one country) are the most likely cause.
➤ The sentiment distributions are similar between men and women in most topics. Nevertheless, at the social level, people implicitly or explicitly expect some words to be gender-specific. Females are expected to be “home-carers” for children and husbands.
Gender bias still exists today. It is not only the responsibility of journalists to eliminate gender bias in quotations; it is also the obligation of each of us to take it seriously, so that women can have more diverse identities and other possibilities. Only then can women's voices be quoted in the news alongside men's voices.
Msc Cyber Security
Msc Data Science
Msc Data Science
Msc Data Science
REFERENCES
Quotebank: A Corpus of Quotations from a Decade of News
A Large-Scale Test of Gender Bias in the Media
Measuring Gender Bias in News Images
The Gender Gap Tracker: Using Natural Language Processing to Measure Gender Bias in Media
Political Leadership in the Media: Gender Bias in Leader Stereotypes during Campaign and Routine Times
BBC: 50:50 The Equality Project
United Nations Development Programme: Human Development Reports
Democratic Party Platform
Forbes: Highest-Paid Athletes