Women’s Books versus Books by Women

Corina Koolen (corina.koolen@huygens.knaw.nl), The Huygens Institute for Netherlands History, Amsterdam

1. Introduction

Books written by and marketed towards women have been analyzed mostly in the context of popular culture ( Radway, 1987; Hollows, 2000; Modleski, 2008). In literary criticism however, fictional work by women is regularly held up to such ‘women’s novels’ to measure the quality (van Boven, 1992; Vogel, 2001; Groos, 2011). This connection made between female author gender and popular feminine novels is likely based on bias, but it is not yet well-researched in computational stylistics. In this paper we present a pilot study for examining this potential bias, through the combination of a reader survey and text analysis.

2. Related work

Although computational stylistics is now quite common in analysis of fiction (i.e. Semino and Short, 2004), ‘women’s’ genres are not researched often in relation to literature. Jautze et al. (2013) focuses on differences between the syntactic make-up of sentences in literary novels and so-called ‘chick lit’ (cf. Ferriss and Young, 2013); Montoro (2012) performs computational-linguistic analysis on chick lit as opposed to a BNC sampler corpus – but not to literary fiction specifically.

3. Women’s books

What is the relationship between books by women and ‘women’s books’ according to readers? We examine this through results of the National Reader Survey (2013). Respondents were supplied with a list of 401 recent Dutch-language novels (translated and originally Dutch, published between 2007-2012) that were most often loaned from libraries and bought from bookstores between 2009-2012 (Koolen et al., in preparation). 1 , 2 Respondents supplied ratings of literary quality on books they had read (on a scale of 1-7) and were allowed to motivate one of their scores.

Overall, works by female authors are judged to have lower literary quality (M=3.92, SD=0.81) than those by male authors (M=4.73, SD=1.04); t(344)=-8.34, p < 0.01. This is partially caused by romantic novels, which are mainly written by women (M=3.02, SD=0.60). 3 More surprisingly, within general fiction female authors’ works scores’ (M=4.55, SD=0.84) are significantly lower than for male’s (M=5.53, SD=0.73); t(120)=-7.60, p<0.01.

An analysis of the motivations shows that the concept of the ‘women’s book’ (‘vrouwenboek’) and similar gendered terms are used dozens of times to explain what literary quality is not; a male equivalent is mentioned twice (‘men’s book’, ‘boy’s book’). Examples of novels referred to as ‘women’s’ book’ are translations of Eat, Pray, Love by Gilbert (general fiction), Remember Me? by Kinsella (romantic fiction) and The Ice Princess by Låckberg (suspense). Thus, works by female authors are equated with ‘women’s books’ regardless of the novel’s own genre. Perceived connections that respondents provide are: bad story (about love), a simple style, no deeper layers, etc.. But how much do ‘women’s books’ differ from novels that are perceived as literary? And are they more strongly connected to other female-authored novels than to male-authored ones?

4. Text analysis

We perform two experiments as a first exploration. We compare present-day romantic novels by female authors (R), predominantly chick lit, to general fiction by women (GF) and general fiction by men (GM). We select the lowest scoring novels in the romantic genre and the highest in the general fiction genre (i.e. the most ‘literary’ ones according to our respondents), to find the clearest contrast (cf. Table 1). We use only one novel per author, unless the author uses a different pen name (Kinsella/Wickham).

Table 1. Division of books in the sub-corpus
Genre / gender author (av. rating literariness) Transl. from EnglishOriginally Dutch
Romantic / female (2.8)102
General fiction / female (5.2)102
General fiction / male (5.9)102

4.1. Experiment 1: style

As we have shown, the style of ‘women’s books’ is seen as inferior. We use stylometric analysis to explore this notion, adding Gilbert’s Eat, Pray, Love to this experiment (cf. Section 3); a hybrid of general fiction and romance. Stylometric analysis is most often used to perform authorship recognition, but has been successfully applied to identify gender (Rybicki, 2015) and fictional genres (Allison et al., 2011). We apply the method detailed in Eder (2017). First, with R-package Stylo (Eder et al., 2016), we construct a bootstrap consensus tree based on the 100 through 1,000 most frequent words with 100-word intervals, using Classic Delta to calculate stylistic similarity (cf. Eder, 2017). Second, we use network analysis and visualization tool Gephi to visualize the novels’ connectedness (Bastian et al., 2009). Color-codes are based on modularity, which visualizes groupings of greater inner coherence (Blondel et al., 2018). Finally, we apply the ForceAtlas2 algorithm to make groupings more visually distinct.

Network visualization of the novels’ stylistic proximity (R = romantic, GF = general fiction/female author, GM = general fiction/male author). Colors indicate groupings based on modularity
Figure 1. Network visualization of the novels’ stylistic proximity (R = romantic, GF = general fiction/female author, GM = general fiction/male author). Colors indicate groupings based on modularity

Fig. 1 shows six clusters. Part of the romantic novels (blue, soft pink) are indeed separated from the general fiction (other colors); Stockett’s The Help is stylistically connected strongest to romantic novels. General fiction by female and male authors hardly form clusters of their own. Except for one ‘male’ cluster which contains a Barnes’ novel and an outlier: Gilbert’s novel – which is seen as a ‘women’s novel’ by our respondents. Weiner, known for opposing the ‘chick lit’ label to her work (Mead, 2014) has a stronger connection to general fiction. In other words, stylistically seen, part of the romantic novels appear to have a specific signature, but most novels by female authors are not obviously stylistically connected to them.

4.2. Experiment 2: sentiment

We now use Linguistic Inquiry and Word Count (LIWC), a word list analysis tool, which has a dictionary for Dutch (Boot et al., 2017) and has been applied to literary fiction in genre analysis (Nichols et al., 2014). LIWC contains a number of content and sentiment-related categories that are of interest. Attention to physical appearance, a (heterosexual) love story, work and friendship and have been identified as themes of chick lit novels (Gill and Herdieckerhoff, 2006), which are the main component of the romantic genre in this corpus. We report significant differences on salient categories in an independent t-test between averages of groups (p < 0.01).

Table 2. Significant differences (p < 0.01) between groups
LIWC category Romantic-Gen. Female Romantic-Gen. Male Gen. Female-Gen. Male

Table 2 shows that romantic novels differ from general fiction in some ways: more positive emotions, but no significant difference in negative emotions, more words pertaining to friendship. The romantic novels differ in other ways from either the female or the male-authored literary novels: there are more job-related words in the romantic novels than in female-authored general fiction; less articles and prepositions than male-authored general fiction. Female-authored literary novels and male-authored ones do not significantly differ on any category. This might indicate that when comparing literary fiction to romantic novels, readers choose to focus on commonalities with female authors and differences with male authors, whereas differences between female authors and commonalities with male authors are overlooked. However, we need to be careful with interpretations of t-tests in LIWC (cf. Koolen and van Cranenburgh, 2017). Additional analysis will need to be performed to identify within-group differences. Finally, physicality and the body do not appear to be specific to romantic novels. This finding corroborates earlier research, see Montoro (2012) and Koolen (2018).

5. Conclusion

Romantic novels appear to be more different from all general fiction than the general fiction differs among authors of female and male gender. They contain signature elements, albeit not all the expected ones (positive emotions and friends, not attention to appearance). Part of the romantic novels are clearly different from general fiction stylistically, but a number of them cluster with male-authored general fiction; most notably work by Gilbert and Weiner. Although further testing is needed, they show that computational stylistic analysis might be used to paint a more objective picture of the actual style of contemporary novels by female authors and the relationships between them. We offer a speculation: if we consider the romantic novels in this corpus to be ‘women’s novels’, there are a several indications that commonalties between female-authored general fiction and romantic novels are stressed heavily and this might be a reason female authors’ novels are judged to have less literary quality. Nevertheless, we do not aim to assert ‘low’ literary quality of the romantic novels, either. To examine gendered quality perceptions further, we will include other fictional genres in future research.

Appendix A

  1. Allison, S. D., Heuser, R., Jockers, M. L., Moretti, F. and Witmore, M. (2011). Quantitative formalism: an experiment. Stanford Literary Lab https://litlab.stanford.edu/LiteraryLabPamphlet1.pdf (accessed 27 November 2017).
  2. Bastian, M., Heymann, S. and Jacomy, M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media, 8: 361–62.
  3. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. and Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10): P10008.
  4. Boot, P., Zijlstra, H. and Geenen, R. (2017). The Dutch translation of the Linguistic Inquiry and Word Count (LIWC) 2007 dictionary. Dutch Journal of Applied Linguistics, 6(1): 65–76.
  5. Eder, M. (2017). Visualization in stylometry: cluster analysis using networks. Digital Scholarship in the Humanities, 32(1): 50–64.
  6. Eder, M., Rybicki, J. and Kestemont, M. (2016). Stylometry with R: a package for computational text analysis. R Journal, 8(1): 107–21.
  7. Ferriss, S. and Young, M. (2013). Chick Lit: The New Woman’s Fiction. New York: Routledge.
  8. Gill, R. and Herdieckerhoff, E. (2006). Rewriting the romance: new femininities in chick lit?. Feminist Media Studies, 6(4): 487–504.
  9. Groos, M. (2011). Wie schrijft die blijft? Schrijfsters in de literaire kritiek van nu (Who writes remains? Female writers in today’s literary criticism). Tijdschrift Voor Genderstudies, 3(3): 31–36.
  10. Hollows, J. (2000). Feminism, Femininity and Popular Culture. Oxford: Manchester University Press.
  11. Jautze, K., Koolen, C., van Cranenburgh, A. and de Jong, H. (2013). From high heels to weed attics: a syntactic investigation of chick lit and literature. Proceedings of the Workshop on Computational Linguistics for Literature. Atlanta, GA, USA: Association for Computational Linguistics, pp. 72–81.
  12. Koolen, C. (2018). Reading Beyond the Female: the Relationship between Perception of Author Gender and Literary Quality. Amsterdam: University of Amsterdam.
  13. Koolen, C. and van Cranenburgh, A. (2017). These are not the stereotypes you are looking for: bias and fairness in authorial gender attribution. Proceedings of the First Workshop on Ethics in Natural Language Processing. Valencia, Spain: Association for Computational Linguistics, pp. 19–29.
  14. Koolen, C., van Dalen-Oskam, K., van Cranenburgh, A., Nagelhout, E. and de Jong, H. (in preparation). Literary quality in the eye of the Dutch reader: the National Reader Survey and its results.
  15. Mead, R. (2014). Written off: Jennifer Weiner’s quest for literary respect. The New Yorker https://www.newyorker.com/magazine/2014/01/13/written-off (accessed 27 November 2017).
  16. Modleski, T. (2008). Loving with a Vengeance: Mass Produced Fantasies for Women. New York: Routledge.
  17. Montoro, R. (2012). Chick Lit: The Stylistics of Cappuccino Fiction. London, New York: Bloomsbury Publishing.
  18. National Reader Survey (2013). Het Nationale Lezersonderzoek. https://www.hetnationalelezersonderzoek.nl/ (accessed 26 April 2018).
  19. Nichols, R., Lynn, J. and Purzycki, B. G. (2014). Toward a science of science fiction: applying quantitative methods to genre individuation. Scientific Study of Literature, 4(1): 25–45.
  20. Radway, J. A. (1987). Reading the Romance: Women, Patriarchy and Popular Literature. London: Verso.
  21. Rybicki, J. (2015). Vive la différence: tracing the (authorial) gender signal by multivariate analysis of word frequencies. Digital Scholarship in the Humanities, 31(4): 746–61.
  22. Semino, E. and Short, M. (2004). Corpus Stylistics: Speech, Writing and Thought Presentation in a Corpus of English Writing. New York: Routledge.
  23. van Boven, E. (1992). Een Hoofdstuk Apart: ‘Vrouwenromans’ in de Literaire Kritiek 1898-1930 (A Separate Chapter: 'Women’s Novels’ in Literary Critique 1898-1930). Amsterdam: Sara/Van Gennep.
  24. van Dalen-Oskam, K. (2016). ‘Could be the translation, of course’. Analysing the perception of literary fiction and literary translations. Digitalität in Den Geisteswissenschaften. Loveno di Menaggio, Italy.
  25. Vogel, M. (2001). ‘Baard Boven Baard’: Over Het Nederlandse Literaire En Maatschappelijke Leven 1945-1960 (’Beard over Beard’: On Dutch Literary and Societal Life 1945-1960). Maastricht: Maastricht University.

Note that the Riddle corpus’ novels show the one-sidedness of the market: it consists of few genres, there are very few novels by people of color, it contains mostly European and North-American novels.


The factor of translation will be taken into account in further development of this pilot, for information on effects within the larger project, see van Dalen-Oskam, 2016.


To distinguish genres, we roughly base ourselves on Dutch publishers’ assignments of genre, which is done through a uniform classification system in the Netherlands.