The Narrator


Who speaks in literature?

This question is more relevant to prose than to other genres. Drama makes the speaker explicit by attributing each line to a character (although stage directions may complicate matters). The idea of a narrator may also be applied to some poetry, such as Walt Whitman’s, but the concept is less used in the analysis of poetry.

In prose fiction, there is a whole field of study concerned with enunciation and the perspective of narration, stretching from the real author to the imagined reader.

A computational approach to the narrator will often fall short in catching subtle changes in narration. In Kurt Vonnegut’s novel Slaughterhouse-Five, for example, Vonnegut’s apparent alter ego, Billy Pilgrim, is suddenly revealed as having been present alongside the author, which suggests that Pilgrim’s experiences should not be read too directly as Vonnegut’s memories, unless that is a trick the author plays on the reader. Such points are easily captured in close reading but would elude any known computational approach.

However, other ways of approaching narrators may generate insights that traditional reading cannot. One is to characterize the style of narration, understood for example as the balance between narration and conversation or free indirect style. Marissa Gemma has done this for the 19th-century novel, showing that the genre made increasing use of free speech. Others (see the referenced articles) have shown how the amount of speech assigned to males and females in a very large corpus of novels developed over a period of two centuries, with the far from intuitive conclusion that the representation of the female voice has not increased. Such investigations are not easy to carry out, but they yield results that are useful to other researchers who are trying to understand long-term trends in narration.
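The balance between narration and conversation can be approximated very roughly by measuring how much of a text sits inside quotation marks. The following Python sketch is only an illustration and not the method used in the studies mentioned above. It assumes that direct speech is marked with double quotation marks (straight or curly), which is not true of every edition or language, and the function name dialogue_share is hypothetical.

```python
import re

# Assumption: direct speech is enclosed in double quotation marks,
# either straight ("...") or curly (“...”). Editions that use single
# quotes or dashes for dialogue will not be handled.
QUOTED = re.compile(r'"[^"]*"|“[^”]*”')
WORD = re.compile(r"[A-Za-z']+")


def dialogue_share(text):
    """Return the fraction of word tokens that fall inside quotation marks."""
    total = len(WORD.findall(text))
    if total == 0:
        return 0.0
    quoted = sum(len(WORD.findall(m.group(0))) for m in QUOTED.finditer(text))
    return quoted / total


sample = 'She opened the door. "Are you coming?" he asked. "Not yet," she said.'
print(round(dialogue_share(sample), 2))
```

A measure like this says nothing about free indirect style, which by definition is not marked by quotation marks, so it should be read alongside close reading rather than in place of it.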



The characters present in the story at a specific point affect what kind of narrative perspective the narrator adopts. Explore the novels analyzed by Bilenko and Miyakawa to see which characters are present in each chapter and how they contribute to the sentiments of the story. Combined with a close reading of the novel, what can you conclude about the narrator’s point of view and situatedness in the story?

For an elementary computational approach, count the number of personal pronouns in a novel to detect whether a first- or third-person narrator dominates the story. Upload your corpus to AntConc and create concordance plots for different personal pronouns. Another way is to look at collocations, that is, words that tend to co-occur in a text. Generate word collocate lists for personal pronouns to explore which characters are referred to with them. You can also use Voyant Tools to visualize the use and distribution of pronouns in your corpus.
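For readers who prefer a script to a graphical tool, the pronoun count and the collocate list described above can be sketched in a few lines of Python. This is a minimal illustration and not a reimplementation of AntConc or Voyant Tools. The pronoun lists are short and illustrative, the tokenizer simply keeps runs of letters, and the names pronoun_profile and collocates are hypothetical.

```python
import re
from collections import Counter

# Assumption: these lists are illustrative, not exhaustive.
FIRST_PERSON = {"i", "me", "my", "mine", "myself",
                "we", "us", "our", "ours", "ourselves"}
THIRD_PERSON = {"he", "him", "his", "himself",
                "she", "her", "hers", "herself",
                "they", "them", "their", "theirs", "themselves"}


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def pronoun_profile(text):
    """Count first- and third-person pronouns in the text."""
    counts = Counter(tokenize(text))
    first = sum(counts[p] for p in FIRST_PERSON)
    third = sum(counts[p] for p in THIRD_PERSON)
    return {"first": first, "third": third}


def collocates(text, node, window=3):
    """Count the words found within `window` tokens on either side of `node`."""
    tokens = tokenize(text)
    found = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            found.update(left + right)
    return found


sample = "I saw her at the station. She waved, and I followed her to my train."
print(pronoun_profile(sample))
print(collocates(sample, "her", window=2).most_common(3))
```

Note that dialogue inflates the first-person count even in a novel narrated in the third person, so the raw numbers are a starting point for interpretation rather than a verdict on the narrator.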


Building on the concept of character space (see ch. “Character”), one way to computationally approach narrators is to annotate a text so that parts of it are assigned to different characters. These annotations can be used as a proxy for exploring what kind of narrative perspectives dominate the story. Annotations can be made manually thanks to the wide range of annotation tools available (see resources), but there also exist tools that create annotations for text bodies computationally, such as David Bamman’s BookNLP pipeline.

One computationally challenging step in text pre-processing is coreference resolution, i.e., finding and connecting all expressions that refer to the same entity in a text. Another challenge is to distinguish among characters created by the same author: each author often has a characteristic style, so their characters may share a common authorial voice, as discussed by Bamman et al. (2014).

Use David Bamman’s BookNLP pipeline available on his GitHub to create annotations for a text of your choice. The preprocessed output can be used to represent the relationship between character and narrative form. Analyze the narrator by exploring the presence of each character throughout the story. 
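Exploring the presence of each character throughout the story amounts to building a table of characters by chapters. The sketch below does not use BookNLP or its output format. It assumes that the text has already been split into chapters and that a list of character names is supplied by hand; both the function name presence_by_chapter and the sample data are hypothetical.

```python
import re


def presence_by_chapter(chapters, names):
    """Return one dict per chapter mapping each name to its mention count."""
    # Whole-word matching, so that "Ann" does not match inside "Anna".
    patterns = {n: re.compile(r"\b" + re.escape(n) + r"\b") for n in names}
    return [{n: len(p.findall(ch)) for n, p in patterns.items()}
            for ch in chapters]


# Invented sample data, used only to show the shape of the result.
chapters = [
    "Anna met Boris at the harbour. Anna said nothing.",
    "Boris walked home alone.",
]
table = presence_by_chapter(chapters, ["Anna", "Boris"])
for number, row in enumerate(table, start=1):
    print(number, row)
```

Because this counts only explicit names, every "she" or "the captain" that refers to a character is missed. That gap is exactly what coreference resolution is meant to close, and it is the main reason to prefer annotated output over plain name matching.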

See also Brooke, Hammond & Hirst’s (2017) article on quantifying free indirect discourse in Woolf’s and Joyce’s work as a mixture of narration and direct discourse, a task that requires an interdisciplinary approach. Use POS tags to extract the personal pronouns used throughout the text and analyze the narrative perspective.


Scripts and sites


  • Bamman, D., Underwood, T., & Smith, N. A. (2014, June). A Bayesian mixed effects model of literary character. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 370-379).
  • Brooke, J., Hammond, A., & Hirst, G. (2017). Using models of lexical style to quantify free indirect discourse in modernist fiction. Digital Scholarship in the Humanities, 32(2), 234-250.
  • Elson, D., Dames, N., & McKeown, K. (2010, July). Extracting social networks from literary fiction. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 138-147).
  • Gemma, M., Glorieux, F., & Ganascia, J. G. (2017). Operationalizing the colloquial style: Repetition in 19th-century American fiction. Digital Scholarship in the Humanities, 32(2), 312-335.
  • Underwood, T., Bamman, D., & Lee, S. (2018). The Transformation of Gender in English-Language Fiction. Journal of Cultural Analytics.
