Rhythm may be one of the topics better suited to computational study. Variations in sentence length in prose are relatively straightforward to detect and analyse, although doing so reliably is not a trivial task.
Scanning thousands of lines of poetry is also within reach (just as the understanding of narratives is well developed, as discussed previously) and contributes to an understanding of the rhythm of a text.
The practical benefit of understanding rhythm is another matter. Showing that Shakespeare's sonnets follow the pattern we assume they do does not reveal much. But seeing them in the perspective of all known sonnets in English literature may show more of the development of the form.
The challenges become even more interesting in modernist poetry: William Carlos Williams' hypercanonical poem "The Red Wheelbarrow" is often presented as one of his quintessential poems. But is it? The use of four couplets, each with a short second verse, does not show up that often in his work, and it is very probably the only poem with this particular form of four times two lines. A computational analysis of his collected poems could confirm or refute this.
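The kind of check described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: stanzas are separated by blank lines, line length is measured in whitespace-separated words, and the function names `stanza_shape` and `is_short_second_line_couplets` are hypothetical, not part of any existing tool.

```python
def stanza_shape(poem: str) -> list[list[int]]:
    """Return the word count of each line, grouped by stanza.

    Assumption: stanzas are separated by one or more blank lines.
    """
    stanzas, current = [], []
    for line in poem.splitlines():
        if line.strip():
            current.append(len(line.split()))
        elif current:
            stanzas.append(current)
            current = []
    if current:
        stanzas.append(current)
    return stanzas


def is_short_second_line_couplets(poem: str, n_stanzas: int = 4) -> bool:
    """True if the poem has exactly n_stanzas couplets, each with a shorter second line."""
    shape = stanza_shape(poem)
    return len(shape) == n_stanzas and all(
        len(stanza) == 2 and stanza[1] < stanza[0] for stanza in shape
    )
```

Applied to every poem in a digitised edition, a function like this would return the list of poems that share the form, so the claim about uniqueness could be tested rather than assumed.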
Prose rhythm is more uncharted territory, but no less interesting or open to analysis. The identification of variations in sentence and paragraph length, of shorter and longer words, and even of alliteration has probably been seen as pointless and too demanding to carry out by hand, but with computers it can be done relatively easily, even for large corpora.
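As a rough illustration of how little code such a measurement needs, the sketch below computes the length of each sentence in words and summarises the variation. It rests on loud assumptions: sentences are split naively at `.`, `!` or `?` followed by whitespace (so abbreviations such as "Mr." will be split wrongly), and the function names are hypothetical.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Return the number of words in each sentence (naive punctuation-based split)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(re.findall(r"\w+(?:['’]\w+)*", s)) for s in sentences]


def rhythm_profile(text: str) -> dict:
    """Summarise sentence-length variation for a text."""
    lengths = sentence_lengths(text)
    if not lengths:
        raise ValueError("text contains no sentences")
    return {
        "sentences": len(lengths),
        "mean_length": statistics.mean(lengths),
        # Population standard deviation: a simple measure of how much lengths vary.
        "stdev_length": statistics.pstdev(lengths),
    }
```

A low standard deviation suggests an even, regular pace, while a high one points to alternation between short and long sentences. For serious work, a dedicated sentence tokenizer would be a safer choice than the regular expression used here.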
Detecting variations in sentence length is not the only way to study the rhythm of a text. It can also be studied with visualisation tools such as Voyant Tools, which show how particular words are used and how they co-occur (or, conversely, never occur together) across, for example, a novel or an author's complete works, and thus how keywords shape the universe of a text. For this purpose, Voyant Tools' Key Word In Context (KWIC) concordance function is useful, as it allows you to investigate co-occurrences of words or characters throughout a text or a larger corpus.
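To make the idea of a KWIC concordance concrete, here is a minimal sketch in plain Python. It is not how Voyant Tools is implemented; the function name `kwic`, the `width` parameter and the simple word-based tokenisation are all assumptions made for illustration.

```python
import re


def kwic(text: str, keyword: str, width: int = 3) -> list[tuple[int, str, str, str]]:
    """Return (token position, left context, keyword, right context) for each hit.

    Matching is case-insensitive; context is measured in tokens, not characters.
    """
    tokens = re.findall(r"\w+(?:['’]\w+)*", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((i, left, token, right))
    return rows
```

The token positions in the first column show where in the text a keyword recurs, so the spacing between hits can itself be read as a kind of rhythm, while the left and right contexts reveal which words tend to accompany it.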
The use of asyndeton also contributes a feeling of movement and rhythm to a narrative. Virginia Woolf is a great example of a writer who uses asyndeton to create rhythm within a text. The omission of conjunctions in short sentences can make the pace of a narrative feel as if it is accelerating, but it can also help create a feeling of balance when used in longer sentences. Conversely, polysyndeton, the deliberate use of many conjunctions, can slow down the rhythm of a sentence or phrase. Both asyndeton and polysyndeton thus contribute to the creation of a rhythmic effect within a text. One computational approach is therefore to investigate the presence (or absence) of conjunctions within a text, using an annotation tool such as Annotation Studio, online visualisation tools such as Voyant Tools, or even a simple online conjunction checker that detects conjunctions within passages of text.
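A conjunction check of this kind can also be sketched directly in Python. The example below is a crude heuristic, not an established method: the word list, the thresholds (two or more commas with no conjunction; three or more conjunctions) and the function names are all assumptions chosen for illustration, and the labels only flag candidates for closer reading.

```python
import re

# Assumption: a small set of coordinating conjunctions. "for" and "so" are left
# out because they are too often prepositions or adverbs.
CONJUNCTIONS = {"and", "but", "or", "nor", "yet"}


def conjunction_profile(text: str) -> list[dict]:
    """Count words, conjunctions and commas for each sentence (naive split)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    profile = []
    for sentence in sentences:
        words = re.findall(r"\w+", sentence.lower())
        profile.append({
            "sentence": sentence,
            "words": len(words),
            "conjunctions": sum(w in CONJUNCTIONS for w in words),
            "commas": sentence.count(","),
        })
    return profile


def label(row: dict) -> str:
    """Flag a sentence as a possible asyndeton or polysyndeton (arbitrary thresholds)."""
    if row["conjunctions"] == 0 and row["commas"] >= 2:
        return "asyndeton?"
    if row["conjunctions"] >= 3:
        return "polysyndeton?"
    return "neutral"
```

For "I came, I saw, I conquered. We ate and drank and sang and danced." the first sentence is flagged as a possible asyndeton and the second as a possible polysyndeton. Whether a flagged sentence actually produces the rhythmic effect described above remains a matter of interpretation.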
When combined, computational tools allow for research that would previously have been beyond human reach: machines can learn to detect patterns in large corpora, but identifying the relevant rhythm figures to look for and interpreting the results still depend on human judgement. A good example of a large-scale study is Lagutina et al. (2020). They built algorithms to trace changes in prose rhythm from the 19th to the 21st century, comparing British and Russian literature. A literary analysis of this scope would not have been possible without computational tools, and the statistical findings can provide evidence for linguistic intuitions about changes in rhythm and metrics over time. All their code is available on GitHub if you want to deepen your understanding of the computational approach or explore rhythm in your own corpora.
In poetry, Heuser, Falk, and Anttila have done pioneering work: as early as 2010 they developed Prosodic, a metrical-phonological parser. Prosodic first tokenizes text and converts it into a stressed, syllabified, phonetic transcription. Based on metrical constraints, it then tries to match each line of text with a metrical parse. The current program supports English and Finnish, but it can be extended to other languages if you have a pronunciation dictionary or a custom function for them. Read more about how Anttila and Heuser (2016) used it in the study of metrics, and download Prosodic from their GitHub repository to conduct your own analysis. For a less advanced approach, you can install Poesy, which is built on Prosodic and lets you analyse the rhythmic features of poems with user-friendly functions.
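The general idea of matching a line against a metrical template can be shown with a toy example. The sketch below does not use Prosodic and does not reproduce its API or its constraint system; the hand-made stress lexicon, the function names, and the convention of marking monosyllables as flexible (`x`) are all assumptions made for illustration.

```python
import re

# Hypothetical hand-made lexicon: "1" = stressed syllable, "0" = unstressed,
# "x" = monosyllable whose stress is treated as flexible.
TOY_LEXICON = {
    "shall": "x", "i": "x", "compare": "01", "thee": "x",
    "to": "x", "a": "x", "summer's": "10", "day": "x",
}


def stress_string(line: str, lexicon: dict[str, str]) -> str:
    """Concatenate the stress pattern of every word in the line.

    Raises KeyError for words missing from the lexicon.
    """
    words = re.findall(r"[a-z']+", line.lower())
    return "".join(lexicon[word] for word in words)


def violations(stresses: str, foot: str = "01") -> int:
    """Count syllables whose fixed stress contradicts the repeated foot template."""
    repeats = len(stresses) // len(foot) + 1
    template = (foot * repeats)[:len(stresses)]
    return sum(1 for s, t in zip(stresses, template) if s != "x" and s != t)
```

For the line "Shall I compare thee to a summer's day", the stress string is `xx01xxx10x`, which gives 0 violations against an iambic template (`01`) and 4 against a trochaic one (`10`). A real parser replaces the toy lexicon with a pronunciation dictionary and the single mismatch count with a set of weighted metrical constraints.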
Prosodic is a metrical-phonological parser for poetry written in Python; the code is available on GitHub.
Poesy is a simple tool for poetry processing in Python, based on Prosodic.
ProseRhythmDetector is a tool developed in Python to extract rhythm features and compute stylometric features for texts.
Ryan Heuser’s home page.
Annotation Studio, a suite of collaborative, web-based annotation tools.
Voyant Tools, a web-based reading and analysis environment for digital texts.