Resources presented in this website

Navigate resources

To make the resources presented across the topics easier to navigate, we have gathered lists of the most important ones here, grouped by theme.

Annotation tools

Annotation Studio, a collection of web-based and collaborative annotation tools.
AntConc, free software for corpus linguistic analysis, including collocation and concordance analysis.
Brat, an online environment for collaborative text annotation.
CATMA, a computer-assisted text annotation and analysis tool.
PRISM, a tool for crowdsourced interpretation of text created by Scholars' Lab.
UAM CorpusTool, an open-source text corpora annotation tool.

Wmatrix, a tool for corpus analysis and comparison, offering annotation tools and standard corpus linguistic methodologies.

Corpora and databases

GoodReads Datasets, collected from GoodReads in 2017.
HathiTrust, a collection of millions of titles digitized from libraries around the world.
Internet Archive, a non-profit digital library with a collection of over 20,000,000 freely available books and texts.
NovelTM dataset for English-Language Fiction, 1700-2009.
Post45 Data Collective, peer-reviewed post-1945 literary data on an open-access website designed, hosted, and maintained by Emory University’s Center for Digital Scholarship.

Project Gutenberg, a free eBook library.

NLP libraries, packages and tools

Apache OpenNLP, a machine learning based toolkit for text processing.
BookNLP by David Bamman, a natural language processing pipeline for longer texts.
DARIAH, a library for topic modelling and visualization.
Lancaster Stats Tools online, materials and tools for statistical corpus analysis.
MALLET, a Java-based package for statistical language processing.
Stanford CoreNLP, a natural language processing pipeline created by the Stanford NLP group.
spaCy, an open-source library for natural language processing in Python.
TAPoR 3.0, a curated list of research tools for studying and analyzing texts.

Voyant Tools, a web-based reading and analysis environment for digital texts.

Computational analysis handbooks and related materials

Digital Humanities - A Primer, an online handbook for digital humanities.

Quantitative Intertextuality, supplementary material and R exercises for the book.
HathiTrust Research Center Analytics, a platform to facilitate research and large-scale computational analysis of the works in the HathiTrust Digital Library.

Mapping Metaphor, a comprehensive analysis and an interactive resource of metaphors in the English language.

Digital Humanities projects and groups

Computational Stylistics Group, a cross-disciplinary research group focused on computational text analysis, particularly stylometry.
Digital Humanities Lab, Yale University.
Group for Experimental Methods in Humanistic Research, Columbia University.

Stanford Literary Lab, a research collective applying computational methods to literary studies.

Tutorials

The Art of Literary Text Analysis, a Jupyter Notebook series by Stéfan Sinclair.
Humanities Data Analysis: Case Studies with Python, an online guidebook to digital humanities research in Python by Folgert Karsdorp, Mike Kestemont and Allen Riddell.
DH Tools for Beginners, a collection of tutorials written for digital humanities novices.
Programming Historian, beginner-friendly, peer-reviewed tutorials on digital tools for the humanities.
Research on characterization by Ted Underwood.

Stylometry in R with the ‘stylo’ package in a nutshell.

Visualization

Dramavis, a tool for visualizing and calculating literary network data by Frank Fischer.
From Data to Viz, a guide for producing the right kind of graph from your data.
Gephi, free software for graph and network visualization and analysis.

R Graph Gallery, R chart examples with code.
Voyant Tools, a web-based reading and analysis environment for digital texts.

Text reuse and intertextuality

TRACER machine, a powerful and flexible suite of some 700 algorithms for the automatic detection of (historical) text reuse.

Tesserae, a collaborative project that aims to provide a flexible and robust web interface for exploring intertextual parallels in Ancient Greek, Latin, and English.

Contact us

Is there a resource missing? Do you have suggestions or questions regarding material for digital literary studies?

Please contact us at tpeura@cc.au.dk with your suggestions and questions. We would love to hear from you!

Revised 24.11.2025
