To make the resources presented across the topics easier to navigate, we have gathered the most important ones here, grouped by theme.
Annotation Studio, a collection of web-based and collaborative annotation tools.
AntConc, free software for corpus linguistic analysis, including collocation and concordance analysis.
Brat, an online environment for collaborative text annotation.
CATMA, a computer-assisted text annotation and analysis tool.
PRISM, a tool created by the Scholars' Lab for crowdsourced interpretation of text.
UAM CorpusTool, an open-source tool for annotating text corpora.
GoodReads Datasets, collected from GoodReads in 2017.
HathiTrust, a collection of millions of titles digitized from libraries around the world.
Internet Archive, a non-profit digital library with a collection of over 20,000,000 freely available books and texts.
NovelTM dataset for English-Language Fiction, 1700-2009.
Post45 Data Collective, peer-reviewed post-1945 literary data on an open-access website designed, hosted, and maintained by Emory University’s Center for Digital Scholarship.
Apache OpenNLP, a machine-learning-based toolkit for text processing.
BookNLP by David Bamman, a natural language processing pipeline for longer texts.
Dariah, a library for topic modelling and visualization.
Lancaster Stats Tools online, materials and tools for statistical corpus analysis.
MALLET, a Java-based package for statistical language processing.
Stanford CoreNLP, a natural language processing pipeline created by the Stanford NLP group.
spaCy, an open-source library for natural language processing in Python.
TAPoR 3.0, a curated list of research tools for studying and analyzing texts.
Digital Humanities - A Primer, an online handbook for digital humanities.
Quantitative Intertextuality, supplementary material and R exercises for the book.
HathiTrust Research Center Analytics, a platform to facilitate research and large-scale computational analysis of the works in the HathiTrust Digital Library.
Computational Stylistics Group, a cross-disciplinary research group focused on computational text analysis, particularly stylometry.
Digital Humanities Lab, Yale University.
Group for Experimental Methods in Humanistic Research, Columbia University.
The Art of Literary Text Analysis, a Jupyter Notebook series by Stéfan Sinclair.
Humanities Data Analysis: Case Studies with Python, an online guidebook to digital humanities research in Python by Folgert Karsdorp, Mike Kestemont and Allen Riddell.
DH Tools for Beginners, a collection of tutorials written for digital humanities novices.
Programming Historian, beginner-friendly, peer-reviewed tutorials on digital tools for the humanities.
Research on characterization by Ted Underwood.
Dramavis, a tool for visualizing and calculating literary network data by Frank Fischer.
From Data to Viz, a guide for producing the right kind of graph from your data.
Gephi, free software for graph and network visualization and analysis.
TRACER machine, a powerful and flexible suite of some 700 algorithms for the automatic detection of (historical) text reuse.
Is there a resource missing? Do you have suggestions or questions regarding material for digital literary studies?
Please contact us at tpeura@cc.au.dk with your suggestions and questions. We would love to hear from you!