Computational Text Analysis, Computer-Aided Text Analysis, Text Mining, and Text and Data Mining (TDM) are broad terms for searching, organizing, and analyzing large amounts of text data.
TDM can help reveal new patterns or information in a large body of work, leading to the development of new knowledge or of a broader evidence-based practice. TDM enables researchers to analyze thousands of documents and terabytes of data, allowing for a comprehensive look into research questions.
The methods used to process corpora vary widely between disciplines, and are based on insights from machine learning, statistics, computational linguistics, sociology, and many other fields.
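As a small illustration of what "searching, organizing, and analyzing" text can mean in practice, the sketch below counts word frequencies across a tiny corpus using only the Python standard library. The three example sentences and the function names `tokenize` and `term_frequencies` are hypothetical, chosen for illustration; real projects typically work with much larger corpora and more careful tokenization.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(documents):
    """Count how often each token appears across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


# Hypothetical three-document corpus, for illustration only.
corpus = [
    "Text mining reveals patterns in large collections of text.",
    "Researchers use text analysis to study large corpora.",
    "Patterns in text can support new research questions.",
]

frequencies = term_frequencies(corpus)

# Show the three most frequent tokens with their counts.
print(frequencies.most_common(3))
```

Word counts like these are often only a starting point; common words such as "in" or "of" are usually filtered out before further analysis.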
Examples of researchers using text analysis to answer their research questions
How to Use this LibGuide:
Collecting data | Sources of Text Data |
Analyzing data you already have, with no coding or programming experience required | Software for Text Analysis |
Collecting data from ProQuest, JSTOR, or LexisNexis and analyzing it with the provided Python or R scripts | Text and Data Mining Platforms |
Conducting exploratory analysis of various Text Analysis and Natural Language Processing techniques |
Please see the Qualitative Data Analysis LibGuide for qualitative research tools and techniques, including NVivo and ATLAS.ti.
Questions about text and data mining? Contact the Research Data and Digital Scholarship team at LibraryRDDS@pobox.upenn.edu.
Subjects: Text Mining, Text Analysis, Natural Language Processing, Python, R
Except where otherwise indicated, original content in this guide is licensed under a Creative Commons Attribution (CC BY) 4.0 license. You are free to share, adopt, or adapt the materials. We encourage broad adoption of these materials for teaching and other professional development purposes, and invite you to customize them for your own needs.