
Text Analysis at Penn Libraries

A guide to text mining tools and methods

Applied Data Science Librarian

Jajwalya Karajgikar


This guide was created by Rachel Liu, Research Data and Digital Scholarship Text and Data Mining Assistant. Rachel is a graduate student in Education Data Mining.

What Is Text Analysis?

Computational Text Analysis, Computer-aided Text Analysis, Text Mining, and the abbreviation TDM are broad terms for searching, organizing, and analyzing large amounts of text data.
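As a rough illustration of what "searching, organizing, and analyzing" text can mean at the smallest scale, the sketch below tokenizes a tiny corpus, counts term frequencies, and looks up which documents mention a term. The three-sentence corpus and the helper names (`tokenize`, `term_frequencies`, `search`) are invented for this example; real projects typically rely on dedicated text analysis tools and far larger collections.

```python
import re
from collections import Counter

# Hypothetical three-document corpus, invented for illustration only.
corpus = [
    "Text mining reveals patterns in large collections of text.",
    "Researchers use text mining to analyze thousands of documents.",
    "Patterns in documents can lead to new knowledge.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_frequencies(documents):
    """Count how often each token appears across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts

def search(documents, term):
    """Return the indices of documents that contain the given term."""
    term = term.lower()
    return [i for i, doc in enumerate(documents) if term in tokenize(doc)]

frequencies = term_frequencies(corpus)
print(frequencies.most_common(3))   # the most frequent tokens in the corpus
print(search(corpus, "patterns"))   # documents that mention "patterns"
```

Simple counts like these are only a starting point, but the same pattern of tokenizing, counting, and searching underlies many of the more advanced methods used on large corpora.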

Why use TDM techniques?

[Figure: A Venn diagram of the intersection of text mining and six related fields (shown as ovals), such as data mining, statistics, and computational linguistics. The seven text mining practice areas exist at the major intersections of text mining with its six related fields.]

TDM can help reveal new patterns or information in a large body of work, leading to the development of new knowledge or of a larger evidence-based practice. TDM enables researchers to analyze thousands of documents and terabytes of data, allowing for a comprehensive look into research questions.

The methods used to process corpora vary widely between disciplines and draw on insights from machine learning, statistics, computational linguistics, sociology, and many other fields.

Examples where researchers used text analysis to answer their research questions
