
Text Analysis

A guide to text mining tools and methods

What Is Text Analysis?

Computational Text Analysis, Computer-aided Text Analysis, Text Mining, and Text and Data Mining (TDM) are broad terms for searching, organizing, and analyzing large amounts of text data.

Why use TDM techniques?

Figure: A Venn diagram of the intersection of text mining and six related fields (shown as ovals), such as data mining, statistics, and computational linguistics. The seven text mining practice areas exist at the major intersections of text mining with its six related fields.

TDM can help reveal new patterns or information from a large body of work, leading to the development of new knowledge or of a larger evidence-based practice. TDM enables researchers to analyze thousands of documents and terabytes of data, allowing for a comprehensive look into research questions.

The methods used to process corpora vary widely between disciplines, and are based on insights from machine learning, statistics, computational linguistics, sociology, and many other fields. 
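As a small illustration of what "analyzing text data" can mean in practice, the sketch below counts word frequencies across a tiny corpus using only the Python standard library. The corpus and the function names (`tokenize`, `term_frequencies`) are made up for this example and do not come from the guide; real projects usually work with far larger corpora and more careful tokenization.

```python
import re
from collections import Counter

# Hypothetical three-document corpus, invented for illustration only.
corpus = [
    "Text mining finds patterns in text.",
    "Data mining finds patterns in data.",
    "Libraries support text and data mining.",
]


def tokenize(document):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", document.lower())


def term_frequencies(documents):
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for document in documents:
        counts.update(tokenize(document))
    return counts


frequencies = term_frequencies(corpus)

# Show the three most frequent tokens and their counts.
print(frequencies.most_common(3))
```

Counting terms is only a starting point. The same token lists could feed other techniques, such as removing common stop words or comparing vocabularies between documents.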

Examples where researchers used text analysis to answer their research questions

Where to Start?

How to Use this LibGuide:

Collecting data: Sources of Text Data
Analyzing data you already have, with no coding or programming experience required: Software for Text Analysis
Collecting data from ProQuest, JSTOR, or LexisNexis and analyzing it with provided Python or R scripts: Text and Data Mining Platforms
Conducting exploratory analysis with various text analysis and natural language processing techniques: Text Analysis Using Python or Text Analysis Using R

Contact Us

Questions about text and data mining? Contact the Research Data and Digital Scholarship team at LibraryRDDS@pobox.upenn.edu.

Subjects: Text Mining, Text Analysis, Natural Language Processing, Python, R

Licensing

Except where otherwise indicated, original content in this guide is licensed under a Creative Commons Attribution (CC BY) 4.0 license. You are free to share, adopt, or adapt the materials. We encourage broad adoption of these materials for teaching and other professional development purposes, and invite you to customize them for your own needs.

