Data Services at Penn Libraries
Penn Libraries provides access to data sets across disciplines. A librarian can provide one-on-one assistance locating and retrieving quantitative or spatial data sets from specialized resources:
Library computer labs offer software for making meaning out of data, including statistical packages (SPSS, SAS, Stata, R) and geographic information systems (ArcGIS, QGIS). Contact a librarian for consultations on:
Between finding data and analyzing it comes the responsible use of data. In addition to preparing the data for analysis, researchers must understand the context of the data and properly credit its source. A librarian can help with:
- Understanding codebooks, data dictionaries, metadata, and file formats
- Extracting and reformatting usable subsets of data
- Data citation guidance
Visualizations, top to bottom:
a) Cartogram representing size of states according to population in 2008, with states colored according to electoral votes in 2008 election, made with ScapeToad and ArcGIS Desktop. Data sources:
b) Network diagram showing where Pennsylvania residents have moved by state, 2009-2010, made with Google Fusion Tables. Data source: U.S. Population Migration Data: State-to-State Migration, 2009-2010 [Computer file]. Internal Revenue Service, downloaded from http://www.irs.gov/taxstats/article/0,,id=212702,00.html, August 21, 2012.
What is [are] data?
Data are a kind of information structured for analysis.
Quantitative or numeric data are suitable for statistical analysis. Spatial data are used with geographic information systems (GIS), often to make maps.
Statistics result from analyzing and summarizing the data, for example, a mean or frequency.
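As a small illustration of the two statistics named above, the following sketch computes a mean and a frequency with Python's standard library. The values and the variable names (`ages`, `states`) are invented for the example and do not come from any library data set.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey responses, invented for illustration only.
ages = [34, 27, 45, 27, 52]
states = ["PA", "NJ", "PA", "NY", "PA"]

# A mean summarizes a numeric variable as a single value.
mean_age = mean(ages)

# A frequency counts how often each category occurs.
state_counts = Counter(states)

print("Mean age:", mean_age)
print("Respondents per state:", dict(state_counts))
```

The same summaries are available in the statistical packages offered in the computer labs; Python is used here only because it needs no additional software.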
Microdata, or raw data, have yet to be analyzed or aggregated; they are organized according to the unit of observation of the study (for example, one record per person or household). Microdata must be processed in order to be intelligible. Aggregated data have already been processed to some extent, often showing counts or percentages by larger units.
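The difference between microdata and aggregated data can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the records, the field names (`person_id`, `state`, `moved`), and the helper `aggregate_by_state` are all hypothetical, chosen only to show person-level rows being rolled up into counts and percentages by state.

```python
from collections import defaultdict

# Hypothetical microdata: one row per person (the unit of observation).
microdata = [
    {"person_id": 1, "state": "PA", "moved": True},
    {"person_id": 2, "state": "PA", "moved": False},
    {"person_id": 3, "state": "NJ", "moved": True},
    {"person_id": 4, "state": "PA", "moved": True},
    {"person_id": 5, "state": "NJ", "moved": False},
]


def aggregate_by_state(rows):
    """Roll person-level rows up to one summary per state.

    Returns {state: {"n": number of people, "pct_moved": percent who moved}}.
    """
    totals = defaultdict(int)
    movers = defaultdict(int)
    for row in rows:
        totals[row["state"]] += 1
        if row["moved"]:
            movers[row["state"]] += 1
    return {
        state: {"n": n, "pct_moved": 100 * movers[state] / n}
        for state, n in totals.items()
    }


# Aggregated data: counts and percentages by a larger unit (state).
aggregated = aggregate_by_state(microdata)
for state, summary in aggregated.items():
    print(state, summary)
```

Note that aggregation cannot be reversed: once only the state-level counts and percentages are kept, the individual records cannot be recovered, which is one reason researchers often seek out the microdata.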
Data formats range from easy-to-use files, such as Excel spreadsheets, to specialized formats for statistical software.