This guide is intended to help students in SOCI 1040 (formerly SOCI 007) locate and use data on populations.
Related guides include:
The most recent versions of Excel, SPSS, Stata, R, and SAS are available on Penn Libraries public computers and online through the Penn Libraries Virtual Lab.
Online resources for using SPSS, Stata, R, and SAS are described in this Penn Libraries research guide:
Explore the data. The World Bank's World Development Indicators has data visualization tools for comparing countries and examining trends.
Start with a research topic. What clues does your research topic give about the data that you'll need?
a trend: To study a trend, you'll need data that covers a period of time.
a disparity: You'll need data measuring the same thing for different populations.
a comparison: This requires a cross-national or sub-national data set.
a pattern in: You'll need data for smaller regions within a larger geographic area.
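As a rough illustration of the first two clues, the short Python sketch below checks whether a small table covers more than one time point (needed for a trend) and more than one population (needed for a disparity). The records, the values in them, and the function name `supports` are all hypothetical and exist only for this example; real data will come from the sources described in this guide.

```python
# Hypothetical aggregate records: (population, year, value).
# The values are invented for illustration only.
records = [
    ("Country A", 2000, 71.2),
    ("Country A", 2010, 73.5),
    ("Country B", 2000, 64.8),
    ("Country B", 2010, 68.1),
]


def supports(rows):
    """Report which kinds of question a table of records could address."""
    populations = {population for population, _, _ in rows}
    years = {year for _, year, _ in rows}
    return {
        "trend": len(years) > 1,            # more than one time point
        "disparity": len(populations) > 1,  # more than one population
    }


coverage = supports(records)
print(coverage)
```

A table with a single year or a single population would fail one of these checks, which is a sign that you need to look for a different or additional data source.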
What techniques will you use to analyze the data, and which tools can you use?
For making graphs and tables, data in Excel format might be the most useful.
Certain data sets may be available in large files for use with statistical software.
Some may have online analysis options.
Pay attention to the nature of the dataset and its interface.
Look first for formally published data (e.g., tables in statistical abstracts and yearbooks, print or online).
If you can't find formally published data, then look for aggregate data (data presented in tables, usually aggregated for geographic units, in databases or on websites).
If you can't find aggregate data, then look last for microdata (individual response data). Some microdata providers offer online crosstabulation tools, but most provide only raw data, for which you'll need statistical software.
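To make the difference between microdata and aggregate data concrete, here is a minimal sketch of a crosstabulation using only Python's standard library. The respondent records, the variable names `region` and `employed`, and the helper `crosstab` are hypothetical; statistical packages such as SPSS, Stata, R, and SAS have their own built-in commands for the same task.

```python
from collections import Counter

# Hypothetical microdata: one record per survey respondent.
respondents = [
    {"region": "North", "employed": "yes"},
    {"region": "North", "employed": "no"},
    {"region": "South", "employed": "yes"},
    {"region": "South", "employed": "yes"},
    {"region": "North", "employed": "yes"},
]


def crosstab(rows, row_var, col_var):
    """Count records for each (row_var, col_var) combination."""
    return Counter((row[row_var], row[col_var]) for row in rows)


table = crosstab(respondents, "region", "employed")

# Print the counts as a small table: regions as rows, answers as columns.
row_values = sorted({row["region"] for row in respondents})
col_values = sorted({row["employed"] for row in respondents})
print("region".ljust(8) + "".join(value.rjust(6) for value in col_values))
for row_value in row_values:
    counts = "".join(str(table[(row_value, value)]).rjust(6) for value in col_values)
    print(row_value.ljust(8) + counts)
```

The printed table is itself aggregate data: individual responses have been counted up by geographic unit, which is the form you would find ready-made in a statistical abstract or yearbook.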
Remember the Four P's:
Producer
Program
Publisher
Product (or Publication)