Data Management Resources

Why Cite Data?

Cite datasets used during the research process the same way you would cite an article: in the references, works cited, or bibliography section of your work. Data citation practices have developed as researchers and other stakeholders have recognized that citing data is necessary for a complete scholarly record, one that links a research product to the evidence it is based on.

Citing data:

  • attributes credit to the responsible researchers
  • allows those sharing the data to measure its impact
  • supports the research infrastructure by connecting data and published research 
  • improves access to data
  • provides opportunities for verifying and reusing data
  • promotes data as an equal scholarly output to a written work

How To Cite Data

While citing data has become an expectation, scholarly communities and communities of practice have largely struggled to develop data citation standards within their existing citation styles. This leaves the burden on research data communities to create a data citation format that conforms to existing style rules as closely as possible. When citing a dataset in a paper, follow the citation style required by the publisher. If the publisher does not have a format for datasets, collect all the core elements and model the citation on the style's format for textual publications. You can also follow DataCite's citation style for a dataset and adapt it to match the citation style you are using.

Core Elements:

  • author/creator
  • date of publication
  • title, including version or edition
  • publisher or distributor (such as the name of the repository where the data was found)
  • URL, DOI or other persistent identifier 
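
As a rough illustration, the core elements above can be collected in a small record, checked for completeness, and joined into a citation string. The Python sketch below is only one possible approach: the class name, field names, and both helper functions are hypothetical, and the element order (Creator (Year). Title. Version. Publisher. Identifier) is an assumption loosely modeled on DataCite's recommended format. Adapt the punctuation and order to the style your publisher requires. The sample values come from the General Social Survey example on this page, with the DOI written as a resolvable URL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Core elements of a dataset citation (field names are illustrative)."""
    creators: list[str]
    year: int
    title: str
    publisher: str
    identifier: str
    version: Optional[str] = None  # version or edition, if the dataset has one


def missing_elements(c: DatasetCitation) -> list[str]:
    """Return the names of any required core elements that are empty."""
    required = {
        "author/creator": c.creators,
        "date of publication": c.year,
        "title": c.title,
        "publisher or distributor": c.publisher,
        "persistent identifier": c.identifier,
    }
    return [name for name, value in required.items() if not value]


def format_citation(c: DatasetCitation) -> str:
    """Join the elements in an assumed DataCite-like order:
    Creator (Year). Title. Version. Publisher. Identifier"""
    parts = [f"{'; '.join(c.creators)} ({c.year})", c.title]
    if c.version:
        parts.append(c.version)
    parts.extend([c.publisher, c.identifier])
    return ". ".join(parts)


# Sample record built from the General Social Survey example.
gss = DatasetCitation(
    creators=["Smith, T.W.", "Marsden, P.V.", "Hout, M."],
    year=2011,
    title="General Social Survey, 1972-2010 Cumulative File",
    publisher="Inter-university Consortium for Political and Social Research",
    identifier="https://doi.org/10.3886/ICPSR31521.v1",
    version="ICPSR31521-v1",
)

if not missing_elements(gss):
    print(format_citation(gss))
```

Running the completeness check before formatting makes it easier to notice a missing element, such as the persistent identifier, before the citation reaches a reference list.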

Example Citations from IASSIST's Quick Guide to Data Citation

APA (6th edition)

Smith, T.W., Marsden, P.V., & Hout, M. (2011). General social survey, 1972-2010 cumulative file (ICPSR31521-v1) [data file and codebook]. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor]. doi: 10.3886/ICPSR31521.v1

MLA (7th edition)

Smith, Tom W., Peter V. Marsden, and Michael Hout. General Social Survey, 1972-2010 Cumulative File. ICPSR31521-v1. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011. Web. 23 Jan 2012. doi:10.3886/ICPSR31521.v1

Chicago (16th edition) (author-date)

Smith, Tom W., Peter V. Marsden, and Michael Hout. 2011. General Social Survey, 1972-2010 Cumulative File. ICPSR31521-v1. Chicago, IL: National Opinion Research Center. Distributed by Ann Arbor, MI: Inter-university Consortium for Political and Social Research. doi:10.3886/ICPSR31521.v1

Research Data Engineer

Lauren Phegley
she/her

Lauren Phegley holds consultations on data management, DMPTool, writing Data Management Plans (DMPs), and data sharing.

Head of Research Data Services

Lynda Kellam
she/her

Director of Research Data & Digital Scholarship


Subjects: Data & GIS