Data Management Resources

Metadata & Standards

Many disciplines have long lists of metadata standards (agreed-upon ways of developing and using metadata within a field), and each of those standards has lots of documentation that someone expects you're going to read. We saved you some time by listing some of the most commonly used ones in each field.

Disciplinary Metadata Standards (schema, controlled vocabulary, or ontology)
Discipline: Standard(s)

General Research Data: Dublin Core
Social Sciences: Data Documentation Initiative (DDI)
Medical Data: NIH Common Data Elements (CDE); ClinicalTrials.gov Protocol Registration Data Elements; Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM); HUPO Proteomics Standards Initiative (PSI); Unified Medical Language System (UMLS); National Drug Code (NDC) Directory
Ecology, Geosciences, & Biology: Ecological Metadata Language (EML); Darwin Core; Minimum Information for Biological and Biomedical Investigations (MIBBI)
Materials Science: Crystallographic Information Framework (CIF)
Geographic: ISO 19115-1:2014; Content Standard for Digital Geospatial Metadata (CSDGM), Vers. 2 (FGDC-STD-001-1998); OpenGeoMetadata (OGM) Aardvark

This table was inspired by Oregon State University Libraries' guide on Metadata/Documentation.

Metadata Definitions

Unless you work with metadata regularly, the different definitions of these ideas can be confusing. Here are some basic definitions to clarify the terminology. Definitions are taken from: Pomerantz, J. (2015). Metadata. MIT Press.

Metadata: "metadata is a statement about a potentially informative object" (Pomerantz, 26). This definition is more nuanced than the commonly used "data about data".

Metadata Schema: "a set of rules according to which a language operates" (Pomerantz, 30). Example: Dublin Core metadata schema  
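
To make the idea of a schema concrete, here is a minimal sketch in Python that checks a record against the 15 element names of the Dublin Core Metadata Element Set. The helper `unknown_elements` and the sample record are hypothetical and exist only for illustration. Real Dublin Core usage involves more than element names (for example, namespaces and qualified terms), so treat this as an illustration of "a set of rules" rather than a validator.

```python
# The 15 element names of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def unknown_elements(record):
    """Return the keys of `record` that are not Dublin Core elements, sorted."""
    return sorted(set(record) - DUBLIN_CORE_ELEMENTS)


# Hypothetical dataset description; every value here is made up.
example_record = {
    "title": "Stream temperature readings, 2020 field season",
    "creator": "Example Lab",
    "date": "2020-09-30",
    "format": "text/csv",
    "author": "Example Lab",  # not a Dublin Core element name
}

print(unknown_elements(example_record))
```

The schema says which statements a record may make ("title", "creator", and so on); it does not by itself say how each value should be written.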

Syntax Encoding Scheme: a set of rules that dictate how to represent, or encode, a specific type of data at the individual level (Pomerantz). This changes how you structure the data even if the data itself isn't changing. Example: ISO 8601, a standard for representing dates and times.
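
As a rough illustration of the ISO 8601 example, the sketch below uses Python's standard `datetime` module to rewrite the same calendar date from two different source formats into the ISO 8601 form YYYY-MM-DD. The function name `to_iso_8601` and the sample strings are assumptions made for this example.

```python
from datetime import date, datetime


def to_iso_8601(text, source_format):
    """Parse `text` with the given strptime format and return YYYY-MM-DD."""
    return datetime.strptime(text, source_format).date().isoformat()


# Two spellings of the same date (7 March 2021) in different source formats.
us_style = to_iso_8601("03/07/2021", "%m/%d/%Y")
eu_style = to_iso_8601("07.03.2021", "%d.%m.%Y")

print(us_style, eu_style)

# An ISO 8601 date string can be read back without specifying a format.
parsed = date.fromisoformat(us_style)
```

The date itself never changes; only its encoding does, which is the point of the definition above.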

Controlled Vocabulary: a set of rules that dictate how to represent a specific type of data, at the individual metadata level (Pomerantz). While syntax encoding cares about how a string is formatted (but not what is in the string), a controlled vocabulary provides a finite set of strings to be used. Example: RxNorm, a terminology that provides normalized names for clinical drugs and links those names to commonly used drug vocabularies.
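
The "finite set of strings" idea can be sketched in a few lines of Python. The vocabulary below is hypothetical and far smaller than a real one such as RxNorm; the names `SPECIMEN_TYPES` and `is_controlled_term` are assumptions made for this example.

```python
# Hypothetical controlled vocabulary for a "specimen type" field.
SPECIMEN_TYPES = frozenset({"blood", "saliva", "tissue"})


def is_controlled_term(value, vocabulary=SPECIMEN_TYPES):
    """Return True only if `value` is exactly one of the allowed terms."""
    return value in vocabulary


print(is_controlled_term("blood"))        # an allowed term
print(is_controlled_term("Blood"))        # differs in case, so not an allowed term
print(is_controlled_term("whole blood"))  # well-formed string, but not in the set
```

Unlike a syntax check, membership is the only test here: a perfectly well-formed string is still rejected if it is not one of the listed terms.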

Ontology: Information science defines ontology as "a formal representation of the universe of things that exist in a specific domain" (Pomerantz, 46). In the context of research data, we use the information science definition of ontology and not the philosophical definition. Example: Oral Health and Disease Ontology, used for representing the diagnosis and treatment of dental maladies.

Locations to Find Metadata Standards

Metadata & Standards Resources

Here are some great resources for learning about metadata and standards
