Many disciplines have long lists of metadata standards (agreed-upon ways of developing and using metadata for a field), and each standard has extensive documentation that someone expects you to read. To save you some time, here are some of the most commonly used standards by field.
Discipline | Standard |
---|---|
General Research Data | Dublin Core |
Social Sciences | Data Documentation Initiative (DDI) |
Medical Data | NIH Common Data Elements (CDE); ClinicalTrials.gov Protocol Registration Data Elements; Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM); HUPO Proteomics Standards Initiative (PSI) |
Ecology, Geosciences, & Biology | Ecological Metadata Language (EML); Minimum Information for Biological and Biomedical Investigations (MIBBI) |
Material Science | Crystallographic Information Framework (CIF) |
Geographic | Content Standard for Digital Geospatial Metadata (CSDGM), Vers. 2 (FGDC-STD-001-1998) |
This table was inspired by Oregon State University Libraries' guide on Metadata/Documentation.
Unless you work with metadata regularly, the terminology around it can be confusing. Here are some basic definitions to clarify it. Definitions are taken from Pomerantz, J. (2015). Metadata. MIT Press.
Metadata: "metadata is a statement about a potentially informative object" (Pomerantz, 26). This definition is more nuanced than the commonly used "data about data".
Metadata Schema: "a set of rules according to which a language operates" (Pomerantz, 30). Example: Dublin Core metadata schema
Syntax Encoding Schema: a set of rules that dictate how to represent, or encode, a specific type of data at the individual level (Pomerantz). This changes how you structure the data even if the data itself isn't changing. Example: ISO 8601, a standard for encoding dates and times in a consistent format.
Controlled Vocabulary: a set of rules that dictate how to represent a specific type of data, at the individual metadata level (Pomerantz). While syntax encoding cares about how a string is formatted (but not what is in the string), a controlled vocabulary provides a finite set of strings to be used. Example: RxNorm, a terminology that normalizes names for clinical drugs and links those names to commonly used drug vocabularies.
Ontology: Information science defines ontology as "a formal representation of the universe of things that exist in a specific domain" (Pomerantz, 46). In the context of research data, we use the information science definition of ontology and not the philosophical definition. Example: Oral Health and Disease Ontology, used for representing the diagnosis and treatment of dental maladies.
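To make the difference between a syntax encoding and a controlled vocabulary more concrete, here is a minimal Python sketch. It is only an illustration under some assumptions: the record uses a few Dublin Core-style element names (`title`, `creator`, `date`, `type`), the `RESOURCE_TYPES` list is a made-up vocabulary rather than an official one, and the function names are hypothetical.

```python
from datetime import date

# Hypothetical controlled vocabulary: a finite list of allowed strings.
# (Illustrative only; not an official vocabulary.)
RESOURCE_TYPES = {"Dataset", "Image", "Software", "Text"}


def is_iso8601_date(value: str) -> bool:
    """Syntax encoding check: does the string parse as an ISO 8601 date?

    date.fromisoformat accepts YYYY-MM-DD (newer Python versions also
    accept a few other ISO 8601 forms). It checks the format of the
    string, not whether the date is the right one for the record.
    """
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def validate(record: dict) -> list:
    """Return a list of problems found in a metadata record."""
    problems = []
    if not is_iso8601_date(record.get("date", "")):
        problems.append("date is not ISO 8601")
    # Controlled vocabulary check: the value must be one of the allowed strings.
    if record.get("type") not in RESOURCE_TYPES:
        problems.append("type is not in the controlled vocabulary")
    return problems


good_record = {
    "title": "Stream temperature readings",
    "creator": "Example Lab",
    "date": "2015-08-06",
    "type": "Dataset",
}
bad_record = {
    "title": "Stream temperature readings",
    "creator": "Example Lab",
    "date": "08/06/2015",  # a real date, but not ISO 8601 formatted
    "type": "dataset",     # close, but not one of the allowed strings
}

print(validate(good_record))  # []
print(validate(bad_record))
```

The second record describes the same dataset, yet it fails both checks: the date is in the wrong format (a syntax encoding problem) and the type is not one of the permitted strings (a controlled vocabulary problem).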
Here are some great resources for learning about metadata and standards: