You may have heard people tell you to create metadata to go along with your data. You may have blocked this out. The point of this recommendation is to keep your data understandable and usable in the future, whether for you and your lab members or for a wider audience should you share your data outside the lab. There are many ways to document your data beyond using metadata, though, and more information on all of them is here. If you have questions, please ask!
Keep a file with information about your project in the same folder as your other files. A rule of thumb is to write as much information as necessary to understand your data.
Project Level
File Level
Technical Description
Access
There are long lists of metadata schemas for many disciplines, and each of those schemas has lots and lots of documentation that someone expects you to read. My guess is you don't want to read that documentation. Please ask for help if you need to write some serious metadata and are overwhelmed.
Discipline | Standard | Tools |
---|---|---|
General Research Data | Dublin Core | Dublin Core Generator |
Social Sciences | Data Documentation Initiative (DDI) | see all DDI tools |
Ecology, Geosciences, & Biology | Ecological Metadata Language (EML); DarwinCore is also common, but no tool seems to exist to create metadata in this standard | see all EML tools |
Geographic | ISO 19115-2014 | Federal Geographic Data Committee (FGDC) tools for metadata creation |
This table was inspired by Oregon State University Libraries' guide on Metadata/Documentation.
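If you're curious what a metadata record actually looks like, here is a minimal sketch of a Dublin Core record (the general-purpose standard in the table above) built with Python's standard library. The element names and the namespace URI come from the Dublin Core element set; the `metadata` wrapper element, the function name, and every field value are assumptions made up for illustration, and a repository may expect a different wrapper or additional fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, str]) -> str:
    """Return a small XML record with one dc: element per field.

    The <metadata> wrapper is an assumption for this sketch; check what
    wrapper (if any) your repository expects.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values, invented for illustration only.
record = dublin_core_record(
    {
        "title": "Example dataset",
        "creator": "Researcher, A.",
        "date": "2020-01-01",
        "description": "What the data are and how they were collected.",
        "rights": "Reuse terms go here.",
    }
)
print(record)
```

A tool like the Dublin Core Generator listed in the table does the same job through a web form, so treat this as a way to see the structure rather than something you need to run.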
ReadMe files should be used to describe your project and your data. When depositing data into repositories, you'll likely include a ReadMe file that just explains the files you've deposited. When you're keeping ReadMe files for your own records, it's good to have a top-folder ReadMe that explains all the subfolders and files that are part of the project, plus additional ReadMe files for lower-level folders and files.
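One way to get a top-folder ReadMe started is to generate a skeleton that lists every subfolder and file, then fill in the descriptions by hand. The sketch below does that with Python's standard library. The function name `build_readme`, the plain-text layout, and the "TODO describe" placeholder are all assumptions for illustration, not a format any repository requires.

```python
from pathlib import Path


def build_readme(project_dir: str, title: str) -> str:
    """Return ReadMe text listing every folder and file under project_dir.

    Hidden files and folders (names starting with ".") are skipped.
    Each entry gets a placeholder for you to replace with a description.
    """
    root = Path(project_dir)
    lines = [title, "=" * len(title), "", "Contents", "--------"]
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if any(part.startswith(".") for part in rel.parts):
            continue
        indent = "  " * (len(rel.parts) - 1)  # indent by folder depth
        label = f"{rel.name}/" if path.is_dir() else rel.name
        lines.append(f"{indent}- {label}: TODO describe")
    return "\n".join(lines) + "\n"
```

To use it, call something like `Path("README.txt").write_text(build_readme(".", "My Project"))` from the top folder of your project, then replace each placeholder with a sentence about what the file or folder contains.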
These two resources give great overviews of ReadMe files and guidance on how to create them:
Here's some guidance from two popular repositories that recommend and use ReadMe files:
Codebooks are documents that explain the variables in your dataset. ICPSR suggests that these documents should note:
See Also:
Data Dictionaries are very similar to (and arguably the same as) codebooks. DataQ has a great entry on data dictionaries written by Yasmeen Shorish:
This video from Kristin Briney is one of the most descriptive, yet concise, resources explaining data dictionaries: https://www.youtube.com/watch?v=Fe3i9qyqPjo . The video details what should go into the dictionary (variable or field names, units, relationships to other variables, data types, what people need to make sense of a researcher's work) and explains why a researcher might want one. It also gives examples of what a data dictionary looks like. The same author has a blog post on the topic, in case you prefer text to video: http://dataabinitio.com/?p=454
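If your data live in a spreadsheet or CSV file, you can draft a data dictionary automatically and then fill in the parts only you know. The sketch below, using Python's standard library, reads the column names, makes a rough guess at each data type, and leaves units, description, and related variables blank. The function names, the output column names, and the three-way type guess (integer, decimal, text) are assumptions for illustration; the sample data are invented.

```python
import csv
import io


def infer_type(values):
    """Guess a rough data type from a column's non-empty values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "unknown"
    for caster, name in ((int, "integer"), (float, "decimal")):
        try:
            for v in non_empty:
                caster(v)
            return name
        except ValueError:
            continue
    return "text"


def data_dictionary_skeleton(csv_text):
    """Return one dictionary entry per column, with blanks to fill in by hand."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [(row[name] or "") for row in rows]
        entries.append(
            {
                "variable": name,
                "data_type": infer_type(values),
                "units": "",
                "description": "",
                "related_variables": "",
            }
        )
    return entries


# Invented sample data, for illustration only.
sample = "site,temp_c,count\nA,12.5,3\nB,,4\n"
for entry in data_dictionary_skeleton(sample):
    print(entry["variable"], entry["data_type"])
```

The type guess is only a starting point. Units, meanings, and relationships between variables can't be inferred from the file, which is exactly why the dictionary needs a human author.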
For those looking at data dictionaries from a relational database perspective, this video tutorial provides stepwise instruction: https://www.youtube.com/embed/QRMUReSENjU
A robust and technical definition of a data dictionary from an LIS encyclopedia may be useful for some researchers and librarians: "Data Dictionary (Metadata Dictionary): A subsystem of a database that records the definitions (semantics) for all the metadata elements used in a database. A data dictionary may also include detailed documentation about the relationships among metadata elements, as well as syntax and schema application rules. The term data dictionary comes from the relational database community and may be viewed as a type of metadata specification." Drake, M. A. (2003). Metadata in the World Wide Web. In Encyclopedia of library and information science (2nd ed.). New York: Marcel Dekker.