Data Management Best Practices: Storage & Backups

Have you ever lost all of your digital files? Your photos, perhaps, or your music? These are horrible experiences, but losing all of your research could be much worse. Let's avoid testing that theory by making sure you don't lose your data.

A common best practice for backing up and storing your data is the 3-2-1 Rule, which says you should keep

3 copies of your data on

2 types of storage media, with

1 copy offsite

Having 1 copy offsite protects your data from local risks like theft, lab fires, flooding, or natural disasters.

Using 2 types of storage media improves the likelihood that at least one copy will be readable in the future should one media type become obsolete or degrade unexpectedly.

Having 3 copies helps ensure that your data will exist somewhere without being overly redundant. 
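As a rough illustration, the three parts of the rule can be written as a short checklist in Python. This is only a sketch: the `DataCopy` record, its field names, and the example locations are made up for illustration, and you would replace them with wherever your own copies actually live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    # All three fields are hypothetical labels chosen for this example.
    location: str     # where the copy lives, e.g. "lab laptop"
    media_type: str   # e.g. "internal disk", "network drive", "cloud"
    offsite: bool     # True if the copy is kept away from the lab


def check_3_2_1(copies):
    """Return a list of the 3-2-1 Rule requirements that are not yet met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies; need at least 3")
    media_types = {c.media_type for c in copies}
    if len(media_types) < 2:
        problems.append(
            f"only {len(media_types)} type(s) of storage media; need at least 2"
        )
    if not any(c.offsite for c in copies):
        problems.append("no copy is stored offsite")
    return problems


copies = [
    DataCopy("lab laptop", "internal disk", offsite=False),
    DataCopy("department networked drive", "network drive", offsite=False),
    DataCopy("cloud account", "cloud", offsite=True),
]

problems = check_3_2_1(copies)
print(problems if problems else "3-2-1 Rule satisfied")
```

An empty list means all three conditions hold. Dropping the cloud copy from the example would report both too few copies and no offsite copy.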

Storage Options

While working on your data, you'll likely be saving your files on your desktop computer or laptop. Make sure to save often, but also keep master copies in another location in case your computer crashes, is stolen, or falls victim to other unfortunate events.
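If you copy master files to another location yourself, it can be worth confirming that the copy really matches the original. The sketch below is one possible way to do that with Python's standard library, comparing SHA-256 checksums after the copy. The function names `sha256_of` and `copy_and_verify` and the file name `results.csv` are made up for this example, and it uses a temporary folder so it can be run safely as-is.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, backup_dir):
    """Copy a file into backup_dir and check that the checksums match.

    Note: this replaces any file of the same name already in backup_dir.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"checksum mismatch for {target}")
    return target


# Demonstration in a temporary folder; the file name is hypothetical.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "results.csv"
    original.write_text("id,value\n1,3.14\n")
    backup = copy_and_verify(original, Path(workdir) / "backup")
    backup_matches = sha256_of(original) == sha256_of(backup)

print(backup_matches)
```

In real use you would point `backup_dir` at a networked drive or external hard drive rather than a folder on the same computer.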

Networked drives are a good place for one copy of your data. They're managed by your school, department, or the university so they're generally quite stable.

Talk to your School or College about the storage available on your networked drives.  Consider asking:

  • how often networked drives are backed up 
  • how to recover data if there's loss
  • what the security is like
  • how much space you have, and
  • if they have any policies about deleting files from the drive 

Some offer automatic backup services that back up daily or at other intervals, but these services may not be designed to save your research data. SAS, for example, explicitly says this is not the purpose of its backup service and advises users to talk to their local support provider.

External hard drives are convenient places to keep a backup copy of your data. If you're working with sensitive data, you can even get encrypted external hard drives for added security.

It's best not to keep your external hard drive right next to your computer or other copies of your data. If there's a fire, flood, burglary, or other misfortune in the lab your external hard drive will face the same fate as your computer if they are co-located. 

It's also a good habit to label your external hard drives and keep a record somewhere of which hard drives have what data on them.
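The record can be as simple as a spreadsheet. As a minimal sketch, here is what such an inventory might look like in Python, saved as CSV text so it can be opened in any spreadsheet program. The column names, drive labels, and datasets are all invented for illustration.

```python
import csv
import io

# Hypothetical column names for the inventory.
FIELDS = ["drive_label", "dataset", "date_copied", "kept_at"]

# Hypothetical example entries: one row per dataset on each drive.
inventory = [
    {"drive_label": "HD-01", "dataset": "interview transcripts",
     "date_copied": "2024-03-01", "kept_at": "office cabinet"},
    {"drive_label": "HD-01", "dataset": "survey responses",
     "date_copied": "2024-03-01", "kept_at": "office cabinet"},
    {"drive_label": "HD-02", "dataset": "microscope images",
     "date_copied": "2024-04-15", "kept_at": "home"},
]


def datasets_on(drive_label, rows):
    """List the datasets recorded for one labeled drive."""
    return [row["dataset"] for row in rows if row["drive_label"] == drive_label]


def inventory_as_csv(rows):
    """Render the inventory as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(datasets_on("HD-01", inventory))
print(inventory_as_csv(inventory))
```

Keep the inventory itself somewhere other than on the drives it describes, so it is still available if a drive fails.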

Note that an external hard drive is not an archive for permanently storing your data. The hard drive will eventually break down. Migrate data to newer media every 3-5 years.


Some articles on caring for your external hard drive:

CDs and DVDs are probably the most common optical storage options (although there are many fascinating others). 

Like external hard drives, CDs and DVDs should not be kept adjacent to other copies of your data. They should also not be considered long-term storage, and data should be migrated to newer media every 3-5 years.


Some information about caring for CD and DVD storage:

Storing your data "in The Cloud" is an easy way to meet the "1 copy offsite" piece of the 3-2-1 Rule. Cloud storage is also convenient because you can often sync your files from your computer, making backing up a breeze. However, most cloud storage solutions are owned by private companies, so it's important to remember that (1) your data may not be private, as the company probably has the right to look at it and might have the right to do what it pleases with that data, and (2) the company may go out of business or otherwise become obsolete.

A note about syncing: While it's very handy to have your files automatically synced onto a cloud server, make sure the files on your computer are not automatically overwriting what's in the cloud. This video from Explaining Computers gives some terrifying reasons why this is important.
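One way to guard against accidental overwrites in a backup you manage yourself is to save each backup under a new, dated file name instead of replacing the previous one. The sketch below shows the idea in Python. It is an illustration only and does not describe how any particular cloud service behaves; the function name `versioned_copy`, the file name `survey.csv`, and the dates are made up.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def versioned_copy(source, backup_dir, now=None):
    """Copy a file into backup_dir under a timestamped name.

    Earlier backups are never replaced, so a damaged or emptied
    working file cannot wipe out a good older copy.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%dT%H%M%S")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.copy2(source, target)
    return target


# Demonstration in a temporary folder with hypothetical file names and dates.
with tempfile.TemporaryDirectory() as workdir:
    backup_dir = Path(workdir) / "backup"
    data = Path(workdir) / "survey.csv"
    data.write_text("good data\n")
    first = versioned_copy(data, backup_dir, now=datetime(2024, 1, 1, 9, 0, 0))
    data.write_text("")  # simulate the working file being accidentally emptied
    second = versioned_copy(data, backup_dir, now=datetime(2024, 1, 2, 9, 0, 0))
    earlier_version_survives = first.read_text() == "good data\n"
    names = sorted(p.name for p in backup_dir.iterdir())

print(names)
print(earlier_version_survives)
```

The trade-off is storage space: dated copies accumulate, so you would need to decide how many older versions to keep.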

Full- or part-time faculty, students, and staff at Penn have access to a Penn+Box account with 50 GB of free storage. Penn+Box has been reviewed and vetted by Penn's ISC Information Security and has been approved for storing confidential data, FERPA information, and, with IRB approval, human subject research data. More storage is available for purchase through your local support provider. The FAQs for Penn+Box have great information. Note that, per the terms of service, "Educational Institution will have the right to access Your Data in accordance with the Institutional Policies" - meaning the university can look at what you store in Penn+Box.

A Note about Flash Drives

Flash drives are very convenient places to store data. However, flash drives, like all storage media, degrade over time. They are also very small and easily lost or broken. For this second reason especially, it is not recommended that one of your 3 copies of your data be stored on a flash drive.

Backing up your data is a bit like flossing: You know you should do it, but it's hard to start doing it consistently. Once you get into the habit, though, it will come naturally to you. Have a schedule for backing up your data and decide who's responsible for doing it. Will you back it up at the end of every day? At lunch? Will you do it? Will your RAs?

You can (and probably should) have different plans for the period when you are still collecting and analyzing your data and for after your project has been completed. It helps to have these plans written down and available for anyone in your lab or research group to read.

Thinking Long Term

All media types break down over time. Best practice is to migrate your files to new storage (and, if appropriate, new file types*) every 3-5 years.

 

*This may be complex. Please ask us for pointers if you've never done this before.
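To keep track of the 3-5 year window, you could note the date each copy was written to its current media and check it from time to time. Here is a minimal sketch in Python, assuming you have recorded that date yourself. The function name `migration_status` and the status labels are invented for this example; the 3 and 5 year thresholds come from the guideline above.

```python
from datetime import date


def migration_status(stored_on, today, min_years=3, max_years=5):
    """Classify a copy by how long it has been on its current media.

    Returns "ok" (under 3 years), "due" (3 to 5 years),
    or "overdue" (5 years or more).
    """
    age_years = (today - stored_on).days / 365.25
    if age_years >= max_years:
        return "overdue"
    if age_years >= min_years:
        return "due"
    return "ok"


# Hypothetical dates for illustration.
today = date(2024, 6, 1)
print(migration_status(date(2023, 1, 1), today))
print(migration_status(date(2020, 1, 1), today))
print(migration_status(date(2018, 1, 1), today))
```

Adding a "date written" column to your drive records gives you the input this check needs.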

Also search Research DataQ for answers