The University of Massachusetts Amherst

Managing Your Data

Take care of the products of your research -- the tips here will help your work be available long into the future!

Prepare for long-term storage

Prepare your data for future use and long-term storage. 

A little investigating at the beginning of your project can save you time and frustration at the end. Using stable file formats, checking whether your repository charges a fee or has deposit requirements, and sharing well-documented datasets can all prepare you for successful long-term storage -- and future use -- of your data. 

Furthermore, not everything needs to be shared or preserved! Iterative versions of your data or code -- while valuable to your process -- don’t necessarily need to be preserved. If a final, clean version of your data or code accurately represents information in your publication, that is what you should focus on sharing. Your final versions should be shared in stable file formats with an associated readme file.
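
As an illustration of the readme file mentioned above, here is a minimal Python sketch that assembles a plain-text readme for a set of final files. The function name `build_readme`, the fields it records, and the example dataset are all hypothetical; your repository may ask for different or additional information.

```python
from datetime import date


def build_readme(title, creator, files, today=None):
    """Return plain-text readme content for a dataset.

    `files` maps each file name to a one-line description.
    The fields used here are an assumed minimum, not a standard.
    """
    today = today or date.today()
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Date prepared: {today.isoformat()}",
        "",
        "Files:",
    ]
    # List files alphabetically so the readme is easy to scan.
    for name, description in sorted(files.items()):
        lines.append(f"  {name}: {description}")
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
readme_text = build_readme(
    "Example stream temperature dataset",
    "Example Lab",
    {
        "temps_final.csv": "Cleaned hourly temperatures, one row per reading",
        "analysis.py": "Script that produces the figures in the publication",
    },
    today=date(2024, 1, 15),
)
print(readme_text)
```

Saving the returned text as a plain `.txt` file next to the data keeps the documentation in a format that is itself stable.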

Questions to think about


  • What file formats will you use to store and share your data?
  • How long will your data be kept in your selected repository?
  • How long will your data be kept in your lab?
  • Are there any additional resources or requirements to prepare your data for deposit?
  • Are there any charges you need to plan for in order to deposit your data?
  • Have you consulted a repository when writing your data management plan?
  • What data will be of use to others in the future?
  • Determine what data needs to be shared:
    • Are they observational data that cannot be reproduced?
    • Are they difficult to reproduce? Did your analyses require a lot of supercomputing time, or were very specialized instruments needed to create the data?
    • Do they underlie a publication?
    • Are they null results that tell you something about a process or a procedure?
  • What documentation will be necessary for you or others to understand your data in the future?
    • Think about how your folders are organized, how your files are organized, and what is contained within each folder.
  • Consider the value of your data:
    • Observational data that cannot be reproduced may need to be stored in perpetuity.
    • Simulations may only require the source code, initial conditions, and verification data.
      • However, if simulations are time- or resource-intensive to produce, the models or results may need to be stored in perpetuity. 
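
One way to start on the documentation and file-format questions above is to take an inventory of a project folder. The Python sketch below lists every file with its size and notes whether its extension is in a set of formats assumed to be stable. The `ASSUMED_STABLE` set and the `inventory` function are illustrative assumptions, not a list endorsed by any repository; check your chosen repository's preferred formats before relying on it.

```python
import tempfile
from pathlib import Path

# Assumption: an illustrative set of open, widely readable formats.
# Replace it with the formats your repository actually recommends.
ASSUMED_STABLE = {".csv", ".txt", ".json", ".xml", ".pdf", ".tif", ".tiff", ".png"}


def inventory(root):
    """List every file under `root` with its size and a format flag."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rows.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": path.stat().st_size,
                "stable_format": path.suffix.lower() in ASSUMED_STABLE,
            })
    return rows


# Demonstration on a throwaway folder with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "data").mkdir()
    (base / "data" / "results_final.csv").write_text("id,value\n1,3.2\n")
    (base / "data" / "results_final.xlsx").write_bytes(b"placeholder")
    (base / "README.txt").write_text("See data/ for the final tables.\n")
    rows = inventory(base)

for row in rows:
    note = "in assumed stable set" if row["stable_format"] else "check format"
    print(f'{row["path"]} ({row["bytes"]} bytes): {note}')
```

The resulting list can be pasted into a readme file to describe what each folder contains, and the flagged files show where a conversion might be worth considering before deposit.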

Further reading