
Research Data Management at Princeton

An overview of best practices for managing research data

Data Capture

Data capture should always be done in a consistent way. Develop standard operating procedures that clearly define the steps to be taken and outline roles and responsibilities. Standard operating procedures are useful even for single-person projects because they keep practice consistent over time. In addition to information on experimental setup, standard operating procedures should indicate when to create documentation, where files are stored, and how files are named.
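As one illustration of the file-naming part of a standard operating procedure, the sketch below builds and checks file names against a single pattern. The convention itself (project, capture date, short description, two-digit version) and the function names `build_file_name` and `follows_convention` are assumptions made for this example, not a Princeton requirement; adapt the pattern to whatever your own procedure specifies.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
NAME_PATTERN = re.compile(r"^[a-z0-9]+_\d{8}_[a-z0-9-]+_v\d{2,}\.[a-z0-9]+$")


def build_file_name(project, captured_on, description, version, extension):
    """Return a file name that follows the hypothetical convention above."""
    # Lowercase everything and drop characters that cause trouble across systems.
    project_id = re.sub(r"[^a-z0-9]+", "", project.lower())
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    ext = extension.lower().lstrip(".")
    return f"{project_id}_{captured_on:%Y%m%d}_{slug}_v{version:02d}.{ext}"


def follows_convention(file_name):
    """Return True if an existing file name matches the convention."""
    return NAME_PATTERN.match(file_name) is not None


if __name__ == "__main__":
    name = build_file_name("Soil Lab", date(2024, 3, 5), "Plot A moisture", 1, "csv")
    print(name)
    print(follows_convention(name))
    print(follows_convention("Final data (2).csv"))
```

Putting the date in year-month-day order means an alphabetical file listing is also chronological, and an explicit version number avoids names such as "final" or "final2".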

Dataset Documentation (Metadata)

Describing and documenting data is one of the best ways to ensure they will be discoverable and usable in the future. This documentation is often called metadata and covers the who, what, when, where, why, and how of the dataset's generation.

In general, two types of documentation help ensure usability far into the future: descriptive or study-level documentation, and structural or data-level metadata (NISO and UK Data Archive). Metadata standards specify what information should be collected and help data in specific disciplines to be interoperable. The Research Data Alliance hosts a community-maintained list of Disciplinary Metadata Standards.

Documentation needs will vary by project and discipline. If there isn't an existing standard, create a template that records all the important details of the data. At a minimum, it should include the following (based on MIT Documentation and Metadata guidance and UK Data Archive Study Level Documentation):

  • Title
  • Creator: names and addresses of data creator(s)
  • Identifier: can be a permanent identifier or an internal project number.
  • Funder
  • Rights: Intellectual property or licensing rights for the data.
  • Access Information
  • Language
  • Dates
  • Project description
  • Methodology: how the data were generated.
  • Data Structure: including relationships between files.
  • Variable names or other data-level documentation (if not self-evident or embedded in the file).
  • Data Citation: Preferred format for citing data.

Documentation can be included in a README.txt file (or other relevant file name) in the folder with the data files. The README.txt file should accompany the data if files are moved or deposited in a repository. Be sure the README file includes information such as the file and folder hierarchy and any other external context for the data.
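Where no disciplinary standard applies, a README template can be generated with a short script. The sketch below is a minimal example: the field names come from the minimum list above, while the function names `render_readme` and `write_readme`, the plain "Field: value" layout, and the "TODO" placeholder are assumptions chosen for illustration.

```python
from pathlib import Path

# Minimum fields, taken from the list in this guide.
README_FIELDS = [
    "Title", "Creator", "Identifier", "Funder", "Rights",
    "Access Information", "Language", "Dates", "Project description",
    "Methodology", "Data Structure", "Variable names", "Data Citation",
]


def render_readme(values):
    """Return (readme_text, missing_fields) for a dict of field values."""
    lines = []
    missing = []
    for field in README_FIELDS:
        value = values.get(field, "").strip()
        if not value:
            # Keep the field visible in the file so it is not forgotten.
            missing.append(field)
            value = "TODO"
        lines.append(f"{field}: {value}")
    return "\n".join(lines) + "\n", missing


def write_readme(data_folder, values):
    """Write README.txt into the data folder and return the missing fields."""
    text, missing = render_readme(values)
    readme_path = Path(data_folder) / "README.txt"
    readme_path.write_text(text, encoding="utf-8")
    return missing


if __name__ == "__main__":
    text, missing = render_readme({"Title": "Example dataset", "Language": "English"})
    print(text)
    print("Still to fill in:", ", ".join(missing))
```

Returning the list of missing fields makes it easy to check a README for completeness before the data are moved or deposited in a repository.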