Research Data Management at Princeton

An overview of best practices for managing research data

Data Citation

Citing Data

Citing research data in the same way as traditional scholarly works helps ensure proper attribution, improves reproducibility and discoverability, and gives credit for data as a scholarly output. According to Force11's examples for the Joint Declaration of Data Citation Principles and the Digital Curation Centre's How to Cite Datasets and Link to Publications, data should be cited as follows:

  • Include an in-text citation near the claims that rely on the data, in the citation style required by the publisher. The in-text citation may also include additional information, such as the portion of the dataset used. Force11 gives the example: [Author(s), Year, Portion or Subset of Data Used].
  • Include full citations in the reference list, following the format of the required citation style. Ball and Duke provide a comprehensive list of data citation elements. If the style has no format for data, Force11's examples recommend: Author(s), Year, Dataset Title, Data Repository or Archive, Version, Global Persistent Identifier.
  • Give permanent identifiers, such as DOIs or ARKs, in the form of a linked URL if possible.
  • Cite datasets at the most detailed level possible and provide the version if appropriate.
  • When citing a dataset, notify the repository if possible so that it can add a link to your paper.
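
As a rough illustration of the fallback format recommended in Force11's examples, the following Python sketch assembles a reference-list entry from the six elements in the order given above. The function name, the punctuation between elements, and the sample values are all hypothetical and chosen only for illustration; where a publisher requires a citation style, that style takes precedence.

```python
def format_data_citation(authors, year, title, repository,
                         version=None, identifier=None):
    """Assemble a data citation in the order: Author(s), Year, Dataset Title,
    Data Repository or Archive, Version, Global Persistent Identifier.

    The separators used here are an assumption, not a mandated style.
    """
    # Multiple authors are joined with semicolons so that commas inside
    # a name ("Doe, J.") stay unambiguous.
    parts = ["; ".join(authors), str(year), title, repository]
    if version:
        parts.append(f"Version {version}")
    if identifier:
        parts.append(identifier)
    return ", ".join(parts)


# Hypothetical sample values (not a real dataset or identifier).
example = format_data_citation(
    authors=["Doe, J.", "Roe, R."],
    year=2020,
    title="Example Survey Data",
    repository="Example Repository",
    version="2",
    identifier="https://doi.org/10.1234/example",
)
print(example)
```

Version and identifier are optional in this sketch because not every dataset has them; when a persistent identifier exists, it should be included.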

Further Resources

Permanent Unique Identifiers

Persistent Identifiers, also called Permanent Identifiers, provide a permanent link to a dataset or other digital object regardless of any hardware or domain changes a repository may make over time. Persistent Identifiers are generally assigned when data is deposited into a repository. Several types of persistent identifiers are currently in use, including Handles (HDLs), Archival Resource Keys (ARKs), Persistent URLs (PURLs), and Digital Object Identifiers (DOIs). Most researchers are familiar with DOIs, as this is the system used for most electronic journal articles. For more information on repositories, visit the section on Preservation.
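
To show how an identifier of one of these types can be given as a linked URL, here is a minimal Python sketch that prepends a public resolver address to the identifier. The function name and sample identifiers are hypothetical, and the choice of resolvers (doi.org for DOIs, hdl.handle.net for Handles, n2t.net for ARKs) is an assumption; a repository may publish its own preferred resolver URL, which should be used instead when it does.

```python
# Public resolvers assumed for this sketch; a repository may prefer its own.
RESOLVERS = {
    "doi": "https://doi.org/",
    "hdl": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
}


def identifier_to_url(scheme, value):
    """Return a linkable URL for a persistent identifier.

    scheme: one of "doi", "hdl", "ark", or "purl" (case-insensitive)
    value:  the identifier itself, e.g. "10.1234/example" for a DOI
    """
    scheme = scheme.lower()
    if scheme == "purl":
        # A PURL is already a URL, so it is returned unchanged.
        return value
    if scheme not in RESOLVERS:
        raise ValueError(f"Unknown identifier scheme: {scheme}")
    return RESOLVERS[scheme] + value


# Hypothetical sample identifiers, for illustration only.
print(identifier_to_url("doi", "10.1234/example"))
print(identifier_to_url("ark", "ark:/12345/x9example"))
```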

DataSpace at Princeton provides ARKs for datasets.

For more information on Persistent Identifiers, visit the California Digital Library's webpage on Understanding Identifiers.