
NSF Data Management Plan Help: NSF DMP Word document template

Introduction to DMPs

"Data management plans (DMPs) are a useful way of ensuring that research data outputs are properly prepared for preservation and re-use... Were researchers asked to follow this guidance in full, there is a danger they may regard DMPs as nothing more than added bureaucracy, instead of valuable tools for producing re-usable data."

 Alex Ball (2010)

NSF DMP Guiding Questions PDFs

NSF Data Management Template

Data Management Plan Template: NSF has stressed that the content of a successful Data Management Plan will be determined by peer review and by standards that represent best practices for a discipline. However, it has suggested broad categories of content. You can use the attached templates to create your data management plan.

These documents are based on a template created by the Division of Research Development and Administration at the University of Michigan and have been modified with permission for use at Princeton University. The first template is for general use. The second template includes sample text for researchers who will be using Princeton's DataSpace.

For questions about these templates, contact Willow Dressel or Anne Langley.

For questions about DataSpace or the DataSpace Template, please contact Mark Ratliff, Digital Repository Architect, phone: (609) 258-0228, or Serge Goldstein.

NSF DMP Guiding Questions

We invite you to use the questions below to guide your thinking as you write a Data Management Plan for an NSF grant. These questions were adapted from the Digital Curation Centre's Data Management Plan Template to fit the NSF requirements. A PDF version of this document is available in the box on the left.

1. The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.

1.1. What data types will you be creating or capturing?

1.2. What other types of software, physical collections, samples, curriculum materials, etc. will you be creating or capturing?

2. The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).

2.1. Which file formats will you use and why? List format(s) of the data, e.g. FITS, SPSS, HTML, JPEG, and any software required to read the data. For examples of acceptable data formats see the Data Formats Table from the UK Data Archive.

2.2. Metadata: Which metadata standards will you use? Metadata are a subset of core data documentation, which provides standardized, structured information explaining the purpose, origin, time references, geographic location, creator, access conditions, and terms of use of a data collection. Researchers can choose among various metadata standards, often tailored to a particular file format or discipline. One such standard is DDI (the Data Documentation Initiative), designed to document numeric data files. Other examples of metadata standards are the Dublin Core Metadata Initiative, the Metadata Encoding and Transmission Standard (METS), ISO 19115 for Geographic Information, the Qualitative Data Exchange Format (QuDEx) v. 3, and Statistical Data and Metadata eXchange (SDMX).

2.2.1. What contextual details are needed to make the data you capture or collect meaningful? Documenting this ensures that data can be understood during research projects, that researchers continue to understand data in the longer term and that re-users of data are able to interpret the data.

2.2.2. How will you create or capture these metadata?

2.2.3. What form will the metadata take?

2.2.4. To what extent will metadata creation be automated?

2.3. If no metadata standards exist or they are inadequate, how will you address this?
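Questions 2.2.2 through 2.2.4 ask how metadata will be captured, what form they will take, and how far their creation can be automated. As one possible illustration, the sketch below generates a simple record that uses the Dublin Core element set mentioned in 2.2. It is a minimal example only: the function name `dublin_core_record`, the wrapping `metadata` element, and all sample values are hypothetical, and a real project should follow the profile required by its chosen repository.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Build a simple XML record from a dict of Dublin Core element -> value(s).

    `fields` maps element names (e.g. "title") to a string or list of strings.
    The <metadata> wrapper element is an assumption made for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for element, values in fields.items():
        if isinstance(values, str):
            values = [values]  # allow a single value or several
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values, for illustration only.
record = dublin_core_record({
    "title": "Example survey dataset",
    "creator": ["A. Researcher", "B. Researcher"],
    "date": "2012-01-15",
    "format": "text/csv",
    "rights": "Access restricted until publication",
})
print(record)
```

A script along these lines could be run each time a data file is produced, so that the descriptive fields are filled in consistently rather than by hand.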

3. Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.

3.1. Access, data sharing:

3.1.1. Will you share the data you capture or create?

3.1.2. Will any permission restrictions need to be placed on the data?

3.2. Ethical, privacy, and intellectual property issues:

3.2.1. Are there ethical and privacy issues?

3.2.2. If so, how will these be resolved?[1]

3.2.3. Is the data ‘personal data’ in terms of HIPAA[2]?

3.2.4. What have you done to comply with your obligations under HIPAA?

3.2.5. Is the dataset covered by copyright or license? If so, who owns the copyright and other intellectual property?[3]

3.3. Security:

3.3.1. How will you manage access arrangements and data security?

3.3.2. How will you enforce permissions, restrictions, and embargos?

3.3.3. Other security issues?[4]

4. Policies and provisions for re-use, re-distribution, and the production of derivatives.

4.1. Which bodies/groups are likely to be interested in the data?

4.2. Are there any reasons not to share or re-use the data?[5]

4.3. If the data is shared and re-use is allowed, how and when will you make it available?

4.4. How will the dataset be licensed if rights exist?[6]

4.5. Do you plan to publish findings which rely on the data?

5. Plans for archiving data, samples, and other research products, and for preservation of access to them.

5.1. Which archive/repository/central database/data center have you identified as a place to deposit data? (see Princeton's DataSpace option)

5.2. What is the long-term strategy for maintaining, curating, and archiving the data? (see Princeton's DataSpace option)

5.3. What procedures does your intended long-term data storage facility have in place for preservation and backup?[7] (see Princeton's DataSpace option)


 

[1] E.g. anonymization of data, institutional ethical committees, formal consent agreements.

[2] The Health Insurance Portability and Accountability Act (HIPAA) “protects the privacy of individually identifiable health information.” http://www.hhs.gov/ocr/privacy/

[3] Ideally, this should address the risk of movement of staff between institutions mid-project.

[4] Should address (where relevant) sensitive data, off-network storage, storage on mobile devices (laptops, smartphones, flash drives, etc.), policy on making copies of data, etc.

[5] E.g. ethical, non-disclosure, quality-related etc.

[6] E.g. any restrictions or delays on data sharing needed to protect intellectual property, copyright or patentable data.

[7] How regular, by whom, methods used (e.g. format normalization, migration…)

Additional Questions

For each of the five NSF Data Management Plan sections, these are additional questions you may want to consider as you create your plan and manage your data. These questions were adapted from the Digital Curation Centre's Data Management Plan Template to fit the NSF requirements. A PDF version of this document is available in the box on the left.

1. Data types

o   How will you capture or create the data?

o   Existing and new data:

    -  Have you surveyed existing data, in your own institution and from third parties?

    -  What existing datasets could you use or build upon?

    -  Are there any access issues?

    -  What ‘added value’ will the new data you create or capture provide to existing datasets?

    -  Why do you need to capture or create new data?

    -  What is the relationship between new dataset(s) and existing data?

    -  How will you manage integration between the data being gathered in the project and preexisting data sources?

o   Anticipated quantity of data or other materials.

2. Standards

o   Glossary of terms describing data, equipment, processes, etc.

3. Access and sharing policies

o   If you plan to publish findings that rely on the data, do your prospective publishers place any restrictions on other avenues of publication?

4. Re-use and re-distribution policies

o   If you plan to publish findings that rely on the data, do your prospective publishers place any restrictions on other avenues of publication?

5. Archiving plans

o   On what basis will data be selected for preservation?

o   How long will (or should) data be kept beyond the life of the project?[1]

o   How will you dispose of/transfer sensitive data?[2]

o   Appraisal and retention timeframes (ideally with definite figures)[3]

o   What transformations will be necessary to prepare data for preservation/data sharing?[4]

o   What related (representation) information will be deposited?[5]

o   What metadata/documentation will be created at each stage of deposit/transformation?[6]

o   How will this be created and by whom?

o   Will you include links to published materials and/or outcomes?

o   How will you address the issue of persistent citation?

o   Anticipated data volumes.[7]

o   Storage:

    -  Where (physically) will you store the data?

    -  On what media will you store the data?

    -  Whose responsibility is the storage of data?

o   Back-up:

    -  How will you back up the data?[8]

    -  How regularly will back-ups be made?

    -  Whose responsibility will this be?
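One way to make the back-up questions above concrete is to check periodically that the back-up copy still matches the working copy. The sketch below is an illustration under stated assumptions, not a recommended procedure: it assumes both copies are reachable as ordinary directories, uses SHA-256 checksums from Python's standard library, and the function names `sha256_of`, `build_manifest`, and `verify_backup` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_backup(original_root, backup_root):
    """List files that are missing from, or differ in, the back-up copy."""
    original = build_manifest(original_root)
    backup = build_manifest(backup_root)
    return sorted(
        name for name, digest in original.items()
        if backup.get(name) != digest
    )
```

Calling `verify_backup("data", "/mnt/backup/data")` (both paths hypothetical) returns an empty list when every file in the working copy has an identical counterpart in the back-up. A plan could state how often such a check runs and who reviews the result.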



[1] N.B. this may simply link to relevant institutional or funding body requirements/policies: political, temporal, commercial, legal.

[2] Include justification of decisions.

[3] N.B. this may simply link to relevant institutional or funding body requirements/policies: political, temporal, commercial, legal.

[4] E.g. data cleaning/anonymization where appropriate.

[5] E.g. references, reports, research papers, fonts, the original bid proposal, etc.

[6] E.g. descriptive, structural, administrative, preservation etc.

[7] Ballpark figures, orders of magnitude.

[8] Should address off-site storage.