Data Management Plan

Many federal agencies, including the National Institutes of Health (NIH) and, most recently, the National Science Foundation (NSF), require that grant applications contain data management plans for projects involving data collection. Beginning January 18, 2011, proposals submitted to NSF must include a supplementary document of no more than two pages labeled “Data Management Plan” (DMP). This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results. According to the NSF Grant Proposal Guide, the DMP will be reviewed as an integral part of the proposal. Proposals that do not include a DMP cannot be submitted.

Elements of a Good Data Management Plan include:

Brief, high-level description of the information to be gathered; the nature, scope and scale of the data that will be generated or collected.
Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats.
Plans for archiving and sharing the data, and the reasons for choosing that particular option. These should include a description of, and rationale for, any restrictions on who may access the data and under what conditions, along with a timeline for providing access. They should also describe the resources and capabilities (equipment, connections, systems, expertise, repositories, etc.) needed to meet anticipated requests. These resources and capabilities should be appropriate for the projected usage and should address any special requirements, such as those associated with streaming video or audio or the movement of massive data sets.
Statement of plans for metadata content and format, including a description of documentation plans and the rationale for selecting appropriate standards. Existing, accepted standards should be used where possible. Where standards are missing or inadequate, alternate strategies for enabling data re-use and re-purposing should be described.
Statement of plans, where appropriate and necessary, for protection of privacy, confidentiality, security, intellectual property and other rights.
A description of technical and procedural protections for information, including confidential information, and how permissions, restrictions, and embargoes will be enforced.
A description of how data will be selected for archiving, how long the data will be held, and plans for the eventual transition or termination of the data collection in the future.
Description of plans for preserving data in accessible form. Plans should include a timeline proposing how long the data are to be preserved, outlining any changes in access anticipated during the preservation timeline, and documenting the resources and capabilities (e.g., equipment, connections, systems, expertise) needed to meet the preservation goals. Where data will be preserved beyond the duration of direct project funding, a description of other funding sources or institutional commitments necessary to achieve the long-term preservation and access goals should be provided.
Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data.
Names of the individuals responsible for data management in the research project. This is particularly important when working with multiple PIs and/or collaborative partners.
The costs of preparing data and documentation for archiving and how these costs will be paid. Requests for funding may be included, depending on the agency (e.g., per NSF guidance).
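
The elements above can double as a completeness checklist while drafting. The following is a minimal sketch, not an NSF or university tool: the section keys are hypothetical labels paraphrasing the list above, and the sample draft text is made up for illustration.

```python
# Hypothetical section keys, one per element in the list above.
DMP_ELEMENTS = [
    "data_description",
    "formats",
    "archiving_and_sharing",
    "metadata",
    "privacy_and_rights",
    "access_protections",
    "selection_and_retention",
    "preservation",
    "storage_and_backup",
    "responsible_individuals",
    "costs",
]


def missing_elements(draft: dict[str, str]) -> list[str]:
    """Return the elements that the draft omits or leaves blank."""
    return [key for key in DMP_ELEMENTS if not draft.get(key, "").strip()]


# Illustrative draft only; the contents are invented for this example.
draft = {
    "data_description": "Survey responses from about 500 participants.",
    "formats": "CSV for tabular data; PDF/A for documentation.",
    "storage_and_backup": "   ",  # whitespace only, so it counts as missing
}

for key in missing_elements(draft):
    print(f"Still to write: {key}")
```

A check like this only confirms that each section has some text; it says nothing about whether the content satisfies a particular agency's guidance or the two-page limit.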

Marshall University Data Management Information

Marshall University provides a Central Data Center (MU Datacenter) on its main campus in Huntington, WV in support of administrative, instructional, and research computing. This data center is powered by a power distribution system with UPS and generator facilities for continuous operation. The data center is cooled with a redundant and independent cooling system. Physical security is provided by card access control and video security monitoring as well as individual locked cabinets to secure host servers and storage for independent projects.

The Data Center hosts switched gigabit and ten gigabit server connections as part of a dedicated network secured from the campus network with Cisco firewalls. Data transfers can be secured by VPN, SSL, and SSH.

The MUnet campus network has over 11,000 switched gigabit network connections and a 10Gb backbone. The MUnet campus networks are connected via dual, diverse-path 10Gb connections through our Internet Service Provider (ISP), the Ohio Academic and Research Network (OARnet). Marshall University is also a member of Internet2 and is connected to Internet2 with 7Gb of service. A total of 3Gb of commodity Internet service is currently provided to MUnet subscribers. This bandwidth and redundancy provide the reliability and services needed to support current campus initiatives.

The Data Center currently has a single HPC cluster with over 1 Tflop of compute capacity and extends these services through the use of other Internet2-connected resources such as the TeraGrid. Storage is provided by a Dell/EMC all-SSD storage area network (SAN). Backup services use remote-site disk-to-disk backup.

A research portal for data sharing and collaboration is currently in the pilot phase and is based on HUBZero. Storage and compute services are available on a cost-recovery basis to support research projects, according to a published IT Rate Schedule. The university Information Technology Council provides a link to the university IT Policies (privacy, confidentiality, security, intellectual property rights, copyright, etc.).

Example Data Management Plans

NSF Data Management Plan Templates and Examples

When preparing your Data Management Plan (DMP) for your NSF grant application, you can follow these steps:


Examples:


More Templates:

  • Directorate-specific templates for NSF data management plans from the University of Virginia Library Scientific Data Consulting Group. These are very useful, but remember that they are tailored to the UVa community.
  • Integrated Earth Data Applications (IEDA) Data Management Tool is an online form you can fill out to help generate your data management plan. The form is for the earth sciences.
  • Data Conservancy recognizes the need for institutional and community solutions to digital research data collection, curation and preservation challenges. DC tools and services incentivize scientists and researchers to participate in these data curation efforts by adding value to existing data and allowing the full potential of data integration and discovery to be realized.