Dataset Documentation Requirements
The documentation (i.e., the "Readme" file) that accompanies each project data set is as important as the data itself. It allows collaborators and other analysts to understand any limitations or special characteristics of the data that may affect its use, and complete documentation helps ensure long-term use of the data. Documentation must accompany every data set submission, both preliminary and final. Follow the outline and content below as closely as possible when preparing the documentation.
Dataset Documentation/Readme Outline
Title: This should match the data set name.
Author(s)
- Name(s) of the lead author and any co-authors. Indicate the lead and corresponding authors.
- For all authors, provide an ORCID iD (if available), email address, and institution/organization name
- For lead and corresponding authors, provide complete mailing address, telephone numbers, title or position, and website address, if applicable
1.0 Data Set Description
- Introduction or abstract
- Data version number and date
- Data Status (Preliminary or Final)
- Time period covered by the data
- Physical location (including lat/lon/elev) of the measurement or platform
- Data frequency: how often the data were collected (e.g., 5-minute, hourly, continuous)
- Data source (e.g., for operational data include agency), if applicable
- Web address references (e.g., project web site, etc.), if applicable
- Data set restrictions (e.g., whether the data set must be restricted, requires password protection, or contains personal information, plus a description of any licensing)
2.0 Instrument Description
- Brief text describing the instrument and how it collects data including references
- Figures (or links), if applicable
- Table of specifications (e.g., accuracy, precision, frequency, resolution)
3.0 Data Collection and Processing
- Description of data collection
- Description of derived parameters and processing techniques used
- Description of quality assurance and control procedures
- Data intercomparisons, if applicable
4.0 Data Format
- Data file format (e.g., column-delimited ASCII, NetCDF, GIF, JPEG), file structure, and file naming conventions
- Data format and layout (i.e., description of header/data records, sample records)
- List of parameters with units, sampling intervals, frequency, range
- Description of the flags and codes used in the data, with their definitions (e.g., good, questionable, missing, estimated)
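As an illustration of the flag description requested above, the following minimal Python sketch renders a set of flag codes and their definitions as a plain-text table that could be pasted into the Readme. The single-letter codes and the names FLAG_DEFINITIONS and format_flag_table are hypothetical and chosen only for this example; substitute the codes actually used in the data set.

```python
# Hypothetical flag codes for illustration only; replace them with the
# codes that actually appear in the data files.
FLAG_DEFINITIONS = {
    "G": "good",
    "Q": "questionable",
    "M": "missing",
    "E": "estimated",
}


def format_flag_table(flags):
    """Render flag codes and definitions as aligned plain-text rows."""
    # Make the first column wide enough for the header and the longest code.
    width = max(len("Flag"), max(len(code) for code in flags))
    rows = [f"{'Flag':<{width}}  Definition"]
    for code, meaning in flags.items():
        rows.append(f"{code:<{width}}  {meaning}")
    return "\n".join(rows)


if __name__ == "__main__":
    print(format_flag_table(FLAG_DEFINITIONS))
```

Keeping the definitions in one mapping makes it easier to keep the Readme table and any processing scripts in agreement.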
5.0 Data Remarks
- PI's assessment of the data (e.g., disclaimers, instrument problems, quality issues)
- Missing data periods, if applicable
- Software compatibility (i.e., list of existing software to view/manipulate the data plus software repository locations/links and responsible parties' contact information)
6.0 References
- List of publications and documents (e.g., conference proceedings, publications, theses, reports, etc.) cited in this data set description and/or using this data set. Provide links, if available.
7.0 Appendix
- Suggest GCMD science keywords to describe the data set. The GCMD Science Keyword Viewer may be a helpful tool.
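Before submitting, it may help to confirm that a Readme contains every heading in the outline above. The following is a minimal Python sketch of such a check, assuming the Readme is plain text with each heading at the start of its own line; the function name missing_sections and the sample text are hypothetical and not part of any required tooling.

```python
# Headings taken from the outline above.
REQUIRED_SECTIONS = [
    "Title:",
    "Author(s)",
    "1.0 Data Set Description",
    "2.0 Instrument Description",
    "3.0 Data Collection and Processing",
    "4.0 Data Format",
    "5.0 Data Remarks",
    "6.0 References",
    "7.0 Appendix",
]


def missing_sections(readme_text):
    """Return the required headings that do not start any line of the Readme."""
    # Compare case-insensitively and ignore leading/trailing whitespace.
    lines = [line.strip().lower() for line in readme_text.splitlines()]
    missing = []
    for heading in REQUIRED_SECTIONS:
        if not any(line.startswith(heading.lower()) for line in lines):
            missing.append(heading)
    return missing


if __name__ == "__main__":
    # Hypothetical, deliberately incomplete Readme text.
    sample = (
        "Title: Example data set\n"
        "Author(s)\n"
        "1.0 Data Set Description\n"
        "4.0 Data Format\n"
    )
    for heading in missing_sections(sample):
        print("Missing section:", heading)
```

A check like this only confirms that the headings are present; it does not assess whether the content under each heading is complete.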