What is metadata and why do we need it?
Metadata can be defined as data about data. A metadata record describes a dataset, providing enough information that someone unfamiliar with it can discover and reuse it. To create good, accurate metadata records, it is essential to record descriptive metadata throughout the research process. For describing polar datasets, the UK Polar Data Centre (UK PDC) requires both discovery metadata (which allows users to find the data) and contextual or full metadata (which provides the detailed information needed for data reuse).
To collect the necessary metadata about our datasets, the UK PDC has designed and maintains the Discovery Metadata System, a web-based, searchable metadata catalogue. PDC metadata entries are periodically sent to the Antarctic Master Directory, an internationally accessible system of Antarctic metadata listings. Our metadata are also harvested by the NERC Data Catalogue Service, which lists data holdings and information products held at other NERC Environmental Data Centres.
What, where, when…
Some of the key metadata fields and their descriptions are as follows:
- Title: a concise and simple header describing the data, rather than the project/activity that produced them
- Abstract: a short description summarising the dataset that allows the user to determine the scope and relevance of the resource
- Contacts: the roles and responsibilities of those associated with the dataset, such as creator, contributor, funder, distributor
- Temporal coverage: a date range for the start and end of data collection
- Spatial coverage: encompassing latitude, longitude, altitude and depth
- Format(s): data formats for dataset delivery – these should be interoperable/non-proprietary formats
- Lineage: a description of how the data were collected and the stages they went through before deposit, including fieldwork methods, instrumentation and data processing
- Quality: quality control or assessment applied to the data, plus limitations on reliability or accuracy
- Access constraints: any restrictions on accessing the data, e.g. whether the data are under embargo and when they will be made freely available
- Use constraints: any restrictions on the use of the dataset once accessed – for NERC-funded data, the Open Government Licence is used
- Supporting documentation: contextual metadata, such as associated papers, articles or reports
- Keywords and topic categories: search terms for finding data, including predefined/controlled vocabularies
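
As an illustration only, the fields listed above could be represented as a simple record with a completeness check. This is a minimal Python sketch, not the UK PDC's actual schema or the Discovery Metadata System's data model: the class names, attribute names, units and the `missing_fields` helper are all assumptions made for the example, and the sample values are placeholders.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class SpatialCoverage:
    # Hypothetical layout: bounding box in decimal degrees,
    # optional vertical extent in metres (units are an assumption).
    south: float
    north: float
    west: float
    east: float
    max_altitude_m: Optional[float] = None
    max_depth_m: Optional[float] = None


@dataclass
class MetadataRecord:
    # One attribute per field in the list above; names are illustrative.
    title: str = ""
    abstract: str = ""
    contacts: Dict[str, str] = field(default_factory=dict)  # role -> name
    temporal_coverage: Optional[Tuple[date, date]] = None   # (start, end)
    spatial_coverage: Optional[SpatialCoverage] = None
    formats: List[str] = field(default_factory=list)
    lineage: str = ""
    quality: str = ""
    access_constraints: str = ""
    use_constraints: str = ""
    supporting_documentation: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)


def missing_fields(record: MetadataRecord) -> List[str]:
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Placeholder values for demonstration; not a real dataset.
record = MetadataRecord(
    title="Example sea ice thickness measurements",
    abstract="Hypothetical record used to demonstrate the structure.",
    contacts={"creator": "A. Researcher"},
    temporal_coverage=(date(2020, 1, 1), date(2020, 3, 31)),
    spatial_coverage=SpatialCoverage(south=-75.0, north=-70.0,
                                     west=-30.0, east=-20.0),
    formats=["CSV"],
)

print(missing_fields(record))
```

A check like this can be run at intervals during a project to show which descriptive fields still need to be recorded before deposit.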