4 Make Your Data Interoperable
[whole section preamble about “SEMANTIC RESOURCES” (rather than controlled vocabularies, as it was)… Possibly want to connect to framework of understanding, incl. controlled vocabulary, taxonomy, close associations, and ontology = taxonomy + relationships between terms]
Possible content/phrases to incorporate into whole section preamble: Ontology describes a set of concepts or categories in a domain and shows their properties and the relations between them.
Semantic resources (i.e. ______), improve consistency across diverse sources of data and are essential to interoperability. Controlled vocabularies, taxonomic authorities, and habitat classifications are all useful semantic resources for standardizing biological data.
Controlled vocabularies provide standardized terms and definitions for biological concepts
They facilitate searching for data in web portals. They also enable records to be interpreted by computers. This opens up data sets to a whole world of possibilities for automated data workflows, computer aided manipulation, distribution, interoperability, and long-term reuse.
By connecting terms held in controlled lists using standards, the data described by these controlled vocabularies become more interoperable and hence more broadly reusable.
enable records to be interpreted by computers, facilitating automated data workflows, computer aided manipulation, distribution, and interoperability
It becomes possible to build a truly distributed and interoperable data ecosystem across domain boundaries, enabling data reuse no matter the purpose for which they were collected in the first place.
concise, controlled description
Standardized classifications of observational data also allows the information derived from individual projects to be compared with each other and compiled into larger and more extensive representations across space and time. This advances the collective scientific knowledge of natural systems and enables coordinated management activities, such as regional assessments, monitoring and restoration between partners.
Combined with international ontologies like ENVO, which is used by UNESCO, for example, the reach of your data will become global.
4.1 Controlled Vocabularies
Controlled vocabularies are pre-determined standardized terms and definitions used to describe a specific entity, collection, parameter, or unit of measurement in either metadata or data. While they facilitate computer-readability, they also reduce ambiguity around terms. Controlled vocabularies are designed to fit specific schema and are routinely updated by the communities that use them.
4.1.1 Environmental Ontology (ENVO) [emoji]
What is it?
The Environmental Ontology (ENVO) is an ontology (i.e. it describes a set of concepts or categories, their properties, and the relations between them) of all things environmental (e.g. systems, components, and processes). ENVO aids humans, machines, and semantic web applications in understanding environmental entities of all kinds, from microscopic to intergalactic scales, increasing the interoperability of environmental descriptions.
Why?
ENVO provides a comprehensive standardized vocabulary for describing habitats, ecosystems, and environmental processes.
It incorporates the relationships between objects (Ontology).
It is used internationally.
Several portals and ontology browsing interfaces already harvest from ENVO.
Top Resources
Buttigieg et al. (2013) article introducing ENVO in Journal of Biomedical Semantics
Buttigieg et al. (2016) article revisiting ENVO
Instructions on how to Browse ENVO terms
[Do we want to include this fun fact?
ENVO terms can also be used within the MIxS metadata standard (see here), for example to describe the materials that compose your sample or the environment where the sample was collected.
Could be included in either and link to the other]
4.1.2 Natural Environmental Research Council (NERC) Vocabulary Server (NVS) [emoji]
What is it?
The Natural Environment Research Council (NERC)-funded Vocabulary Server (NVS) provides access to standardized and hierarchically-organized vocabularies, primarily in oceanographic and associated domains. NVS is managed by the British Oceanographic Data Centre at the National Oceanography Centre (NOC).
Why?
It is used by the marine science community in the UK (MEDIN), Europe (SeaDataNet), and globally, by a variety of organisations and networks.
By connecting terms held in controlled lists using standards, the data described by these controlled vocabularies become more interoperable and hence more broadly reusable.
It becomes possible to build a truly distributed and interoperable data ecosystem across domain boundaries, enabling data reuse no matter the purpose for which they were collected in the first place.
Top Resources
The NVS Search can be used to find controlled vocabulary terms
The NVS Vocab Search can search the entire NVS content
SeaDataNet Search searches only collections used by the SeaDataNet data infrastructure
4.1.3 Global Change Master Directory (GCMD) Keywords [emoji.. gonna suggest a globe]
What is it?
The Global Change Master Directory (GCMD) Keywords are a standardized set of terms used to describe Earth science data sets and services. They serve as a common language for categorizing and searching for data related to Earth science, environmental science, and global change research
Why?
GCMD Keywords provide a standardized vocabulary for describing Earth science data, ensuring consistency and interoperability across different data sets and repositories.
The GCMD keywords describe Earth science data and services consistently and comprehensively in a hierarchical format and follow a codified governance process.
The power of the keywords is in their ability to enable scientists to tag their data using a taxonomy of controlled scientific categories. This, in turn, allows those searching for data to discover datasets easily through the use of an established hierarchy.
Top Resources
- More information can be found here:Global Change Master Directory (GCMD) Keywords | Earthdata (nasa.gov)
4.3 Habitat Classification
Habitat classification is the process of organizing quantitative observations (i.e., data collected by various methods and instruments) about the natural world into meaningful, human-understandable representations and descriptions. Habitat classification standards or systems provide terminology and methodology or guidance so that classification can be performed in a standard manner by individual projects and programs.
4.3.1 Coastal and Marine Ecological Classification Standard (CMECS) [emoji]
What is it?
The Coastal and Marine Ecological Classification Standard (CMECS) is a structured dictionary of terms that describe benthic habitats in marine, estuarine, and lacustrine settings. It was developed by a consortium of scientists and coastal managers to meet the needs of inventorying, monitoring and managing natural resources in US and territorial waters. CMECS terms are organized in a spatially-scaled framework for annotating geospatial data and integrating information about the physical components of aquatic habitats that enables three-dimensional representations of ecological conditions for associating with faunal behavior.
Why?
CMECS is endorsed as a Federal Geographic Data Committee (FGDC) standard in the U.S. for aquatic environmental data so that data collectors and analysts can describe data using standard terminology and organization structures.
Using CMECS enables data discovery, data use and re-use, and broader analytical applications of data federally-funded data assets.
Top Resources
- Documentation of the CMECS standard, how to use it (including examples), and how to contribute to its improvement
4.3.2 U.S. National Vegetation Classification (NVCS) [emoji]
What is it?
The U.S National Vegetation Classification (USNVC) is the comprehensive, standardized, and hierarchical classification system for all vegetation types in the United States. Because several agencies, each with its own sampling protocols, are tasked with mapping and describing vegetation in the United States, the resultant inventories are not automatically interoperable, making vegetation resource monitoring across jurisdictional boundaries challenging. The USNVC, a collaboration between the Ecological Society of America (ESA), NatureServe, and various federal agencies, was created to address this need. It provides a common language that allows for communication and cooperation on vegetation management issues across jurisdictional boundaries for the effective management and conservation of plant communities.
Vegetation modeling and mapping are relevant to many conservation efforts, including land inventories, wildlife habitat inventories, enhancing natural resource conservation efforts, fire management, invasive species management, and setting national vegetation policies (e.g. biofuels, carbon markets, and ecosystem services).
Why?
As a dynamic standard, the USNVC is designed to be easily adapted as new ecological knowledge becomes available.
Its hierarchical nature makes classification scalable for diverse applications from vegetation monitoring to broad-scale analyses of trends across North America.
The USNVC is governed by standards for vegetation data collection and analysis, ensuring consistent reporting on the nation’s vegetation resources.
Top Resources
Overview of the USNVC Database
ESA’s collection of USNVC resources, including fact sheets, presentations, webinars, and posters.
4.3.3 National Wetlands Classification System (?) (NWCS) [emoji]
What is it?
The primary objective of the Classification of Wetlands and Deepwater Habitats of the United States, as originally drafted by Cowardin et al. (1979:3), was “to impose boundaries on natural ecosystems for the purposes of inventory, evaluation, and management.” The FGDC Wetlands Classification Standard (WCS) provides minimum requirements and guidelines for classification of both wetlands and deepwater habitats that are consistent with the FGDC Wetlands Mapping Standard (FGDC-STD-015-2009).
Why?
NWCS was developed to support a detailed inventory and periodic monitoring of the Nation’s wet habitats using remote sensing.
It has been an official National Standard since 1996 (FGDC-STD-004), and has been the de facto standard for mapping U.S. wetlands and deepwater habitats since 1976
The NVC and Wetlands standard is endorsed as a Federal Geographic Data Committee (FGDC) standard in the U.S. for aquatic environmental data so that data collectors and analysts can describe data using standard terminology and organization structures.
Top Resources
Wetland Classification codes in table and tool formats
The second edition of Classification of Wetlands and Deepwater Habitats of the United States, which outlines the underlying concepts, definitions, systems, and sub-systems.