Open Data: Where to Start?
Looking for a place to contribute data? Check out the repositories marked with (✝).
This page provides an extended list of data repositories across disciplines.
For a concise, one-page handout used at the in-person Data Help Desk, see the curated PDF: Download the one-pager
Developed in collaboration with the ESIP Open Science Cluster
Table of Contents
- Multidisciplinary Repositories
- Federated Portals
- Cryosphere & Polar Data
- Ocean & Marine Data
- Geophysical, Geospatial & Geodetic Infrastructure
- Ecological & Terrestrial Observations
- Biological & Specimen Repositories
- Mineralogy Databases
- Sample and Core Repositories
- Indigenous Knowledge & Knowledge Systems
- Paleoscience Data
Multidisciplinary Repositories
- ✝ DataONE – Cross-repository discovery for Earth & environmental science data.
- ✝ Figshare – Provider of open research repository infrastructure.
- ✝ Mendeley Data – Free and secure cloud-based communal repository.
- ✝ The Open Science Framework (OSF) – Collaboration tool to support researchers throughout their entire project lifecycle.
- ✝ PANGAEA – Digital data library and data publisher for Earth system science.
- re3data – Registry of research data repositories (searchable index).
- ✝ Zenodo / Dryad – Generalist, FAIR-compliant repositories for a range of disciplines.
Federated Portals
- ✝ Ag Data Commons – Centralized data catalog and repository for agricultural research data from the USDA.
- Earthdata – NASA’s portal for discovering, accessing, and visualizing Earth science data.
- NIST Science Data Portal – Publicly available NIST datasets for science, engineering, and technology.
- National Center for Atmospheric Research (NCAR) – Earth system science datasets including meteorological, atmospheric, and oceanographic observations and model outputs.
- ✝ NOAA Centers for Environmental Information (NCEI) – Comprehensive gateway to NOAA’s environmental data access tools and APIs.
- UN World Environment Situation Room – UNEP platform providing data, information, and knowledge on global environmental conditions.
- USGS ScienceBase – Digital repository for scientific data products and resources from USGS programs.
Cryosphere & Polar Data
- Antarctic Meteorological Research and Data Center (AMRDC) – Archived meteorological data and observations from Antarctic research.
- ✝ National Snow and Ice Data Center (NSIDC) – Cryospheric data and related geophysical data.
- UMN Polar Geospatial Center (PGC) – Polar satellite imagery, digital elevation models, and historical maps.
- ✝ USAP-DC – Datasets derived from the NSF-managed U.S. Antarctic Program (USAP).
Ocean & Marine Data
- ✝ BCO-DMO – Data and information from biological, chemical, and biogeochemical marine research.
- ✝ CCHDO – High-quality hydrographic data from repeat hydrography programs (WOCE, GO-SHIP).
- ✝ MGDS – Marine geophysical and bathymetric data and metadata.
Geophysical, Geospatial & Geodetic Infrastructure
- ✝ EarthChem – Geochemical, geochronological and mineralogical data.
- EPA Geospatial Data – Open geospatial datasets for environmental decision-making.
- Geodetic Facility for the Advancement of Geoscience (GAGE) – Geodetic data for measuring Earth’s surface motion and deformation.
- ✝ OpenTopography – High-resolution topographic data and tools.
- ✝ Seismological Facility for the Advancement of Geoscience (SAGE) – Seismological and related data, including ground-motion, atmospheric, and infrasonic records.
- Sedimentary Geochemistry and Paleoenvironments Project (SGP) – Portal for sedimentary geochemical data with sample-level context (lithology, fossils, stratigraphy, geochronology, depositional environment).
Ecological & Terrestrial Observations
- CZNet – Environmental data from the Critical Zone Collaborative Network.
- ✝ Environmental Data Initiative (EDI) – Ecological and environmental data, primarily from the LTER program.
- ✝ ESS-DIVE – Repository for environmental systems science data from DOE’s Office of Science BER program.
- ✝ HydroShare – Collaborative environment for sharing hydrologic data and models.
- ✝ Knowledge Network for Biocomplexity (KNB) – Repository for ecological and environmental data from NCEAS working groups.
- ✝ Water Quality Portal – Water quality data from USGS NWIS and EPA WQX Data Warehouse.
Biological & Specimen Repositories
- ✝ GenBank – The NIH genetic sequence database: an annotated collection of all publicly available DNA sequences.
- ✝ Global Biodiversity Information Facility (GBIF) – Biodiversity occurrence and observational data aggregator.
- iDigBio – Images and metadata for millions of biological specimens.
- ✝ MorphoBank – Repository with information on anatomy, physiology, and behavior of species.
- ✝ Movebank Data Repository – Animal tracking and animal-borne sensor datasets.
- NASA Open Science Data Repository (OSDR) – Data related to responses of terrestrial life to spaceflight.
- ✝ Ocean Biodiversity Information System (OBIS) – Marine biodiversity occurrence records.
Mineralogy Databases
- Mindat.org – Crowdsourced, expert-curated mineralogy database with species, localities, photographs, and references.
- ✝ RRUFF Project – Mineral database combining empirical analyses (spectra, structures, chemistry) with images and an open-access reference library.
Sample and Core Repositories
- International Ocean Discovery Program (IODP) – Sub-seafloor drilling cores, samples, and data.
- NSF Ice Core Facility (NSF-ICF) – Meteoric ice cores.
- OSU Polar Rock Repository (PRR) – Antarctic rock samples and associated metadata.
- Oregon State Marine and Geology Repository – Marine cores, rock samples, and dredged material.
- Seafloor Samples Lab – Archived marine geological samples.
- ✝ System for Earth and Extraterrestrial Sample Registration (SESAR) – Global digital index of samples and specimens.
Indigenous Knowledge & Knowledge Systems
- ELOKA – Exchange for Local Observations and Knowledge of the Arctic.
- ✝ Native Land Digital (NLD) – Digital maps of Indigenous territories, languages, and treaties.
Paleoscience Data
- ✝ Neotoma Paleoecology Database – Fossil, paleoecological, and paleoenvironmental data resource.
- NOAA Paleoclimatology – World’s largest archive of climate and paleoclimatology data.
- Paleobiology Database (PBDB) – Collection-based occurrence and taxonomic data for paleontology.