Resources for Cloud Data Optimization¶
Data Formats¶
Cross-Format Evaluations¶
Task 51 - Cloud-Optimized Format Study: A NASA cross-format evaluation of the performance characteristics of several formats in cloud environments.
Cloud-Optimized GeoTIFFs (COGs)¶
Landsat Cloud Optimized GeoTIFF Data Format Control Book: Details on format selections and packaging for Landsat Collection 2.
COG Talk Series: A blog series by Vincent Sarago (Development Seed), starting with "COG Talk — Part 1: What’s new?".
HDF5¶
Highly Scalable Data Service (HSDS): A REST-based web service for HDF5 data stores, built by The HDF Group, with several optimizations for data access in network environments.
H5Coro: The Cloud-Optimized Read-Only Library
ICESat-2 SlideRule (Science data processing as a service) H5Coro Documentation: The most detailed explanation of H5Coro
Storage Guidance¶
Performance Guidelines for Amazon S3: Amazon's recommendations for optimizing performance on S3.
Benchmarking¶
Pangeo benchmarking: Benchmarking and scaling studies of the Pangeo Platform.
Chunking¶
Making earth science data more accessible: experience with chunking and compression: A presentation by XX of Unidata that explains what chunking is, why chunking is important for big data access, and how to choose the right chunk shape.
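To illustrate why chunk shape matters for access, here is a minimal sketch that counts how many chunks a read has to touch under different chunk layouts. It is not taken from the presentation above: the array dimensions, the chunk shapes, and the helper name `chunks_touched` are all hypothetical and chosen only for illustration.

```python
from math import prod


def chunks_touched(selection, chunk_shape):
    """Count the chunks overlapped by a rectangular read.

    selection: one (start, stop) index pair per dimension, stop exclusive.
    chunk_shape: chunk length along each dimension.
    """
    counts = []
    for (start, stop), size in zip(selection, chunk_shape):
        first = start // size        # index of the first chunk touched
        last = (stop - 1) // size    # index of the last chunk touched
        counts.append(last - first + 1)
    return prod(counts)


# Hypothetical (time, lat, lon) array of shape (365, 720, 1440).
time_series = [(0, 365), (100, 101), (200, 201)]  # all times at one grid cell
spatial_map = [(0, 1), (0, 720), (0, 1440)]       # one full map at one time

# Hypothetical chunk layouts to compare.
layouts = {
    "one map per chunk": (1, 720, 1440),
    "long time runs": (365, 10, 10),
    "balanced": (73, 90, 180),
}

for name, shape in layouts.items():
    ts = chunks_touched(time_series, shape)
    sm = chunks_touched(spatial_map, shape)
    print(f"{name}: time series reads {ts} chunks, map reads {sm} chunks")
```

In this sketch, a layout tuned for one access pattern is very costly for the other, while the balanced layout keeps both reads to a moderate number of chunks. In cloud object storage each chunk read typically becomes at least one network request, so the chunk count is a rough proxy for request overhead.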
Categorical Data Standards¶
Geospatial data communities develop data standards to facilitate the adoption and sharing of data and code across users and platforms. Cloud data providers may see greater use of their data if it adheres to these standards. Below are some examples of data standards, each scoped to a particular category of geospatial data.
Standard for the Exchange of Earthquake Data (SEED)
Committee on Earth Observing Satellites (CEOS) Analysis Ready Data (ARD) for Land (CARD4L)