Contributing
Please contribute to this site if you have input. We welcome pull requests.
Additionally, the community has noted the following specific needs for input or experimentation:
Identifying commonalities across communities, organizations, and source data formats
Performance analysis on a variety of data organizations, analyses and data structure types
Chunking and compression options in the context of scalable data access to model output
Data on how optimization decisions vary across access clients, such as remote users, Dask clusters, and Spark clusters
Provenance – how to maintain a record of changes as the data gets reformatted and repacked
Maintaining a source link back to the origin
When data is broken into granules, how do you identify them as a single dataset that comes from one origin?
Micro-changes to data – do these qualify as a completely new version?
Provenance chaining as data is subsetted, combined, recombined, and reformatted while passing from hand to hand
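As a starting point for discussion, the provenance questions above can be made concrete with a small sketch. It is one possible approach, not an established convention of this community: each processing step is stored as a record whose identifier is a hash of its own content, including the identifiers of the records it was derived from. The function names (`add_record`, `trace_origins`), the record fields, and the `example.org` URLs are all hypothetical.

```python
import hashlib
import json


def add_record(store, operation, parameters=None, parents=(), origin=None):
    """Store one provenance record and return its content-derived id.

    The id is a SHA-256 hash of the record body. Because the body embeds
    the parents' ids, changing any earlier step changes every id
    downstream of it. (Hypothetical schema, for illustration only.)
    """
    body = {
        "operation": operation,          # e.g. "ingest", "subset", "rechunk"
        "parameters": parameters or {},  # settings used for this step
        "parents": sorted(parents),      # ids of the records this one derives from
        "origin": origin,                # source link, set only on ingest records
    }
    record_id = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()
    store[record_id] = body
    return record_id


def trace_origins(store, record_id):
    """Walk the chain backwards and collect every origin link it reaches."""
    origins = set()
    seen = set()
    stack = [record_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        record = store[current]
        if record["origin"] is not None:
            origins.add(record["origin"])
        stack.extend(record["parents"])
    return origins


# Two hypothetical source files are subsetted, combined, and rechunked.
store = {}
raw_a = add_record(store, "ingest", origin="https://example.org/model-run-a.nc")
raw_b = add_record(store, "ingest", origin="https://example.org/model-run-b.nc")
subset = add_record(store, "subset", {"variable": "temperature"}, parents=[raw_a])
merged = add_record(store, "combine", parents=[subset, raw_b])
rechunked = add_record(store, "rechunk", {"chunks": {"time": 100}}, parents=[merged])

print(sorted(trace_origins(store, rechunked)))
```

In this sketch, granules that share an ancestor record can be grouped back into one dataset by tracing their origins. Any change to a step, however small, produces a new identifier; whether that should count as a new version remains the open policy question listed above.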