
Data Access and Policy

Unless otherwise indicated, data generated by C-CoMP laboratories will be stored in repositories that assign DOIs, ensuring persistent access to our products. All of the repositories listed below provide immediate access to the broader community.

Metabolite data will be deposited in either MetaboLights or the Metabolomics Workbench. Field data from the Bermuda Atlantic Time-series Study (BATS) are stored on the BATS FTP site and in the NSF-supported Biological & Chemical Oceanography Data Management Office (BCO-DMO) repository. Processed protein data will be stored at BCO-DMO and accessed through the Ocean Protein Portal. Raw proteomics mass spectral data will be submitted to ProteomeXchange via MassIVE or PRIDE. Sequence data will be stored in the NCBI Sequence Read Archive (SRA), iMicrobe, and/or iVirus. Raw model data will initially be stored on local servers.

For CESM results, we will adhere to the CESM data policy, which includes data release within one year of generation. In general, C-CoMP will meet the NSF guideline of releasing data within two years of generation, and will release any hardened data products sooner where possible.

All software development in C-CoMP will follow open-source development practices, and the software will be licensed under the GNU General Public License. The source code of stable releases, as well as the development branches, will be accessible to the community through GitHub repositories. Stable releases will also be available as Conda packages through the Anaconda package repository and as Docker containers through Docker Hub, for platform-independent, easy-to-install, or easy-to-run scenarios.

Strategies for C-CoMP data integration, access, and serving. Figure credit: C-CoMP Team.
| Data type | Repository/website |
| --- | --- |
| Raw metabolite data | MetaboLights and/or Metabolomics Workbench |
| Field data | BATS, BCO-DMO |
| Processed protein data | BCO-DMO via Ocean Protein Portal |
| Raw proteomics data | ProteomeXchange |
| Genomic sequence data (16S rRNA, whole genome sequencing, shotgun metagenomics) | NCBI SRA, iMicrobe, iVirus |
| Open source software | GitHub, Anaconda, Docker |
| CESM results | TBD |