Human Multi-Omic Data

Highlighted Datasets with Summary-Level Data

FaceBase
Several human developmental and phenotype-associated multi-omic datasets are available via FaceBase, where the full list can be browsed.

Steps for Access: Summary-level human datasets are downloadable from the FaceBase website. Individual-level human data can be requested under FaceBase's controlled-access policies.

Cost to Access: None

Other NIDCR Funded Resources

Sjögren’s International Collaborative Clinical Alliance (SICCA)

Proteomic Data Commons (PDC)

Human Salivary Proteome
Collaborative, community-based web portal of human saliva proteins identified by high-throughput proteomic technologies.

Human Oral Microbiome Database (eHOMD)
eHOMD provides comprehensive, curated information on bacteria in the human mouth and aerodigestive tract, including the pharynx, nasal passages, sinuses, and esophagus.

Candida Genome Database (CGD)
The CGD is a resource for genomic sequence data, as well as gene and protein information for Candida albicans and related species.

NIH-Wide and Other Data Sources with -Omics Data

Human Tumor Atlas Network (HTAN)
The Human Tumor Atlas Network is a Cancer Moonshot initiative funded by the National Cancer Institute to construct three-dimensional atlases of the dynamic cellular, morphological, and molecular features of human cancers as they evolve from precancerous lesions to advanced disease.

dbGaP (297 NIDCR-funded phenotype datasets)
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute data and results from studies that have investigated the interaction of genotype and phenotype in humans.

Gene Expression Omnibus (GEO)
GEO is a public functional genomics data repository supporting MIAME-compliant data submissions. Array- and sequence-based data are accepted. Tools are provided to help users query and download experiments and curated gene expression profiles.
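As a minimal sketch of what querying GEO programmatically might look like, the snippet below builds a search URL for NCBI's E-utilities ESearch service using the `gds` (GEO DataSets) database. E-utilities is not described on this page and is only one possible route; the function name `build_geo_search_url` and the example search term are hypothetical, chosen for illustration. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint; "gds" is the Entrez name for GEO DataSets.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term: str, max_results: int = 20) -> str:
    """Return an ESearch URL for GEO records matching `term` (hypothetical helper)."""
    if not term.strip():
        raise ValueError("term must not be empty")
    params = {
        "db": "gds",            # search GEO DataSets
        "term": term,           # Entrez query string
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative search term only; adjust to the study of interest.
    print(build_geo_search_url("saliva AND Homo sapiens[Organism]"))
```

Opening the resulting URL in a browser or HTTP client should return matching record identifiers, which can then be looked up on the GEO website.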

The Cancer Genome Atlas (TCGA)
The Cancer Genome Atlas, a landmark cancer genomics program, molecularly characterized over 20,000 primary cancer and matched normal samples spanning 33 cancer types. This joint effort between the National Cancer Institute and the National Human Genome Research Institute began in 2006, bringing together researchers from diverse disciplines and multiple institutions.

Pharos
Pharos is the user interface to the Knowledge Management Center (KMC) for the Illuminating the Druggable Genome program funded by the National Institutes of Health Common Fund. The goal of the KMC is to develop a comprehensive, integrated knowledge-base for the Druggable Genome (DG) to illuminate the uncharacterized and/or poorly annotated portion of the DG, focusing on three of the most commonly drug-targeted protein families: G-protein-coupled receptors (GPCRs), ion channels (ICs), and kinases.

cBioPortal
The cBioPortal for Cancer Genomics was originally developed at Memorial Sloan Kettering Cancer Center (MSK). The public cBioPortal site is hosted by the Center for Molecular Oncology at MSK. The cBioPortal software is available under an open source license via GitHub and is developed and maintained by a multi-institutional team consisting of MSK, the Dana-Farber Cancer Institute, Princess Margaret Cancer Centre in Toronto, Children's Hospital of Philadelphia, Caris Life Sciences, The Hyve and SE4BIO in the Netherlands, and Bilkent University in Ankara, Turkey.

European Genome-Phenome Archive (EGA)
EGA is a service for permanent archiving and sharing of personally identifiable genetic, phenotypic, and clinical data generated for the purposes of biomedical research projects or in the context of research-focused healthcare systems.

Human Metabolome Database (HMDB)
HMDB is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. It is intended to be used for applications in metabolomics, clinical chemistry, biomarker discovery, and general education. The database is designed to contain or link three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data.

SalivaDB—A comprehensive database for salivary biomarkers in humans
SalivaTecDB is a database dedicated to the annotation and characterization of proteins, miRNAs, and microorganisms present in the oral cavity, associated with oral or systemic mechanisms. This database has been developed by the SalivaTec team as a tool to be used by researchers working in salivary diagnostics. SalivaTec's main goal is to propose biomarker panels suitable for saliva, a non-invasive diagnostic fluid.

ProteomicsDB
ProteomicsDB is a multi-omics and multi-organism resource for life science research. It covers various types of data, including proteomics, transcriptomics, and phenomics data for organisms such as humans, mice, Arabidopsis, and rice. Different visualizations are available, allowing for protein- and drug-centric interrogation, as well as combined analysis through its analytics section.

Last Reviewed
April 2025
