External Data Resources
GeneWeaver contains publicly available sets of genes annotated to structured vocabularies and ontologies; these are assigned to Tier I, public resource data. Other sets of genes, such as MeSH term-to-gene annotations, are derived by processing public sources and are assigned to Tier II. In the case of MeSH, we use NCBI's gene-to-PubMed and PubMed-to-MeSH files to produce sets of genes annotated through their transitive associations.
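The transitive MeSH association described above can be sketched as a join on PubMed IDs. This is a minimal illustration, not GeneWeaver's actual pipeline: the function name `mesh_gene_sets`, the tuple-based input format, and the sample records are all hypothetical. The two-publication threshold comes from the MeSH row of the table; applying it per gene-term pair is an assumption.

```python
from collections import defaultdict

def mesh_gene_sets(gene2pubmed, pubmed2mesh, min_pubs=2):
    """Build a MeSH term -> gene set mapping through shared publications.

    gene2pubmed: iterable of (gene, pmid) pairs (hypothetical input format)
    pubmed2mesh: iterable of (pmid, mesh_term) pairs (hypothetical input format)
    min_pubs: minimum number of distinct publications supporting a pair
    """
    # Index MeSH terms by the publication they annotate.
    terms_by_pmid = defaultdict(set)
    for pmid, term in pubmed2mesh:
        terms_by_pmid[pmid].add(term)

    # Collect the distinct publications that link each (term, gene) pair.
    support = defaultdict(set)
    for gene, pmid in gene2pubmed:
        for term in terms_by_pmid.get(pmid, ()):
            support[(term, gene)].add(pmid)

    # Keep only pairs supported by enough publications.
    gene_sets = defaultdict(set)
    for (term, gene), pmids in support.items():
        if len(pmids) >= min_pubs:
            gene_sets[term].add(gene)
    return dict(gene_sets)

# Made-up records for illustration only.
gene2pubmed = [("Drd2", 101), ("Drd2", 102), ("Comt", 101)]
pubmed2mesh = [(101, "Dopamine"), (102, "Dopamine"), (102, "Cocaine")]
result = mesh_gene_sets(gene2pubmed, pubmed2mesh)
```

Here only the Drd2/Dopamine pair is supported by two publications, so it is the only association retained; lowering `min_pubs` to 1 would keep every co-occurring pair.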
Tier | Resource | Description | Number of Gene Sets (2012) | Number of Gene Sets (2015) | Number of Gene Sets (2018) |
---|---|---|---|---|---|
1 | Allen Brain Atlas (ABA) | Sets containing upregulated genes found within mouse brain regions and structures. These genes exhibit a >= 2.0 fold change in expression energies compared to all other basic cell groups and brain regions (ABA refers to this area as 'grey' contrast structures). These sets are generated using the ABA API and its differential gene search pipeline. | 785 | 740 | 785 |
1 | Comparative Toxicogenomics Database (CTD) | Sets of genes associated with CTD chemical-gene interactions are obtained via CTD flat files. | 6266 | 6177 | 21630 |
1 | Drug Related Gene Database (DRG) | The Drug Related Gene Database, compiled by the Neuroscience Information Framework (NIF), contains gene expression data related to drug abuse research. | 1208 | 253 | 238 |
1 | Human and Mouse Gene Ontology (GO) | Sets of genes from human and mouse annotated to the Gene Ontology (GO), obtained from the Gene Ontology Consortium and MGI. | 33668 | 33668 | 85573 |
1 | Human Phenotype Ontology Annotations (HP) | Gene sets derived from annotations of genes to HPO. | 6276 | 4011 | 6276 |
1 | Kyoto Encyclopedia of Genes and Genomes (KEGG) | Pathways derived from the KEGG API are directly parsed for identifiers that map to GeneWeaver. Pathway data for humans, mice, rats, and rhesus monkeys is currently included. | 0 | 1172 | 1339 |
1 | Mammalian Phenotype Annotations (MP) | Gene sets derived from annotations of mutant mice to MP terms in MGI, with transitive closure. | 7966 | 7966 | 7931 |
2 | Medical Subject Headings (MeSH) | Genes annotated to MeSH terms were aggregated with gene2publication associations from PubMed. Associations must appear in a minimum of two publications. Genes associated with the closure of each set were obtained. | 0 | 12069 | 12069 |
1 | Molecular Signatures Database (MSigDB) | Sets of genes annotated to disease for use with Gene Set Enrichment Analysis (GSEA), downloaded from MSigDB v.5.0. Only sets derived from the hallmark, C1, C3, C4, C6, and C7 collections are incorporated*. MSigDB gene sets that are curated from other resources (e.g. KEGG or GO) are ignored to eliminate data redundancy. | 0 | 3738 | 3738 |
1 | Mouse QTLs from MGI | Sets of positional candidate genes for the confidence interval around all the QTLs within MGD. | 0 | 5050 | 3405 |
1 | Online Mendelian Inheritance in Man (OMIM) | Gene-disease phenotype data is retrieved from OMIM's Morbid Map and Phenotype Series list. Unconfirmed and spurious mappings are ignored. | 0 | 738 | 738 |
1 | Pathway Commons (PC) | Sets of genes derived from the "top" pathways: those that are neither controlled nor a pathway component of another biological process. KEGG pathways are removed from this data set to prevent duplicate gene sets. | 0 | 1036 | 1149 |
1 | Rat QTLs from RGD | Sets of positional candidate genes for the confidence interval around all the QTLs within the RGD. | 0 | 2048 | 2064 |
1 | Genome Wide Association Studies (GWAS) | Catalog of Published Genome-Wide Association Studies | 0 | 0 | 3389 |
*Information on the MSigDB file types included in GeneWeaver (H, C1, C3, C4, C6 and C7)
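Several rows in the table (MP, MeSH) mention transitive closure: a gene annotated to a specific term is also counted in the gene set of every ancestor term. The sketch below shows one way this propagation could work; it is an assumption about the general technique, not GeneWeaver's implementation. The function name `close_gene_sets`, the child-to-parents dictionary, and the placeholder term and gene identifiers are all hypothetical.

```python
from collections import defaultdict

def close_gene_sets(parents, direct):
    """Propagate each term's genes to all of its ancestor terms.

    parents: dict mapping a term to a list of its parent terms (hypothetical format)
    direct: dict mapping a term to the set of genes annotated directly to it
    """
    closed = defaultdict(set)
    for term, genes in direct.items():
        # Walk upward from the annotated term, visiting each ancestor once.
        stack, seen = [term], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            closed[current] |= genes
            stack.extend(parents.get(current, ()))
    return dict(closed)

# Placeholder hierarchy: T3 is a child of T2, which is a child of T1.
parents = {"T3": ["T2"], "T2": ["T1"]}
direct = {"T3": {"GeneA"}, "T2": {"GeneB"}}
closed = close_gene_sets(parents, direct)
```

After closure, the set for `T1` contains both genes even though neither was annotated to it directly, while `T3` keeps only its own direct annotation.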