Since 2006 the Wishart lab at the University of Alberta has been developing and releasing freely available metabolomic databases, programs and web servers for the metabolomics community. These were developed to address persistent problems in compound identification, compound annotation, data reduction, data analysis, biological interpretation and knowledge translation. This page provides hyperlinks, synoptic descriptions and PubMed abstract links to all of the metabolomic databases and software developed by the Wishart lab and its collaborators. From 2006 to 2010 the management and development of these resources were carried out under the auspices of the Bioinformatics Help Desk (a Genome Canada funded core facility) and the Human Metabolome Project; they are now supported by TMIC and CIHR. Please feel free to use these tools. We welcome your feedback.



Databases

The Human Metabolome Database (HMDB) is a freely available database containing detailed information about small molecule metabolites found in the human body. It is intended to be used for applications in metabolomics, clinical chemistry, biomarker discovery and general education. The database is designed to contain or link three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data. HMDB contains over 7900 metabolite entries, including both water-soluble and lipid-soluble metabolites as well as metabolites that would be regarded as either abundant (> 1 µM) or relatively rare (< 1 nM). Additionally, approximately 7200 protein (and DNA) sequences are linked to these metabolite entries.
Pubmed: 18953024

The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains nearly 4,800 drug entries, including >1,350 FDA-approved small molecule drugs, 123 FDA-approved biotech (protein/peptide) drugs, 71 nutraceuticals and >3,243 experimental drugs. Additionally, more than 2,500 non-redundant protein (i.e. drug target) sequences are linked to these FDA-approved drug entries. Each DrugCard entry contains more than 100 data fields, with half of the information devoted to drug/chemical data and the other half devoted to drug target or protein data.
Pubmed: 21059682

FooDB is a database on food constituents, chemistry and biology that has been under development since 2009. It currently has data on 28,500 food compounds and food associations. It is being jointly developed with Dr. Augustin Scalbert (IARC, Lyon). When completed in late 2012 or early 2013 it will be the most comprehensive resource on food composition in the world. It will provide information on both macronutrients and micronutrients, including many of the constituents that give foods their flavor, color, taste, texture and aroma. The link provided here gives some sample pages from the database.

The Toxin and Toxin Target Database (T3DB) is a unique bioinformatics resource that combines detailed toxin data with comprehensive toxin target information. The database currently houses over 2,900 toxins described by over 34,200 synonyms, including pollutants, pesticides, drugs, and food toxins, which are linked to over 1,300 corresponding toxin target records. Altogether there are over 33,800 toxin-target associations.
Pubmed: 19897546

The Small Molecule Pathway Database (SMPDB) is an interactive, visual database containing nearly 450 small molecule pathways found in humans. These include standard metabolic pathways (90), disease pathways (116), drug pathways (223) and metabolic signaling pathways (13). More than 70% of the pathways in SMPDB are found in no other pathway database (not even KEGG or HumanCyc). SMPDB is designed specifically to support pathway elucidation and pathway discovery in metabolomics, transcriptomics, proteomics and systems biology.
Pubmed: 19948758

The CSF Metabolome database is a freely available electronic database containing detailed information about 468 small molecule metabolites found in human CSF along with 1650 concentration values. The data tables may be sorted and searched by concentration values and ranges. The information includes literature and experimentally derived chemical data, clinical data and molecular/biochemistry data.
Pubmed: 22546835, 18502700

The Urine Metabolome database is a freely available electronic database containing detailed information about ~1400 small molecule metabolites found in human urine along with ~4500 concentration values. The data tables may be sorted and searched by concentration values and ranges. The information includes literature and experimentally derived chemical data, clinical data and molecular/biochemistry data.

The Serum Metabolome database is a freely available electronic database containing detailed information about 4651 small molecule metabolites found in human serum along with 10895 concentration values. The data tables may be sorted and searched by concentration values and ranges. The information includes literature and experimentally derived chemical data, clinical data and molecular/biochemistry data.
Pubmed: 21359215

The CyberCell Database (CCDB) is a comprehensive, web-accessible database designed to support and coordinate international efforts in modeling an Escherichia coli cell on a computer. The CCDB brings together both observed and derived quantitative data from numerous independent sources covering many aspects of the genomic, proteomic and metabolomic character of E. coli (strain K12).
Pubmed: 14681416

The Yeast Metabolome Database (YMDB) is a manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as Baker’s yeast and Brewer’s yeast). This database covers metabolites described in textbooks, scientific journals, metabolic reconstructions and other electronic databases. YMDB contains metabolites arising from normal S. cerevisiae metabolism under defined laboratory conditions as well as metabolites generated by S. cerevisiae when used in baking and in the production of wines, beers and spirits. YMDB currently contains 2010 small molecules with 857 associated enzymes and 138 associated transporters.
Pubmed: 22064855

The Bovine Metabolome Database (BMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in beef and dairy cattle. The information includes literature and experimentally derived information on bovine meat, bovine serum, bovine milk, bovine urine and bovine ruminal fluid.

The E. coli Metabolome Database (ECMDB) is a freely available electronic database containing detailed information about the >1620 metabolites found in E. coli (strain K12, MG1655). The information includes literature and experimentally derived chemical data, spectral data and molecular/biochemistry data.

MarkerDB will be a freely available resource that attempts to consolidate information on all known clinical biomarkers into a single source. Multiple types of markers are covered, including metabolite-based, genetic-based, protein-based and cell-based markers.

Software Tools

MetaboAnalyst is a comprehensive, Web-based tool designed for processing, analyzing, and interpreting metabolomic data. It handles most of the common metabolomic data types including compound concentration lists, spectral bin lists, peak lists, and raw MS spectra.
Pubmed: 22553367, 21637195, 21633943

MetATT is an easy-to-use, web-based tool designed for time-series and two-factor metabolomics data analysis. MetATT offers a number of complementary approaches including 3D interactive principal component analysis, two-way heatmap visualization, two-way ANOVA, ANOVA-simultaneous component analysis and multivariate empirical Bayes time-series analysis.
Pubmed: 21712247

MetPA (Metabolomics Pathway Analysis) is a free and easy-to-use web application designed to perform pathway analysis and visualization of quantitative metabolomic data.
Pubmed: 20628077

MSEA is a web-based tool to help identify and interpret patterns of metabolite concentration changes in a biologically meaningful context for human and mammalian metabolomic studies.
Pubmed: 20457745

PolySearch supports >50 different classes of queries against nearly a dozen different types of text, scientific abstract or bioinformatic databases. The typical query supported by PolySearch is 'Given X, find all Y's' where X or Y can be diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs and metabolites.
Pubmed: 18487273
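
The 'Given X, find all Y's' query pattern can be illustrated with a small co-occurrence count. The sketch below is not PolySearch's actual method, which is not described on this page; it assumes a simple case-insensitive substring match, and the function name, the candidate list and the toy sentences are all invented for illustration.

```python
from collections import Counter

def find_associations(query, candidates, documents):
    """Rank candidate terms (the Y's) by how many documents mention
    them together with the query term (the X).

    Hypothetical helper: plain substring matching, no synonym
    handling and no relevance weighting.
    """
    q = query.lower()
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        if q not in text:
            continue  # document does not mention X at all
        for term in candidates:
            if term.lower() in text:
                counts[term] += 1
    # most_common() sorts by descending co-occurrence count
    return counts.most_common()

# Invented toy sentences, not real abstracts.
documents = [
    "Elevated glucose is a hallmark of diabetes.",
    "Diabetes patients showed altered glucose and lactate levels.",
    "Lactate accumulates during exercise.",
    "Citrate was measured in kidney stones.",
]
candidates = ["glucose", "lactate", "citrate"]

print(find_associations("diabetes", candidates, documents))
```

Here X is a disease and the Y's are metabolites; candidates that never co-occur with X are simply absent from the result.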

BioSpider is a robust tool designed to scan the web for chemical and/or biological information. BioSpider brings together data from a large variety of databases, uses its own set of predictive programs, and integrates chemical (metabolite, ligand, cofactor and drug) data into its biological (sequence, function, pathway etc) reporting.
Pubmed: 17990488

Receiver Operating Characteristic (ROC) curves are generally considered the method of choice for evaluating the performance of potential biomarkers. ROCCET is a freely available web-based tool designed to assist clinicians and bench biologists in performing common ROC-based analyses on their metabolomic data using both classical univariate and more recently developed multivariate approaches.
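
For readers unfamiliar with ROC analysis, a classical univariate ROC curve can be sketched in a few lines. This is a generic illustration, not ROCCET's implementation; the function names and the toy concentration values and labels are made up for the example.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points obtained by
    sweeping a decision threshold from the highest score to the lowest.
    labels: 1 = case, 0 = control. Higher score is taken to suggest 'case'."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical biomarker values for 4 cases and 4 controls.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

print(auc(roc_points(scores, labels)))  # 0.875 for this toy data
```

An area of 0.5 corresponds to a marker that performs no better than chance, and 1.0 to a marker that separates cases from controls perfectly.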

MetaboMiner is a program which can be used to automatically or semi-automatically identify metabolites in complex biofluids from 2D NMR spectra. MetaboMiner is able to handle both 1H-1H total correlation spectroscopy (TOCSY) and 1H-13C heteronuclear single quantum correlation (HSQC) data. It identifies compounds by comparing 2D spectral patterns in the NMR spectrum of the biofluid mixture with specially constructed libraries containing reference spectra of approximately 500 pure compounds. Tests using a variety of synthetic and real spectra of compound mixtures showed that MetaboMiner is able to identify >80% of detectable metabolites from good quality NMR spectra.
Pubmed: 19040747
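
The idea of comparing 2D spectral patterns against a reference library can be sketched as a peak-matching routine. This is only an illustration under stated assumptions, not MetaboMiner's algorithm: the matching tolerances, the 0.75 acceptance cutoff, the compound names and all peak positions below are hypothetical.

```python
def match_fraction(reference_peaks, observed_peaks, tol=(0.03, 0.03)):
    """Fraction of a compound's reference peaks that have an observed peak
    within the given tolerance (in ppm) along both spectral dimensions."""
    matched = 0
    for rx, ry in reference_peaks:
        if any(abs(rx - ox) <= tol[0] and abs(ry - oy) <= tol[1]
               for ox, oy in observed_peaks):
            matched += 1
    return matched / len(reference_peaks)

def identify(library, observed_peaks, min_fraction=0.75, tol=(0.03, 0.03)):
    """Return the compounds whose reference pattern is sufficiently
    present in the mixture spectrum, with their matched fractions."""
    hits = {}
    for name, peaks in library.items():
        fraction = match_fraction(peaks, observed_peaks, tol)
        if fraction >= min_fraction:
            hits[name] = fraction
    return hits

# Hypothetical reference library: compound -> list of (ppm, ppm) peaks.
library = {
    "compound_A": [(1.00, 3.50), (1.00, 1.00), (3.50, 3.50)],
    "compound_B": [(2.20, 2.20), (2.20, 4.10), (4.10, 4.10)],
}
# Hypothetical peak list picked from a mixture spectrum.
observed = [(1.01, 3.49), (0.99, 1.02), (3.51, 3.50), (2.21, 2.19), (7.30, 7.30)]

print(identify(library, observed))  # only compound_A passes the cutoff
```

A semi-automatic workflow would let the user review compounds that fall just below the cutoff instead of discarding them outright.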