Metabolomics is a growing field that detects and quantifies the low-molecular-weight molecules, known as metabolites (constituents of the metabolome), produced by active, living cells under different conditions and at different times in their life cycles. NMR plays an important role in metabolomics because of its ability to observe mixtures of small molecules in living cells or in cell extracts.
Genomics is a science that attempts to describe a living organism in terms of the sequence of its genome (its constituent genetic material). Genomics uses the techniques of molecular biology and bioinformatics to analyze the sequences attributed to structural genes, regulatory sequences, and even noncoding sequences. Genomics is closely related to, and sometimes considered a branch of, Genetics: the study of genes and heredity.
Proteomics focuses on identifying when and where proteins are expressed in a cell so as to establish their physiological roles in an organism.
Structural Genomics is a worldwide effort aimed at determining the three-dimensional structures of gene products in an efficient and high-throughput mode. When the focus is on proteins, this effort may be called Structural Proteomics. Whereas a structural biologist may work to thoroughly understand the structure and function of one, or maybe a few, proteins, structural genomics efforts focus on determining the structures of large numbers of proteins without prior regard to function. Several structural proteomics groups pursue the structures of proteins that are "unique", generally ones that have less than 30% sequence identity to any protein with a known structure in the Protein Data Bank. The objective here is to enlarge our understanding of sequence-fold relationships so that we are better able to predict structures from sequences. Other structural proteomics centers have the goal of determining the structures of all proteins from a given organism, or of all proteins in a particular class or family.
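The 30% sequence identity criterion can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool (with "-" marking gaps) and counts identity only over positions where neither sequence has a gap. Other definitions of the denominator exist, and the function names here are hypothetical, not taken from any structural genomics pipeline.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Assumption: both strings come from the same alignment, so they have
    equal length and use `gap` for insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_unique_target(aligned_target: str, aligned_known: str,
                     threshold: float = 30.0) -> bool:
    """True if the target falls below the identity threshold (30% by default)."""
    return percent_identity(aligned_target, aligned_known) < threshold
```

In practice a target would be compared against every sequence with a known structure, and it would count as "unique" only if it fell below the threshold for all of them.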
The Protein Data Bank (PDB) is the international repository for biomolecular structure data. To find more information about protein targets studied by structural genomics efforts, including the target progress, protocols, structures, annotations, models, and DNA clones, visit the PSI Structural Genomics Knowledgebase.
TargetTrack, originally TargetDB, is a protein target and protocol registration database that was developed and hosted at Rutgers University to register and track information for the NIH P50 funded structural genomics centers. The database has since grown to include data contributed internationally and has merged with the Protein Expression database PepcDB into a single resource. It is funded by the US National Institutes of Health (NIH) through its Protein Structure Initiative (PSI).
Structural genomics efforts are producing a wealth of experimental data from NMR studies that are linked to high-quality three-dimensional structures of proteins. The "rules" of the international structural genomics effort mandate that all data be deposited in a timely fashion. This includes coordinates of three-dimensional structures, deposited at PDB, and, for NMR structures, chemical shifts, coupling constants, and other relevant data, deposited at BMRB. In addition, some of the structural genomics centers are depositing the original NMR spectra as time-domain data sets in BMRB. These data sets will allow structures to be recalculated by others who may wish to practice their skills or test novel methods for structure determination from NMR data. Data from structural genomics centers are valuable to BMRB because they are enlarging the set of NMR parameters and three-dimensional structures determined under comparable conditions.
NMR chemical shift data are already being used to determine secondary structure in proteins and to set limits on a protein's conformation. As BMRB's pool of assigned NMR data associated with structures increases, it will become easier to determine structures of proteins from sparse data sets. These data will be important for determining structures of larger proteins for which it is difficult to obtain full data sets.
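One common way chemical shifts indicate secondary structure is through secondary shifts: the difference between an observed shift and a random-coil reference value for the same residue type. For alpha carbons, positive deviations tend to indicate helix and negative deviations tend to indicate strand. The sketch below is a simplified per-residue illustration of that idea, not BMRB's method. The 0.7 ppm cutoff is an assumed default, the function names are hypothetical, and the caller must supply random-coil reference values from a published table.

```python
def secondary_shifts(observed, random_coil):
    """Secondary shift per residue: observed minus random-coil reference (ppm).

    Assumption: both lists are in residue order and the reference values
    come from a random-coil table chosen by the caller.
    """
    if len(observed) != len(random_coil):
        raise ValueError("observed and random_coil must have equal length")
    return [obs - ref for obs, ref in zip(observed, random_coil)]


def classify_ca(delta, threshold=0.7):
    """Classify one alpha-carbon secondary shift.

    Returns "H" (helix-like) for deviations at or above +threshold,
    "E" (strand-like) at or below -threshold, and "C" (coil-like) otherwise.
    """
    if delta >= threshold:
        return "H"
    if delta <= -threshold:
        return "E"
    return "C"


def classify_sequence(observed, random_coil, threshold=0.7):
    """Per-residue classification string for a list of alpha-carbon shifts."""
    deltas = secondary_shifts(observed, random_coil)
    return "".join(classify_ca(d, threshold) for d in deltas)
```

A practical assignment would also require runs of consecutive residues with the same classification before calling a helix or strand, and would usually combine several nuclei rather than relying on alpha carbons alone.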
To make the pool of data most useful, it is important for the NMR community as a whole to deposit their data and for these data to be in a standard, usable format.
The mission of BMRB is to collect, archive, and disseminate the quantitative data derived from NMR spectroscopic investigations. The high-throughput mode of structural genomics investigations means that those projects are major contributors to BMRB. BMRB has developed standards for formatting and data definitions specified for structural genomics.