PDB COMMUNITY FOCUS: JOHN L. MARKLEY, BMRB AND CENTER FOR EUKARYOTIC STRUCTURAL GENOMICS
John L. Markley received a Ph.D. in Biophysics from Harvard
University in 1969, where he worked with Oleg Jardetzky and Elkan
R. Blout. His graduate research included NMR studies of helix-coil
transitions in polyamino acids and the preparation and NMR
investigation of selectively deuterated proteins. The latter project
was made possible through funding and access to excellent research
facilities at the Merck Research Laboratories in Rahway, New Jersey,
where he spent 30 months. As a National Institutes of Health
Postdoctoral Fellow with Melvin P. Klein at the University of
California, Berkeley, he made the transition from continuous-wave to
pulse Fourier transform NMR spectroscopy and investigated NMR
relaxation mechanisms. He joined the faculty of the Chemistry
Department at Purdue University as an Assistant Professor in 1972, and
by 1981 had moved up the ranks to Professor. He relocated to the
Biochemistry Department at the University of Wisconsin-Madison in
1983, where he founded the National Magnetic Resonance Facility at
Madison (1985), the BioMagResBank (BMRB; 1990), and the Center for
Eukaryotic Structural Genomics (2000). He currently is Steenbock
Professor of Biomolecular Structure and chairs the Graduate Program in
Biophysics at the University of Wisconsin-Madison.
Q. What is the history behind the BMRB?
A. NMR is unique among biophysical approaches in its ability to
provide a broad range of atomic-level information relevant to the
structural, dynamic, and chemical properties of biological
macromolecules. Since my days as a graduate student, I have been
deeply impressed by the value of chemical information available from
NMR data assigned to specific sites in proteins. In 1984, Eldon Ulrich
and I wrote a comprehensive review of all assigned chemical shifts in
proteins, which required digging information from individual
publications in the literature. In 1985, during a mini-sabbatical with
the late Professor Yoshimasa Kyogoku at the Protein Research Institute
in Osaka, Japan, I formulated the idea for organizing a publicly
available data bank for assigned protein NMR parameters. In the
meantime, the first NMR structures of proteins began
appearing. Ulrich, Kyogoku, and I refined the idea for a data
repository for NMR information about protein structure and dynamics
and published a proposal in 1989. We secured funding for a pilot study
from the National Library of Medicine, which enabled us to begin
developing data models. These initially took the form of flat files
with a rigid format. Later, BMRB adopted a relational database format
and, following discussions with Helen Berman and other developers of
mmCIF (which is a variant of the STAR format devised by Sydney Hall
and Nick Spadaccini), eventually developed the current NMR-STAR
format. The data exchange format created at BMRB has gained rapid and
widespread acceptance in the community. NMR-STAR is extensible and
thus can accommodate the addition of new NMR parameters of interest to
the biomolecular NMR community. Its tag-value nature and the tabular
organization of the data model make it easy to interconvert NMR-STAR
with relational or XML formats. BMRB's holdings have grown to include
chemical shifts, J-couplings, relaxation rates, residual dipolar
couplings, and chemical information derived from NMR investigations
(such as hydrogen exchange rates, pKa values, and structural
restraints). Newer types of data being collected are raw (time-domain) data
for structure determinations (largely from structural genomics
centers) and solid-state NMR data. Our early idea was that BMRB
annotators would gather and enter data from the literature, but the
enormous growth in the biomolecular NMR field and the reluctance of
journals to publish this information soon made this impractical. As
with the PDB, BMRB relies on depositions from scientists. We are
grateful that the NLM has funded BMRB continuously since its
founding. BMRB now has mirror sites in Florence, Italy, and Osaka,
Japan. The Osaka site is beginning to take responsibility for data
depositions from that part of the world.
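To illustrate the point above about tag-value data and tabular organization, here is a minimal sketch of how a loop of tagged values could be mapped onto a relational table. It is an illustration under stated assumptions, not BMRB software: the fragment is heavily simplified, the tag names (_Shift.Residue_number and so on) are made up rather than taken from the NMR-STAR dictionary, and the helper names parse_loop and load_into_sqlite are hypothetical. Quoted values and multi-line text fields, which real STAR files allow, are not handled.

```python
import sqlite3

# Hypothetical, heavily simplified STAR-style fragment. The tag names are
# made up for illustration and are not taken from the NMR-STAR dictionary.
FRAGMENT = """
loop_
  _Shift.Residue_number
  _Shift.Atom_name
  _Shift.Value_ppm
  1 HA 4.23
  1 CA 55.1
  2 HA 4.51
stop_
"""

def parse_loop(text):
    """Return (tags, rows) from a single simple loop_ ... stop_ block."""
    tags, values = [], []
    in_loop = False
    for line in text.split("\n"):
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "loop_":
            in_loop = True
        elif tokens[0] == "stop_":
            break
        elif in_loop and tokens[0].startswith("_") and not values:
            # Tag names come first and define the columns.
            tags.append(tokens[0])
        elif in_loop:
            # Everything after the tags is a flat stream of values.
            values.extend(tokens)
    width = len(tags)
    rows = [tuple(values[i:i + width]) for i in range(0, len(values), width)]
    return tags, rows

def load_into_sqlite(tags, rows):
    """Create an in-memory table with one column per tag and insert the rows."""
    columns = [t.split(".", 1)[1] for t in tags]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shift (%s)" % ", ".join(columns))
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany("INSERT INTO shift VALUES (%s)" % placeholders, rows)
    return conn

tags, rows = parse_loop(FRAGMENT)
conn = load_into_sqlite(tags, rows)
```

Because each tag in the loop becomes a column and each group of values becomes a row, the same parsed structure could just as well be written out as XML elements. All values are kept as strings in this sketch.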
Q. What are the interactions between the BMRB and the RCSB PDB?
A. BMRB is a member of the RCSB and has close ties with the PDB. Our
interactions are warm and collegial and span common interests in
standards for data representation, software development, and data
interchange. Recent collaboration with PDB has centered on unifying
the nomenclature used for X-ray and NMR structures and the underlying
data. BMRB and PDB have been pursuing the common goal of providing the
tools for harvesting the full range of information in digital form
that constitutes the normal 'Methods' section of an article in a
journal such as the J. Biol. Chem. BMRB has adapted the PDB ADIT
deposition software for data entry as a way of making data entry more
uniform across the two data banks. Currently, data relevant to BMRB
associated with NMR structures deposited at PDB are transferred
automatically to BMRB for processing. BMRB and PDB are close to
releasing jointly developed software that will provide one-stop data
entry for NMR structures. This will simplify the task of depositors as
well as streamline the work of annotators at BMRB and PDB.
Q. Does your work in structural genomics influence your work with the
BMRB, and vice versa?
A. The goals of structural genomics are to enlarge knowledge of
sequence-structure-function interrelationships and to lower the costs
of solving structures. At the same time, its technology and products
are to be made available to the community. Structural genomics both
reinforces the kind of work that has gone on at the BMRB and the PDB,
and offers challenges to these data banks. In some ways, the data
dictionary work at BMRB and PDB anticipated many of the demands of
structural genomics. Both data banks were well prepared to handle
increasing numbers of structures and the increasing level of detail
about the experiments demanded by structural genomics. The challenges
have been to provide streamlined data deposition and pre-validation
tools. BMRB has worked closely with all structural genomics centers
that utilize NMR spectroscopy to determine their needs and to seek
suggestions for improving the operations of the data bank. These
interactions have led to new developments at BMRB, such as the use of
validation software developed at the Northeast Structural Genomics
Consortium, the extension of NMR-STAR format for chemical shifts to
include probabilistic assignments as developed at the National
Magnetic Resonance Facility at Madison, and a repository for
collections of time-domain data sets used in structure determinations.
Also, in response to the structural genomics community, BMRB has been
increasing its links to other sites on the Web.
Q. The amount of experimental data to be archived is growing
exponentially. What do you see as the future for managing this data?
A. We are confident of our ability at BMRB, given the current level of
approved funding, to keep up with the growth in the field. Our
strategy at BMRB is to develop procedures and software that enable us
to manage data more efficiently. The new data deposition system
mentioned above, which jointly handles coordinates as well as
underlying data and information about the biological system and
experiments conducted, allows for more automated data harvesting and
validation. I anticipate that BMRB, like PDB, will become increasingly
internationalized. The new data deposition site at Osaka represents a
positive step in this direction. At first, data collected at Osaka
will be transferred to Madison for annotation, but the plan is to
develop this capability in Osaka.
Q. The size and complexity of structures being solved by X-ray and EM
methods continue to increase over time. Is there a limit to the size
of structures that can be investigated with NMR methods?
A. As recent publications from the laboratories of Kurt Wüthrich
and others have shown, it is difficult to place an absolute molecular
weight limit on NMR structures. With conventional methods of uniform
13C + 15N labeling, the practical limit for high-throughput NMR
structure determinations is about 20 kDa. However, by expending
additional effort and by utilizing 2H labeling, the bar can be raised
to 30 kDa and above. Recent results from Masatsune Kainosho's
laboratory with stereo array isotope labeling suggest that this
approach could increase the practical limit to 40 kDa. Selective
labeling approaches also are increasing the sizes of RNA structures
that can be solved by NMR. It is important to recognize that NMR can
provide useful information about structure-function relationships even
in the absence of a three-dimensional structure. BMRB captures this
kind of information, which can be obtained from systems of 150 kDa and
larger.