Accessing the PDB Archive
The PDB archive has been remediated by the wwPDB members: the
RCSB PDB, MSD-EBI, PDBj, and the BMRB. It can be accessed from
ftp://ftp.wwpdb.org.
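As a minimal sketch of scripted access, the Python example below splits an ftp:// address into host and path and lists the directory it points to over an anonymous login. Only the address itself comes from this page; the function names are hypothetical, and nothing is assumed about the directory layout or about the server's current availability.

```python
from ftplib import FTP
from urllib.parse import urlparse


def split_ftp_url(url):
    """Return (host, path) for an ftp:// URL; the path defaults to '/'."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path or "/"


def list_directory(url, timeout=30):
    """List the names in the directory an ftp:// URL points to.

    Logs in anonymously. Requires network access and a reachable server.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # no arguments means anonymous login
        ftp.cwd(path)
        return ftp.nlst()
```

Calling `list_directory("ftp://ftp.wwpdb.org")` would return whatever names the server reports at its top level. The same helper works for the snapshot address given further down this page.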
New files processed and released into the archive by the wwPDB
sites will reflect the new features incorporated as part of
this project, including standardized IUPAC [1] nomenclature for
all chemical components.
Users may have to download new software to properly view the
files with the new nomenclature (e.g., RasMol, Chimera) [2]. Links
to resources are available at
www.wwpdb.org.
A snapshot of the unremediated PDB archive (as of July 31,
2007) is available at
ftp://snapshots.rcsb.org.
Remediation of the Entire PDB Archive
Highlights of the types of information improved through
remediation include:
Sequence | Updated references to databases and taxonomies; resolved differences between chemical and macromolecular sequences
Citation | Verified and updated primary citation assignments
Assembly and Virus Information | Improved representation of deposited and experimental coordinate frames, symmetry, and frame transformations
Nucleic Acid Labeling | Deoxy and ribose nucleotides assigned separate chemical definitions
Beamline Data | Beamline and synchrotron facility names have been made consistent with BioSync
Chemical Components | Standardization of chemistry and nomenclature in monomers and ligands
Remediated data are available for each PDB entry in three
formats:
- mmCIF (mmcif.pdb.org). All remediation work was done using the
  PDB Exchange Dictionary (PDBx) that follows the mmCIF syntax.
- PDBML-XML (pdbml.pdb.org). Remediated data files are also
  available in PDBML-XML format, in a direct translation from
  the files in mmCIF format.
- PDB File Format (wwpdb.org). The remediated files have been
  released in PDB File Format version 3.0. This version of the
  file format incorporates standardized atom nomenclature, and
  distinguishes deoxyribonucleic acid from ribonucleic acid.
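To illustrate the DNA/RNA distinction in PDB File Format version 3.0, here is a small Python sketch that classifies a coordinate record by its residue name. It assumes the version 3.0 convention in which deoxyribonucleotides are named DA, DC, DG, and DT while ribonucleotides are named A, C, G, and U, and that the residue name occupies columns 18-20 of ATOM and HETATM records. The function names are hypothetical, and the name sets cover only these common bases.

```python
from collections import Counter

# Residue names for the common bases under the version 3.0 convention.
DNA_NAMES = {"DA", "DC", "DG", "DT"}
RNA_NAMES = {"A", "C", "G", "U"}


def classify_nucleotide(line):
    """Classify one PDB-format line as 'DNA', 'RNA', or 'other'.

    Returns None for lines that are not ATOM or HETATM records.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return None
    # Columns 18-20 (1-based) hold the residue name; short names are padded.
    name = line[17:20].strip()
    if name in DNA_NAMES:
        return "DNA"
    if name in RNA_NAMES:
        return "RNA"
    return "other"


def count_nucleotide_atoms(lines):
    """Count coordinate records per class across an iterable of lines."""
    labels = (classify_nucleotide(line) for line in lines)
    return Counter(label for label in labels if label is not None)
```

Run over the lines of a remediated file, `count_nucleotide_atoms` gives a quick check of whether an entry contains DNA, RNA, or both. Files written before remediation would need different name sets.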
The Chemical Component Dictionary
Figure: Image of 407d [3] created using the remediated data file
and the latest patch to OpenRasmol (2.7.3.1).
The Chemical Component Dictionary (formerly known as the "HET
dictionary") describes all residues in the PDB, standard and
non-standard, and all small molecules. It has been remediated
to address the inconsistencies in older dictionary entries that
resulted in valence problems, missing model coordinates,
redundant ligands, and more.
The features of this dictionary include:
- Standard nomenclature
- Corrected model coordinates, obsoleted redundant chemical
  components, and additional definitions for protonated forms
- Added stereochemical assignments, aromatic bond assignments,
  idealized coordinates, chemical descriptors (SMILES &
  InChI) [4], and systematic chemical names
The full Chemical Component Dictionary and the companion Amino
Acid Variants Dictionary can be downloaded from
remediation.wwpdb.org/downloads.html.
Users can also search for individual chemical components,
either by entering the component ID in the form provided, or by
browsing by ID. The variant dictionary can also be browsed.
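For offline use, a lookup by component ID can be approximated against a downloaded copy of the dictionary. The Python sketch below assumes the dictionary is a single mmCIF text file in which each component starts with a `data_` block header named after its component ID; the function names are hypothetical, and the splitting is deliberately naive (it does not parse individual mmCIF items or handle multi-line text fields).

```python
def index_components(cif_text):
    """Map each component ID to the text of its data block.

    Assumes every block begins with a line of the form 'data_<ID>'.
    """
    blocks = {}
    current_id = None
    current_lines = []
    for line in cif_text.splitlines():
        if line.startswith("data_"):
            # Close the previous block before starting a new one.
            if current_id is not None:
                blocks[current_id] = "\n".join(current_lines)
            current_id = line[len("data_"):].strip()
            current_lines = [line]
        elif current_id is not None:
            current_lines.append(line)
    if current_id is not None:
        blocks[current_id] = "\n".join(current_lines)
    return blocks


def lookup(blocks, component_id):
    """Return the block for a component ID (case-insensitive), or None."""
    return blocks.get(component_id.upper())
```

For a dictionary file read into a string, `lookup(index_components(text), "atp")` would return the raw definition block for that component if it is present, and None otherwise.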
For each chemical component in the dictionary, a summary page
provides a 2D chemical diagram and a 3D graphic (using Jmol) of
the ligand. This page also describes the ligand's physical and
chemical features. Status information is provided, along with
links to the component definition in CIF and PDBML/XML formats,
model coordinates, idealized coordinates, and chemical diagrams.
Accessing the Remediated Data from the RCSB PDB Website
The latest release of the RCSB PDB website utilizes the data
from the wwPDB Remediation Project.
This new site offers:
- Improved searching and reporting capabilities
- Updated sequence references
- Updated primary citation information and links
- Better representations for complex assemblies (such as
  viruses)
- Access to remediated and pre-remediation data
- Advanced access to ligand information
- Enhanced sequence details page for each structure
[1] Pure & Applied Chem., 70, 117-142, 1998
[2] See sourceforge.net/projects/openrasmol
and www.cgl.ucsf.edu/chimera
[3] 407d: C.L. Kielkopf, S. White, J.W. Szewczyk, J.M. Turner, E.E. Baird, P.B. Dervan, D.C. Rees (1998)
A structural basis for recognition of A-T and T-A base pairs in the minor groove of B-DNA. Science 282:111-115
[4] D. Weininger (1988) SMILES, a chemical language and information system.
1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31-36; and
© The International Union of Pure and Applied Chemistry (2005)
IUPAC International Chemical Identifier (InChI) (contact: secretariat@iupac.org)