Newsletter | Spring 2015 ⋅ Number 65

Data Deposition and Annotation

In the first quarter of 2015, 2887 experimentally determined structure coordinate entries were deposited to the archive.

85.9% were deposited with a release status of hold until publication; 8.7% were released as soon as annotation of the entry was complete; and 5.4% were held until a particular date.

92.9% of these entries were determined by X-ray crystallographic methods; 5.4% were determined by NMR methods.

During the same period, 2190 structures and 147 EMDB maps were released in the PDB.

wwPDB launched the Deposition Tool for structures determined using X-ray crystallography on January 27, 2014, as part of a new Deposition and Annotation System. Using this system, more than 5,000 structures have been deposited and annotated, and more than 2,400 structures have been released in the archive.

Features of the new system include use of the PDBx/mmCIF data format, which produces more uniform data; the ability to replace data files pre- and post-deposition; enhanced communication; improved annotation; and geometric and experimental data checking based on recommendations from expert task forces. Detailed information and video tutorials are available.

As a result of this successful release, ADIT has been retired at RCSB PDB and PDBj for new depositions of structures determined from X-ray crystallographic experiments.

Existing, in-progress ADIT sessions of X-ray crystallographic structures can be accessed until July 19, 2015.

ADIT will continue to accept depositions from other experimental methods. Deposition tools for NMR and 3DEM are being developed by the wwPDB.

Questions and comments should be sent to info@wwpdb.org.

The Chemical Component Dictionary: complete descriptions of constituent molecules in experimentally determined 3D macromolecules in the Protein Data Bank

John D. Westbrook, Chenghua Shao, Zukang Feng, Marina Zhuravleva, Sameer Velankar, Jasmine Young (2014) Bioinformatics doi:10.1093/bioinformatics/btu789

The Chemical Component Dictionary (CCD) is a chemical reference data resource that describes all residue and small molecule components found in PDB entries. The CCD contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, chemical descriptors, systematic chemical names and idealized coordinates.

This paper describes the content, preparation, validation and distribution of the CCD chemical reference dataset.

The CCD can be accessed from wwpdb.org/data/ccd, and can be searched and browsed using resources such as Ligand Expo.
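As a rough illustration of the molecular-level part of a definition, the sketch below reads a few PDBx/mmCIF key-value items into a dictionary. It is a minimal example under stated assumptions, not the wwPDB's own tooling: the fragment is abbreviated and typed in by hand for this sketch (check values against the dictionary itself), the function name parse_key_value_items is hypothetical, and only single-line, non-loop items are handled.

```python
import shlex

# Abbreviated, illustrative fragment in PDBx/mmCIF key-value style.
# Item names follow the _chem_comp category; the values were typed in by
# hand for this sketch and should be checked against the real dictionary.
SAMPLE = """\
data_HYP
_chem_comp.id            HYP
_chem_comp.name          4-HYDROXYPROLINE
_chem_comp.type          "L-PEPTIDE LINKING"
_chem_comp.formula       "C5 H9 N O3"
"""


def parse_key_value_items(text):
    """Collect simple `_category.item value` lines into a nested dict.

    Hypothetical helper: loop_ blocks, multi-line text fields, and values
    containing unbalanced quote characters are not handled here.
    """
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("_"):
            continue  # skip data_ headers, blanks, and anything else
        tokens = shlex.split(line)  # keeps quoted values as one token
        if len(tokens) != 2:
            continue  # not a plain single-line key-value item
        category, _, item = tokens[0][1:].partition(".")
        record.setdefault(category, {})[item] = tokens[1]
    return record


definition = parse_key_value_items(SAMPLE)
print(definition["chem_comp"]["formula"])  # prints: C5 H9 N O3
```

For real dictionary files, which also contain loop_ tables for atoms, bonds, and descriptors, a dedicated mmCIF parser is the safer choice.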

Example CCD definition for 4-hydroxyproline, from Figure 2 of Westbrook et al., reprinted with permission from Oxford University Press. (a) Molecular-level information; (b) an abbreviated example illustrating the atom-level features, including atom names, stereochemical and aromatic molecular identifiers, experimental model coordinates, and computed ideal coordinates; (c) an abbreviated example illustrating the bond-level features, including bonding atom names, bond type, and stereochemical and aromatic molecular identifiers; (d) SMILES and InChI descriptors, systematic chemical name identifiers, and the programs and versions used to compute each name and descriptor.