Published quarterly by the Research Collaboratory
for Structural Bioinformatics Protein Data Bank
Summer 2013
Number 58

NEWSLETTER

Data Deposition and Annotation

wwPDB News

SAS Task Force Report Published

The wwPDB Small-Angle Scattering Task Force has published its report: Report of the wwPDB Small-Angle Scattering Task Force: Data Requirements for Biomolecular Modeling and the PDB (2013) Jill Trewhella, Wayne A. Hendrickson, Gerard J. Kleywegt, Andrej Sali, Mamoru Sato, Torsten Schwede, Dmitri I. Svergun, John A. Tainer, John Westbrook and Helen M. Berman, Structure 21: 875-881 [ doi: 10.1016/j.str.2013.04.020 ]

The first meeting of the Small-Angle Scattering (SAS) Task Force was sponsored by the wwPDB and held in July 2012 at the Center for Integrative Proteomics Research at Rutgers, The State University of New Jersey. The Task Force, chaired by Jill Trewhella, includes experts in SAS, crystallography, data archiving, and molecular modeling.

Recognizing the rapidly growing community of structural biology researchers who acquire SAS data and interpret them in terms of increasingly sophisticated molecular models, the SAS Task Force made several recommendations. These include: development of a global repository for X-ray and neutron SAS data; creation of a standard dictionary of terms for data collection and for managing the SAS data repository; options for including SAS-derived shape and atomistic models along with specific information regarding the modeling protocol, uniqueness and uncertainty; and development of criteria for assessing data quality and accuracy. The Task Force also recommends that leaders from the various structural biology disciplines jointly define what to archive in the PDB and what complementary archives might be needed, taking into account both scientific needs and funding.

This report by the wwPDB SAS Task Force follows recommendations recently published by wwPDB Validation Task Forces on X-ray and 3DEM. The report from the NMR Validation Task Force will be published shortly.

Landmark HIV Capsid Structures Follow New PDB Deposition and Release Procedures for Large Structures

Mature HIV-1 capsid structure by cryo-electron microscopy and all-atom molecular dynamics (2013) Gongpu Zhao, Juan R. Perilla, Ernest L. Yufenyuy, Xin Meng, Bo Chen, Jiying Ning, Jinwoo Ahn, Angela M. Gronenborn, Klaus Schulten, Christopher Aiken, Peijun Zhang, Nature 497: 643-646 [ doi: 10.1038/nature12162 ]

EMDB entry EMD-5639 is the cryo-electron tomography reconstruction from which 3J3Q and 3J3Y were generated; related entry 3J34, derived from an 8.6 Ångström reconstruction of a capsid hexameric subunit in a helical assembly (EMD-5582), was used in the construction of both 3J3Q and 3J3Y.

Two complete HIV-1 capsid structures, both of unprecedented size, have been curated and made available in the archive following recently announced procedures for the deposition and release of large structures. This represents a significant advance in the field of structural biology and a milestone for the PDB.

PDB entries 3J3Q and 3J3Y are models based on cryo-electron microscopy data and use of a molecular dynamics flexible-fitting method. They contain 1356 and 1176 protein chains, respectively, and over two million atoms each. The HIV-1 capsid is the protein envelope that encloses and protects the RNA genome of the virus. An important subject of study, the full capsid has been a difficult target for structural characterization due to its extremely large size and morphological variability.

In anticipation of greater numbers of PDB depositions involving ever larger and more complex structures, often determined using multiple methods, the wwPDB has been developing a new system for deposition and annotation that will go into full production early in 2014. This system will support depositions of structures of any size, determined using diffraction, EM and/or NMR methods. Large structures that exceed the limitations of the PDB file format will be processed and released intact so that "split entries" become a thing of the past. Since the coordinate records in such large structures cannot be validly represented in the PDB file format, only an abbreviated PDB-formatted file, containing authorship and citation details, will be provided in the FTP archive.
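To illustrate why such structures exceed the PDB file format, the short Python sketch below checks a structure against the two limits that follow from the format's fixed-width columns: a five-character atom serial number (at most 99,999 atoms) and a single-character chain ID (at most 62 chains when drawn from A-Z, a-z and 0-9). The function name fits_pdb_format is hypothetical, and the atom count used for 3J3Q is only the lower bound of "over two million" given in this newsletter.

```python
# Limits implied by the fixed-width columns of the legacy PDB file format.
MAX_ATOMS = 99_999   # atom serial number field is 5 characters wide
MAX_CHAINS = 62      # one-character chain ID: A-Z, a-z, 0-9


def fits_pdb_format(n_atoms: int, n_chains: int) -> bool:
    """Return True if a structure of this size fits in one PDB-format file.

    Hypothetical helper for illustration; it checks size limits only.
    """
    return n_atoms <= MAX_ATOMS and n_chains <= MAX_CHAINS


# 3J3Q: 1356 chains, "over two million atoms" (lower bound used here).
print(fits_pdb_format(n_atoms=2_000_000, n_chains=1356))
# A small, made-up structure for comparison.
print(fits_pdb_format(n_atoms=5_000, n_chains=4))
```

Either limit alone is enough to rule out a single PDB-format file for the capsid entries.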

The wwPDB has also convened a Working Group for PDBx/mmCIF Data Deposition that includes representatives from the major X-ray structure-determination packages and is chaired by Paul Adams (Lawrence Berkeley Laboratory; Phenix). To ease the transition from PDB to PDBx/mmCIF, the Working Group recommended essential extensions required for large structures, and these extensions have been incorporated. In addition, PDBx/mmCIF files suitable for deposition can now be created with recent versions of the CCP4 (REFMAC 5.8) and Phenix (1.8.2) software packages. Both packages support the recommended extensions for large structures.

Prior to the release of the new deposition system in 2014, the wwPDB will accept and process large structures, then distribute the large files intact on the PDB FTP site in the directory /pub/pdb/data/large_structures. To avoid breaking existing software, wwPDB curators will also "split" such entries into a collection of PDB-format files that will be distributed on the PDB FTP site following current release and formatting conventions. In 2014, all legacy split entries will be "reunited" and released intact (in PDBx/mmCIF format only) by the wwPDB.
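As a rough illustration of what "splitting" involves, the Python sketch below groups the chains of a large structure into files that each stay within the PDB format's limits of 99,999 atoms and 62 chains. This is a simple greedy grouping written for this example, not the procedure used by wwPDB curators; the function split_chains and the figure of 1,500 atoms per chain are hypothetical.

```python
from typing import List

MAX_ATOMS = 99_999   # 5-character atom serial number field
MAX_CHAINS = 62      # one-character chain ID: A-Z, a-z, 0-9


def split_chains(chain_sizes: List[int]) -> List[List[int]]:
    """Greedily group chain indices so each group fits one PDB-format file.

    chain_sizes[i] is the atom count of chain i. Chains are kept whole
    and in their original order. Hypothetical helper for illustration.
    """
    files: List[List[int]] = []
    current: List[int] = []
    atoms = 0
    for index, size in enumerate(chain_sizes):
        if size > MAX_ATOMS:
            raise ValueError(f"chain {index} alone exceeds {MAX_ATOMS} atoms")
        # Start a new file when adding this chain would break either limit.
        if current and (len(current) == MAX_CHAINS or atoms + size > MAX_ATOMS):
            files.append(current)
            current, atoms = [], 0
        current.append(index)
        atoms += size
    if current:
        files.append(current)
    return files


# 1356 chains as in 3J3Q; 1,500 atoms per chain is an assumed round number.
groups = split_chains([1_500] * 1356)
print(len(groups), "PDB-format files would be needed")
```

With chains of this assumed size, the chain limit rather than the atom limit determines how many files are required.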

The complete announcement on the deposition and release of large structures is at wwpdb.org; users with questions about the new deposition system or the procedures for handling large structures should contact info@wwpdb.org.

10,000 NMR Entries

With the June 18, 2013 update, the number of entries released in the PDB archive determined using NMR spectroscopy passed the 10,000 mark.

Updated Version of MAXIT Available

MAXIT Version 8 includes a new option to assign ligands the same chain IDs as the adjacent polymers and incorporates several bug fixes. This command-line program can also be used to:

  • Read and write PDB and mmCIF format files, and translate between file formats
  • Perform consistency checks on coordinates, sequence, and crystal data
  • Automatically construct, transform, and merge information between formats
  • Align residue numbering in the coordinates with the sequence
  • Reorder and rename atoms in standard and nonstandard residues and ligands according to the wwPDB Chemical Component Dictionary

Visit sw-tools.rcsb.org to download MAXIT and other programs for processing and curating PDB data.


Deposition Statistics

In the second quarter of 2013, 2855 sets of experimentally determined structure coordinates and 138 3DEM maps were deposited to the archive. A total of 5360 coordinate entries have been deposited in the archive in 2013.

For the entries deposited this quarter, 84.1% were deposited with a release status of hold until publication; 13.2% were released as soon as annotation of the entry was complete; and 2.7% were held until a particular date. Of these entries, 91.9% were determined by X-ray crystallographic methods and 5.3% by NMR methods.
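The percentages above can be converted into approximate entry counts with a few lines of Python. The sketch assumes that the percentages apply to the 2855 coordinate depositions reported for the quarter; the rounded counts are estimates derived here, not figures published by the wwPDB, and the dictionary labels and the helper approx_counts are illustrative names.

```python
DEPOSITED = 2855  # coordinate depositions in the second quarter of 2013

# Percentages as reported in the newsletter; labels are shorthand.
release_status = {
    "hold until publication": 84.1,
    "release when annotation is complete": 13.2,
    "hold until a particular date": 2.7,
}
method = {"X-ray": 91.9, "NMR": 5.3}


def approx_counts(percentages: dict, total: int) -> dict:
    """Convert percentages of `total` into rounded, approximate counts."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}


status_counts = approx_counts(release_status, DEPOSITED)
method_counts = approx_counts(method, DEPOSITED)
# Share of entries not covered by the two methods named in the text.
other_methods_pct = round(100 - sum(method.values()), 1)

for label, count in {**status_counts, **method_counts}.items():
    print(f"{label}: about {count} entries")
print(f"other methods: {other_methods_pct}%")
```

The three release-status percentages sum to 100%, while the two named methods leave a small remainder for entries determined by other techniques.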

During the same period, 2586 structures were released in the PDB.