Newsletter | Fall 2014 ⋅ Number 63

Data Deposition and Annotation

In the third quarter of 2014, 2645 experimentally-determined structure coordinate entries were deposited to the archive. During the same period, 2406 structures and 150 EMDB maps were released in the PDB.

A total of 7791 entries have been deposited in 2014.

Of the structures deposited during this year, 85.9% were deposited with a release status of hold until publication; 11.1% were released as soon as annotation of the entry was complete; and 3.0% were held until a particular date. Of these entries, 92.1% were determined by X-ray crystallographic methods and 5.5% by NMR methods.

Depositing large structures?

If your structure exceeds the limits of the traditional PDB file format (>62 chains or >99,999 atoms), please prepare your deposition as a single PDBx/mmCIF file using recent versions of CCP4 (REFMAC 5.8) and Phenix (1.8.2) before submitting it through the wwPDB Deposition system.
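As a rough illustration of the size criterion above, the following minimal sketch checks whether a structure would need to be prepared as PDBx/mmCIF. The function name, its arguments, and the constants are hypothetical and only encode the limits quoted in this newsletter; it is not part of any wwPDB, CCP4, or Phenix tool.

```python
# Limits of the traditional PDB file format as quoted in this newsletter.
MAX_CHAINS = 62
MAX_ATOMS = 99_999


def exceeds_pdb_format_limits(chain_ids, atom_count):
    """Return True if the structure is too large for the traditional PDB format.

    chain_ids  -- iterable of chain identifiers (duplicates are ignored)
    atom_count -- total number of atoms in the structure
    Both arguments are assumed to be supplied by the caller; this sketch
    does not parse coordinate files.
    """
    too_many_chains = len(set(chain_ids)) > MAX_CHAINS
    too_many_atoms = atom_count > MAX_ATOMS
    return too_many_chains or too_many_atoms


if __name__ == "__main__":
    # A small two-chain structure fits within the limits.
    print(exceeds_pdb_format_limits(["A", "B"], 1500))
```

A structure for which this check returns True would be deposited as a single PDBx/mmCIF file, as described above.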

The wwPDB recently combined entries for large structures (such as ribosomes) that had been split across multiple PDB files (SPLIT entries) into single files. These combined structures have been issued new PDB IDs and are represented in the archive in both PDBx/mmCIF and PDBML formats. As announced previously, these files were made available in a separate FTP directory for public testing.

These combined entries will be fully integrated into the main PDB FTP archive on December 10, 2014 as part of the regular weekly update. The corresponding "SPLIT" entries will be removed (obsoleted). Users searching for ID codes of "SPLIT" entries at wwPDB member websites will be automatically redirected to the combined entry.

As part of the same update, a separate directory in the PDB FTP archive will contain a tar file that includes a collection of "best-effort," limited PDB-format files for large structures, along with a mapping file. The limited files contain authorship, citation details, and coordinate data; the mapping file relates the chains in the large entry to the chains in the limited PDB-format files.

After December 10, 2014, large structures (>62 chains and/or >99,999 ATOM lines) will only be distributed in the main PDB FTP directory in PDBx/mmCIF and PDBML formats. Structures that do not exceed the limitations of the PDB format will continue to be provided as PDB files in the archive for the foreseeable future.

For more information, visit wwpdb.org.

RCSB PDB's John Westbrook presented "New wwPDB Deposition and Annotation System" at the International Union of Crystallography's Congress and General Assembly in Montreal, Canada, in August.

The wwPDB's new system is open for X-ray crystallographic structure depositions at deposit.wwpdb.org/deposition. Since the beginning of 2014, more than 2400 structures have been submitted and more than 800 have been released!

Features of the new system include:

  • New data format: Extensible, dictionary-driven PDBx/mmCIF produces more uniform data across the archive
  • Deposition efficiency: Ability to replace coordinate and/or experimental data files pre- and post-submission
  • Enhanced communication: "In context" correspondence with wwPDB annotators and ability to preview and download processed files post-submission
  • Improved annotation: Improved ligand chemistry and polymer sequence checks with visual inspection provided during deposition
  • Validation: Geometry and experimental data checking during deposition and annotation is based on recommendations from expert task forces (X-ray, EM, NMR)