Data Deposition/Biocuration Services and Archive Management

In 2024, 19,322 experimentally determined structures were deposited to the archive. Data are processed by wwPDB partners RCSB PDB, PDBe, and PDBj.

Of all structures deposited this year, 84.1% were deposited with a release status of hold until publication; 9.6% were released as soon as annotation of the entry was complete; and 6.3% were held until a particular date. Of these entries, 56.3% were determined by X-ray crystallographic methods, 1.6% by NMR methods, and 42% by 3DEM.

9,334 EMDB map entries were released.

15,319 new PDB structures were released in 2024. They account for 6.7% of the year-end total holdings of 229,382 available entries.

RCSB PDB Annotator Irina Persikova

PDB Chemical Component Dictionary (CCD) files with ideal geometries are now provided in SDF/MOL format at the PDB archive. Users can download individual CCD files in SDF/MOL format from the PDB archive at https://files.wwpdb.org/pub/pdb/refdata/chem_comp/.

In addition, a concatenated CCD file in SDF/MOL format is provided at https://files.wwpdb.org/pub/pdb/data/monomers/components-pub.sdf.gz and accessible from the wwPDB website at http://www.wwpdb.org/data/ccd.
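
As a rough illustration of working with the concatenated file, the sketch below splits SDF text into individual records using the standard `$$$$` record delimiter. The helper names (`split_sdf_records`, `record_title`, `read_gzipped_sdf`) are hypothetical, the sample data is synthetic, and the assumption that the first line of each record identifies the component should be checked against the actual files.

```python
import gzip


def split_sdf_records(text):
    """Split concatenated SDF text into one string per record.

    In the SDF format each record ends with a line containing only '$$$$'.
    Text after the last delimiter (an unterminated record) is ignored.
    """
    records = []
    current = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if current:
                records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def record_title(record):
    """Return the first line of a record (the molfile title line).

    Assumption: the title line carries the component identifier.
    """
    return record.split("\n", 1)[0].strip()


def read_gzipped_sdf(path):
    """Read a locally downloaded .sdf.gz file and split it into records.

    The whole file is read into memory, which is simple but not frugal.
    """
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return split_sdf_records(handle.read())


# Synthetic two-record sample; not real CCD content.
SAMPLE = (
    "AAA\n  sketch\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
    "BBB\n  sketch\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
)

if __name__ == "__main__":
    for rec in split_sdf_records(SAMPLE):
        print(record_title(rec))
```

After downloading components-pub.sdf.gz, calling `read_gzipped_sdf` on the local path would return the list of records in the same way.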

 


Preprint publications will trigger wwPDB data archive release

The wwPDB releases entries under the following circumstances:

  • upon author’s request
  • upon publication
  • at the end of the hold period defined during submission (up to 1 year) if no publication is available by that time

Contributions to public preprint archives that reference PDB, EMDB, or BMRB entry IDs are considered publications by the wwPDB and will therefore trigger release. For example, a PDB structure on hold for publication (status HPUB/HOLD) will be scheduled for release if the wwPDB finds a bioRxiv preprint with matching authors, title, and an entry ID code.
Publication dates and citation details come from two sources: direct communications from authors, journals, and members of the scientific community (sent via OneDep or deposit-help@mail.wwpdb.org), and PubMed searches (automated comparison against the title and author list supplied with the deposition, plus manual review for PDB, EMDB, and BMRB IDs).
Any concerns regarding this policy should be directed to the wwPDB leadership via deposit-help@mail.wwpdb.org.
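
The release rules above can be sketched as a small decision function. This is only an illustration of the stated policy, not wwPDB's actual processing code: the function name, its parameters, and the reason strings are hypothetical, and `publication_found` is assumed to be true for journal articles and matching preprints alike.

```python
from datetime import date


def should_release(author_requested, publication_found, hold_expiry, today):
    """Return (release, reason) for an entry on hold.

    author_requested: the depositor asked for release.
    publication_found: a publication or matching preprint was identified.
    hold_expiry: date on which the hold period ends, or None if not set.
    today: the date on which the decision is evaluated.
    """
    if author_requested:
        return True, "author request"
    if publication_found:
        return True, "publication"
    if hold_expiry is not None and today >= hold_expiry:
        return True, "hold period ended"
    return False, "on hold"


if __name__ == "__main__":
    # A preprint was found before the hold period ended.
    print(should_release(False, True, date(2025, 6, 1), date(2025, 1, 15)))
```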

The PDB archive now includes annotation of protein chemical modifications (PCMs) and post-translational modifications (PTMs) in a standardized way. As previously announced, the PDBx/mmCIF dictionary has been extended to enhance PCM and PTM annotation.
In Chemical Component Dictionary (CCD) files:

  1. A new item in the chem_comp category: chem_comp.pdbx_pcm, stating whether the chemical component is a known PCM/PTM.
  2. A new category called pdbx_chem_comp_pcm, stating the PCM/PTM type and category, as well as the positions in the amino acid and in the polypeptide where it is expected to be observed. If the PCM is also a known PTM, the UniProt PTM accession ID is included.

In atomic coordinate files:

  1. A new item in the pdbx_entry_details category: pdbx_entry_details.has_protein_modification, stating if the entry contains a PCM/PTM.
  2. A new category called pdbx_modification_feature, providing an instance-level annotation of all observed PCMs/PTMs within the entry, as well as their type and category.
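
As a minimal sketch of how the entry-level flag might be read, the code below scans mmCIF text for the `_pdbx_entry_details.has_protein_modification` item. It assumes the item appears as a simple tag/value pair on one line and that the value is `Y` or `N`; both are assumptions, the helper names are hypothetical, and the sample text is synthetic. Real files are better handled with a full mmCIF parser, which would also cover `loop_` blocks such as pdbx_modification_feature.

```python
FLAG_TAG = "_pdbx_entry_details.has_protein_modification"


def find_item_value(mmcif_text, tag):
    """Return the value of a simple (non-loop) mmCIF item, or None.

    Only handles 'tag value' pairs written on a single line.
    """
    for line in mmcif_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == tag:
            return parts[1].strip().strip("'\"")
    return None


def has_protein_modification(mmcif_text):
    """Return True/False from the entry-level flag, or None if absent.

    Assumption: the flag is recorded as 'Y' or 'N'.
    """
    value = find_item_value(mmcif_text, FLAG_TAG)
    if value is None:
        return None
    return value.upper() == "Y"


# Synthetic fragment; not taken from a real entry.
SAMPLE = """data_DEMO
_pdbx_entry_details.entry_id                   DEMO
_pdbx_entry_details.has_protein_modification   Y
"""

if __name__ == "__main__":
    print(has_protein_modification(SAMPLE))
```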

In addition, any protein modifications inconsistently handled within PDB entries are amended to ensure consistency in the PDB archive. This includes a major clean-up of incorrect link records (struct_conn). All entries containing protein modifications are being re-released gradually throughout Spring 2025.
This standardization ensures that there is a single approach to handling each protein modification that occurs within the PDB archive, allowing better findability.
The protein chemical modifications (PCMs) and post-translational modifications (PTMs) remediation project is a wwPDB collaborative project carried out principally by PDBe at EMBL-EBI, and is funded by BBSRC grant number BB/V018779/1.

Integrative structures available in the PDB archive

Integrative structures are available at wwPDB.org and the PDB archive

Structures determined by integrative and hybrid structure determination methods (IHM) are now available at wwPDB DOI landing pages for both released and on-hold entries, along with >225,000 experimental structures in the PDB archive. These pages present basic information about the corresponding IHM structure, offer download of model coordinates and validation files from the PDB archive (https://files.wwpdb.org/pub/pdb_ihm/), and provide a link to the PDB-Dev resource that currently serves more detailed information about IHM structures, including the newly available links to PDB DOIs.

For an example, visit the DOI landing page for a recently released IHM entry in the PDB archive via PDB DOI: https://doi.org/10.2210/pdb9a8n/pdb.
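
The DOI in this example follows the pattern `10.2210/pdb<id>/pdb` with the entry ID in lowercase. A small sketch of building such a link is shown below; the function name is hypothetical, and the validation assumes a classic four-character ID starting with a digit, so other ID formats are rejected here.

```python
import re


def pdb_doi_url(pdb_id):
    """Build the DOI landing-page URL for a four-character PDB ID.

    Assumption: the ID is one digit followed by three alphanumeric
    characters; anything else raises ValueError.
    """
    normalized = pdb_id.strip().lower()
    if not re.fullmatch(r"[0-9][a-z0-9]{3}", normalized):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://doi.org/10.2210/pdb{normalized}/pdb"


if __name__ == "__main__":
    print(pdb_doi_url("9A8N"))
```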

PDB DOIs issued for each IHM or PDB entry are linked from the online versions of papers where PDB IDs are mentioned. Users can distinguish IHM structures from PDB experimental structures on the DOI landing page, where IHM structures display “integrative” as the structure determination method.

 

Integrative structures available in the PDB archive

IHMCIF is a data information framework that supports archiving and disseminating macromolecular structures determined by integrative or hybrid modeling (IHM), and making them Findable, Accessible, Interoperable, and Reusable (FAIR). IHMCIF is an extension of the Protein Data Bank Exchange/macromolecular Crystallographic Information Framework (PDBx/mmCIF), which serves as the framework for the Protein Data Bank (PDB) to archive experimentally determined atomic structures of biological macromolecules and their complexes with one another and with small-molecule ligands (e.g., enzyme cofactors and drugs).

Ultimately, IHMCIF will facilitate the unification of PDB-Dev data and tools with the PDB archive so that integrative structures can be archived and disseminated through PDB.

IHMCIF: An Extension of the PDBx/mmCIF Data Standard for Integrative Structure Determination Methods
Brinda Vallat, Benjamin M. Webb, John D. Westbrook, Thomas D. Goddard, Christian A. Hanke, Andrea Graziadei, Ezra Peisach, Arthur Zalevsky, Jared Sagendorf, Hongsuda Tangmunarunkit, Serban Voinea, Monica Sekharan, Jian Yu, Alexander A.M.J.J. Bonvin, Frank DiMaio, Gerhard Hummer, Jens Meiler, Emad Tajkhorshid, Thomas E. Ferrin, Catherine L. Lawson, Alexander Leitner, Juri Rappsilber, Claus A.M. Seidel, Cy M. Jeffries, Stephen K. Burley, Jeffrey C. Hoch, Genji Kurisu, Kyle Morris, Ardan Patwardhan, Sameer Velankar, Torsten Schwede, Jill Trewhella, Carl Kesselman, Helen M. Berman, Andrej Sali
(2024) Journal of Molecular Biology 436: 168546 doi: 10.1016/j.jmb.2024.168546