Data Deposition/Biocuration Services and Archive Management

In the second quarter of 2024, 4,722 experimentally determined structures were deposited to the PDB archive, bringing the total for the year to 9,317 entries. Data are processed by wwPDB partners RCSB PDB, PDBe, PDBj and PDBc.

Of the structures deposited in 2024 so far, 84.4% were deposited with a release status of hold until publication, 10.0% were released as soon as annotation of the entry was complete, and 5.5% were held until a particular date. Of these entries, 56.1% were determined by X-ray crystallographic methods, 1.7% by NMR methods, and 42.1% by 3DEM.

During the same quarter, 4,030 structures were released in the PDB, including 261 SARS-CoV-2 structures. In addition, 2,309 EMDB maps were released in the archive.

RCSB PDB Annotator Yuhe Liang

Congratulations to RCSB PDB's Yuhe Liang on processing over 10,000 PDB depositions. He is the sixth biocurator to reach this milestone in the wwPDB.

Dr. Liang received his PhD in biophysics from Peking University, China, with expertise in macromolecular crystallography. He joined the PDB after postdoctoral training at the University of Pittsburgh School of Medicine, where he carried out structural and functional studies of important proteins related to human health.

During his 10-year career at RCSB PDB, he has committed his extensive scientific expertise and profound data curation skills to providing excellent data curation services for the Protein Data Bank. His dedication and energy have contributed significantly to a high-quality data archive for the benefit and advancement of the scientific community. We congratulate Dr. Liang on this exciting accomplishment and look forward to his further career success.

PDBx/mmCIF ecosystem illustration from the new PDBx/mmCIF User Guide

Benefits of the PDBx/mmCIF ecosystem

wwPDB has launched a detailed PDBx/mmCIF File Format User Guide.

As the foundation for depositing, annotating, and archiving structural data across diverse experimental techniques, the Protein Data Bank Exchange macromolecular Crystallographic Information Framework (PDBx/mmCIF) is the master format of the Protein Data Bank. Our user-friendly guide offers detailed explanations and examples of essential PDBx/mmCIF records, with the aim of facilitating a smooth transition to this format for depositors and users alike.

The wwPDB anticipates that all four-character PDB IDs will be exhausted by 2028, after which 12-character PDB IDs will be issued. Entries with extended PDB IDs will not be compatible with the legacy PDB file format and will only be available in PDBx/mmCIF format. wwPDB encourages users to transition to the PDBx/mmCIF format as soon as possible.
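For software that stores or compares PDB IDs, the change mostly means accepting both forms and normalizing to one. The following is a minimal sketch, not an official wwPDB tool. It assumes that an extended ID consists of the prefix "pdb_" followed by eight alphanumeric characters, and that an existing four-character ID maps to the extended form by left-padding with zeros (for example, 1abc becomes pdb_00001abc). The function name and the lowercase normalization are choices made for this illustration only.

```python
import re

# Assumption: a legacy ID is a digit followed by three alphanumeric characters.
_LEGACY = re.compile(r"[0-9][A-Za-z0-9]{3}")
# Assumption: an extended ID is "pdb_" plus eight alphanumeric characters.
_EXTENDED = re.compile(r"pdb_[0-9a-z]{8}")


def to_extended_pdb_id(pdb_id: str) -> str:
    """Normalize a legacy or extended PDB ID to the 12-character form."""
    candidate = pdb_id.strip().lower()
    if _LEGACY.fullmatch(candidate):
        # Pad the four-character ID with zeros to eight characters.
        return "pdb_" + candidate.rjust(8, "0")
    if _EXTENDED.fullmatch(candidate):
        return candidate
    raise ValueError(f"not a recognizable PDB ID: {pdb_id!r}")


if __name__ == "__main__":
    print(to_extended_pdb_id("1ABC"))
    print(to_extended_pdb_id("pdb_00001abc"))
```

Storing the normalized form in one place keeps lookups consistent whether a user supplies the short or the extended identifier.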

Users are asked to participate in a brief survey (accessible from the PDBx/mmCIF File Format User Guide) by December 15, 2024, to share feedback on this guide and help shape future developments.

The standardization of protein modification handling ensures that there is a single correct approach to handling each protein modification that occurs within the PDB archive. However, there are many existing PDB entries that contain protein modifications which do not follow these handling conventions.

As part of the protein modifications remediation project, all model coordinate files containing protein modifications are being re-released to add a new protein modification data category. This new category will list all protein chemical modifications (PCMs) and post-translational modifications (PTMs) observed within the entry, as well as their type and category, making them easier to find.

A new category will also be added to the Chemical Component Definition (CCD) files. It will state whether the chemical component is a known PCM, its type and category, and the positions in the amino acid and in the polypeptide at which it is expected to be observed. If this PCM is also a known PTM, it will have the UniProt generic PTM accession ID.

Finally, any protein modifications that are inconsistently handled within a PDB entry will be amended, to ensure that a given modification is consistently handled in the PDB archive.

Detailed information about this work is available from GitHub, including the PDBx/mmCIF dictionary extension, a set of example files, and complete documentation of the additional annotation.
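As a rough illustration of how the new per-entry category might be consumed, the sketch below reads a simple looped category from PDBx/mmCIF text and tallies modifications by type. The category name pdbx_modification_feature, the item names, and the sample rows are assumptions for this example only (the rows are made-up data, not taken from any entry); the dictionary extension on GitHub is the authoritative source. The reader is deliberately simplistic: it handles only a loop with one row per line and quoted values without embedded apostrophes. A dedicated mmCIF parsing library is the better choice for real files.

```python
import shlex
from collections import Counter

# Illustrative fragment: category name, item names, and rows are all
# assumptions for this sketch (the rows are made-up example data).
SAMPLE = """\
loop_
_pdbx_modification_feature.ordinal
_pdbx_modification_feature.label_comp_id
_pdbx_modification_feature.type
_pdbx_modification_feature.category
1 SEP Phosphorylation 'Named protein modification'
2 CYS 'Disulfide bridge' 'Disulfide bridge'
3 SEP Phosphorylation 'Named protein modification'
"""


def read_loop(text, category):
    """Return the rows of one simple looped category as a list of dicts."""
    prefix = f"_{category}."
    names, rows = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(prefix):
            # Item name line: keep the part after "_category."
            names.append(line[len(prefix):])
        elif names:
            # A blank line, comment, new loop, or other item ends the loop.
            if not line or line.startswith(("#", "loop_", "_")):
                break
            values = shlex.split(line)  # respects 'quoted values'
            if len(values) != len(names):
                raise ValueError(f"unexpected row: {line!r}")
            rows.append(dict(zip(names, values)))
    return rows


if __name__ == "__main__":
    rows = read_loop(SAMPLE, "pdbx_modification_feature")
    for mod_type, count in Counter(r["type"] for r in rows).items():
        print(mod_type, count)
```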

Questions or feedback? Contact deposit-help@mail.wwpdb.org.

The protein chemical modifications (PCMs) and post-translational modifications (PTMs) remediation project is a wwPDB collaborative project carried out principally by PDBe at EMBL-EBI, and is funded by BBSRC grant number BB/V018779/1.

A paper co-authored by all organizers and participants in the EMDataResource Ligand Challenge has now been published in Nature Methods: Outcomes of the EMDataResource Cryo-EM Ligand Modeling Challenge.
The paper describes the results of the 2021 Challenge and recommends best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.

The 2021 EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein/nucleic-acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic resolution (1.9-2.5 Å). Three published maps were selected as targets: E. coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog, and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms was analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics, and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.

Outcomes of the EMDataResource cryo-EM Ligand Modeling Challenge
Lawson C, Kryshtafovych A, Pintilie G, Burley S, Cerny J, Chen V, Emsley P, Gobbi A, Joachimiak A, Noreng S, Prisant M, Read R, Richardson J, Rohou A, Schneider B, Sellers B, Chao C, Sourial E, Williams C, Williams C, Yang Y, Abbaraju V, Afonine PV, Baker M, Bond P, Blundell T, Burnley T, Campbell A, Cao R, Cheng J, Chojnowski G, Cowtan K, Dimaio F, Esmaeeli R, Giri N, Grubmüller H, Hoh SW, Hou J, Hryc C, Hunte C, Igaev M, Joseph A, Kao W, Kihara D, Kumar D, Lang L, Lin S, Subramaniya SRMV, Mittal S, Mondal A, Moriarty N, Muenks A, Murshudov G, Nicholls R, Olek M, Palmer C, Perez A, Pohjolainen E, Pothula K, Rowley C, Sarkar D, Schäfer L, Schlicksup C, Schroeder G, Shekhar M, Si D, Singharoy A, Sobolev O, Terashi G, Vaiana A, Vedithi S, Verburgt J, Wang X, Warshamanage R, Winn M, Weyand S, Yamashita K, Zhao M, Schmid M, Berman H, Chiu W.

(2024) Nature Methods doi: 10.1038/s41592-024-02321-7

Restraint Validation of Biomolecular Structures Determined by NMR in the Protein Data Bank
Kumaran Baskaran, Eliza Ploskon, Roberto Tejero, Masashi Yokochi, Deborah Harrus, Yuhe Liang, Ezra Peisach, Irina Persikova, Theresa A Ramelot, Monica Sekharan, James Tolchard, John D Westbrook, Benjamin Bardiaux, Charles Schwieters, Ardan Patwardhan, Sameer Velankar, Stephen K Burley, Genji Kurisu, Jeffrey C Hoch, Gaetano T Montelione, Geerten W Vuister, Jasmine Y Young
(2024) Structure 32, 1–14: doi:10.1016/j.str.2024.02.011

NextGen Archive: Centralising Access to Integrated Annotations and Enriched Structural Information by the Worldwide Protein Data Bank
Preeti Choudhary, Zukang Feng, John Berrisford, Henry Chao, Yasuyo Ikegawa, Ezra Peisach, Dennis W. Piehl, James Smith, Ahsan Tanweer, Mihaly Varadi, John D. Westbrook, Jasmine Y. Young, Ardan Patwardhan, Kyle L. Morris, Jeffrey C. Hoch, Genji Kurisu, Sameer Velankar, Stephen K. Burley
(2024) Database 2024: baae041 doi: 10.1093/database/baae041

wwPDB Foundation Chair Celia Schiffer

Celia A. Schiffer, Chair of the wwPDB Foundation and Professor & Chair of Biochemistry & Molecular Biotechnology at the UMass Chan Medical School, has been elected to the National Academy of Sciences (USA).

The Schiffer lab primarily studies the molecular basis for drug resistance in viruses. Through this research, she has developed a new paradigm for avoiding drug resistance in structure-based drug design that translates to other diseases. Her accomplishments in biomedical research have been widely honored. In 2021, she was named the chair of the Department of Biochemistry & Molecular Biotechnology at UMass Chan. She is a fellow of the American Academy of Microbiology and in 2019 was invested as the Gladys Smith Martin Chair in Oncology. In 2020, she was recognized with the William C. Rose Award from the American Society for Biochemistry and Molecular Biology. In 2016, she was named educator of the year by the Massachusetts Society for Medical Research for excellence in research, mentoring and leadership in bringing women and underrepresented minorities into science, and she received the inaugural Chancellor’s Award for Excellence in Mentoring from Chancellor Michael F. Collins.

The wwPDB Foundation was established in 2010 to raise funds in support of the outreach activities of the wwPDB. The Foundation has raised funds to help support PDB50 events, workshops, and educational publications. It is chartered as a 501(c)(3) entity exclusively for scientific, literary, charitable, and educational purposes.