Data Deposition/Biocuration Services and Archive Management

In the third quarter of 2020, 3900 experimentally-determined structures were deposited to the archive for a total of 11896 entries deposited in the year. Data are processed by wwPDB partners RCSB PDB, PDBe, and PDBj.

Of the structures deposited in 2020 so far, 82.6% were deposited with a release status of hold until publication; 12.2% were released as soon as annotation of the entry was complete; and 5.2% were held until a particular date. By method, 79.4% of these entries were determined by X-ray crystallography; 2.7% by NMR; and 17.7% by 3DEM.

During the same period, 3804 structures and 1470 EMDB maps were released in the PDB.

All validation reports for released PDB entries have been updated with recalculated percentiles. In addition to the recently introduced carbohydrate section and 2D Symbol Nomenclature for Glycans (SNFG) images for oligosaccharides from the carbohydrate remediation project, these reports now incorporate visualization of ligand validation and of model fit to electron density maps for X-ray ligands. These include 2-dimensional diagrams of ligands that highlight geometric validation criteria and, for structures determined by crystallography, 3-dimensional views of electron density.

In addition, the reports include EM map analysis and the fit of the EM model to its map volume. FSC curves are also included to compare reported and estimated resolution where either half maps or FSC data were uploaded.

These updated wwPDB validation reports provide an assessment of structure quality using widely accepted standards and criteria, recommended by community experts serving in Validation Task Forces.

Further information and sample validation reports are available at wwpdb.org.

A new data representation for carbohydrates in PDB entries and reference data improves the Findability and Interoperability of these molecules in macromolecular structures. The PDB archive now reflects:

  • Standardized Chemical Component Dictionary nomenclature following IUPAC-IUBMB recommendations
  • Uniform representation for oligosaccharides
  • Adoption of linear descriptors commonly used by the glycoscience community, generated using community tools
  • Annotated glycosylation sites in PDB structures

The wwPDB has created a new ‘branched’ entity representation for polysaccharides that describes each individual monosaccharide component in the PDB entry. As part of this process, we have standardized atom nomenclature of >1,000 monosaccharides in the Chemical Component Dictionary (CCD) and applied a branched entity representation to oligosaccharides (>8,000 PDB entries). To guarantee unambiguous chemical description of oligosaccharides in the affected PDB entries, an explicit description of the covalent linkages between their monosaccharide units is included. In addition, wwPDB validation reports provide consistent representation for these oligosaccharides and include 2D representations based on the Symbol Nomenclature for Glycans (SNFG).
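The idea of a branched entity, a list of monosaccharide units plus explicit covalent linkages between them, can be sketched as a small data structure. The following is a minimal illustration only: the class names, field names, and the is_single_tree helper are hypothetical and do not reproduce the actual PDBx/mmCIF categories, and the example glycan (a common N-glycan core fragment written with the component IDs NAG, BMA, and MAN) was chosen for illustration rather than taken from a specific PDB entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    num: int       # position of the monosaccharide within the branched entity
    comp_id: str   # Chemical Component Dictionary ID, e.g. "NAG"


@dataclass(frozen=True)
class Linkage:
    child_num: int    # unit contributing the anomeric carbon
    child_atom: str   # e.g. "C1"
    parent_num: int   # unit contributing the linking oxygen
    parent_atom: str  # e.g. "O4"


def is_single_tree(units, linkages):
    """Return True if the linkages join all units into one tree rooted at unit 1."""
    nums = {u.num for u in units}
    if 1 not in nums:
        return False
    parent_of = {}
    for link in linkages:
        if link.child_num not in nums or link.parent_num not in nums:
            return False  # linkage refers to a unit that is not listed
        if link.child_num in parent_of:
            return False  # a unit may hang off only one parent
        parent_of[link.child_num] = link.parent_num
    if set(parent_of) != nums - {1}:
        return False  # every unit except the root needs a parent linkage
    for n in nums:
        seen = set()
        while n != 1:  # walk towards the root, rejecting cycles
            if n in seen:
                return False
            seen.add(n)
            n = parent_of[n]
    return True


# Illustrative five-unit oligosaccharide with one branch point at unit 3.
units = [Unit(1, "NAG"), Unit(2, "NAG"), Unit(3, "BMA"), Unit(4, "MAN"), Unit(5, "MAN")]
linkages = [
    Linkage(2, "C1", 1, "O4"),
    Linkage(3, "C1", 2, "O4"),
    Linkage(4, "C1", 3, "O3"),
    Linkage(5, "C1", 3, "O6"),
]
print(is_single_tree(units, linkages))
```

Storing the linkages explicitly, rather than inferring them from coordinates, is what makes the chemical description unambiguous: removing any one linkage from the list above leaves a unit disconnected, and the check fails.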

To support the remediation of carbohydrate representation, software tools providing linear descriptors were developed in collaboration with the glycoscience community to enable easy translation of PDB data to other representations commonly used by glycobiologists. These include Condensed IUPAC from GMML at University of Georgia, WURCS from PDB2Glycan at The Noguchi Institute, Japan, and LINUCS from pdb-care in Germany.

wwPDB has also used this opportunity to improve the organization of chemical synonyms in the CCD by introducing a new _pdbx_chem_comp_synonyms data category. This will enable more comprehensive capture of alternative names for small molecules in the PDB. To minimize disruption to users, the legacy data item, _chem_comp.pdbx_synonyms, will be retained for a transition period through 2021.
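During the transition period, software that reads synonyms may want to prefer the new category and fall back to the legacy item. The sketch below shows one way to do that, under several assumptions: the component is already parsed into a plain Python dictionary (the layout here is hypothetical, not the output of any particular mmCIF library), rows of the new category carry a "name" field, and the legacy value is a single semicolon-separated string. Those assumptions should be checked against real CCD files; the component ID and synonym strings are placeholders.

```python
def collect_synonyms(component):
    """Return synonyms, preferring the new category over the legacy item."""
    # Assumed layout: a list of row dictionaries for the new category.
    rows = component.get("pdbx_chem_comp_synonyms", [])
    names = [row["name"].strip() for row in rows if row.get("name", "").strip()]

    if not names:
        # Fall back to the legacy _chem_comp.pdbx_synonyms value.
        legacy = component.get("chem_comp", {}).get("pdbx_synonyms") or ""
        if legacy in ("?", "."):  # mmCIF markers for unknown / not applicable
            legacy = ""
        # Assumption: legacy synonyms are separated by semicolons.
        names = [part.strip() for part in legacy.split(";") if part.strip()]

    # Drop case-insensitive duplicates while keeping the original order.
    seen = set()
    unique = []
    for name in names:
        key = name.lower()
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


# Placeholder data for illustration only.
new_style = {
    "pdbx_chem_comp_synonyms": [
        {"comp_id": "XXX", "name": "name one"},
        {"comp_id": "XXX", "name": "Name One"},
        {"comp_id": "XXX", "name": "name two"},
    ]
}
legacy_only = {"chem_comp": {"pdbx_synonyms": "alpha name; beta name"}}

print(collect_synonyms(new_style))
print(collect_synonyms(legacy_only))
```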

The carbohydrate remediation project is a wwPDB collaborative project carried out principally by RCSB PDB at Rutgers, The State University of New Jersey. It is funded by the NIH Common Fund Glycoscience Program through National Cancer Institute cooperative agreement U01 CA221216 to Dr. Robert Woods at the Complex Carbohydrate Research Center at the University of Georgia, in collaboration with Dr. Jasmine Young as sub-awardee at RCSB PDB at Rutgers.

Detailed information about this project, including a list of remediated PDB entries, is at wwPDB.org.