Data Deposition/Biocuration Services and Archive Management

In the first quarter of 2023, 3,896 experimentally determined structures were deposited to the PDB archive, for a total of 3,896 entries deposited in the year to date. Data are processed by wwPDB partners RCSB PDB, PDBe, PDBj, and PDBc.

Of the structures deposited so far in 2023, 86.1% were deposited with a release status of hold until publication, 7.1% were released as soon as annotation of the entry was complete, and 6.8% were held until a particular date. Of these entries, 62.8% were determined by X-ray crystallography, 35.0% by 3DEM, and 1.9% by NMR.

During the same quarter, 3,399 structures were released in the PDB, including 138 SARS-CoV-2 structures. In addition, 1,613 EMDB maps were released in the archive.


pdb_extract merges coordinate data, author-provided metadata, and data processing information from output files produced by structure determination programs into a complete PDBx/mmCIF file that can be used for easy deposition with OneDep. Use the pdb_extract online form or the easily installed command line interface, which has been re-engineered in Python.

Coordinate Data

Uploaded coordinate files (PDBx/mmCIF or PDB) will be checked against the PDBx/mmCIF dictionary. Legacy PDB formatted files will be converted to a OneDep-compliant PDBx/mmCIF data file.
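As a rough illustration of telling the two accepted coordinate formats apart before upload, the sketch below guesses the format from the first meaningful line of a file. The function name `guess_coordinate_format` and the short list of PDB record names are assumptions made for this example. It does not reproduce pdb_extract's own checks and does not validate anything against the PDBx/mmCIF dictionary.

```python
# Hypothetical helper: a heuristic only, not the check pdb_extract performs.
# A few record names that commonly open a legacy PDB-format file.
PDB_RECORDS = ("HEADER", "TITLE ", "CRYST1", "ATOM  ", "HETATM", "REMARK")


def guess_coordinate_format(text: str) -> str:
    """Return 'mmcif', 'pdb', or 'unknown' based on the first meaningful line."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and mmCIF-style comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        # PDBx/mmCIF files open with a data block header such as "data_1ABC".
        if stripped.startswith("data_"):
            return "mmcif"
        # Legacy PDB files start each line with a fixed-width record name.
        if line.startswith(PDB_RECORDS):
            return "pdb"
        return "unknown"
    return "unknown"
```

A file reported as `pdb` here would be one of the legacy files that gets converted to PDBx/mmCIF. A result of `unknown` only means this heuristic could not decide.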

Metadata

Depositors are encouraged to use the PDBj CIF editor to easily edit a template file to include corresponding metadata (sequence, crystallization condition, etc.). Method-specific templates have been pre-loaded into the PDBj CIF editor: X-ray, 3DEM, and NMR. Click on the top-left menu (light gray widget icon) to save the edited metadata file in PDBx/mmCIF. Upload this completed file in pdb_extract to prepare single or multiple related structures for submission.

Structure Determination Output Files

Upload the log file produced during data processing, and pdb_extract will parse the related diffraction metadata. Log files from various standalone packages and from CCP4 and autoPROC pipelines are supported, including:

  1. Aimless
  2. DIALS
  3. d*TREK
  4. HKL-2000
  5. HKL-3000
  6. Pointless
  7. Scala
  8. Scalepack
  9. XDS
  10. Xia2
  11. Xscale

RCSB PDB has recently introduced DNS names for programmatic access to PDB archive downloads:

  1. FTP: ftp://ftp.rcsb.org
  2. HTTPS: https://files.rcsb.org (replaces https://ftp.rcsb.org)
  3. RSYNC: rsync://rsync.rcsb.org (replaces rsync://rcsb.org)

The File Download Services documentation has detailed information.

Beginning in September 2023, RCSB PDB will enforce use of these updated DNS names. URLs in which the DNS name doesn’t match the protocol (e.g., https://ftp.rcsb.org, ftp://files.rcsb.org) will stop working at that time.
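Scripts that still contain mismatched URLs could be updated with a small helper along these lines. This is a minimal sketch: the function name `normalize_archive_url` is hypothetical, and the HTTPS host is assumed to be files.rcsb.org, consistent with the mismatch examples in the preceding paragraph. Only the host is rewritten; the path is left untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Protocol -> DNS name, following the list above.
# Assumption: the HTTPS name is files.rcsb.org.
HOST_FOR_SCHEME = {
    "ftp": "ftp.rcsb.org",
    "https": "files.rcsb.org",
    "rsync": "rsync.rcsb.org",
}
ARCHIVE_HOSTS = set(HOST_FOR_SCHEME.values())


def normalize_archive_url(url: str) -> str:
    """Swap in the DNS name that matches the URL's protocol.

    URLs on other hosts or with other protocols are returned unchanged.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if scheme not in HOST_FOR_SCHEME or host not in ARCHIVE_HOSTS:
        return url
    expected = HOST_FOR_SCHEME[scheme]
    if host == expected:
        return url
    # Replace only the host; keep path, query, and fragment as they were.
    return urlunsplit((parts.scheme, expected, parts.path, parts.query, parts.fragment))


print(normalize_archive_url("https://ftp.rcsb.org/pub/pdb/"))
```

The helper deliberately ignores hosts outside the three archive names, so ordinary website links are not rewritten by accident.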

Users who download PDB archive data programmatically are encouraged to switch to the new DNS names as soon as possible. HTTPS is preferred over FTP for individual file downloads.
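For a single entry, an HTTPS download might look like the following sketch, which uses only the Python standard library. The helper names `entry_url` and `download_entry` are hypothetical, and the `/download/<ID>.cif` path on files.rcsb.org is an assumption of this example; the File Download Services documentation is the authority on supported paths.

```python
import urllib.request
from pathlib import Path

# Assumption: HTTPS host for individual file downloads.
HTTPS_HOST = "files.rcsb.org"


def entry_url(pdb_id: str) -> str:
    """Build the HTTPS URL for one entry's PDBx/mmCIF file (assumed path layout)."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return f"https://{HTTPS_HOST}/download/{pdb_id}.cif"


def download_entry(pdb_id: str, dest_dir: str = ".") -> Path:
    """Fetch one entry over HTTPS and return the path of the saved file."""
    url = entry_url(pdb_id)
    target = Path(dest_dir) / url.rsplit("/", 1)[1]
    with urllib.request.urlopen(url, timeout=30) as response:
        target.write_bytes(response.read())
    return target
```

Calling `download_entry("4hhb")` would save `4HHB.cif` in the current directory, provided the network is reachable and the assumed path is valid.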

Please contact info@rcsb.org with any questions.