Data Exploration Services

In 2017, RCSB PDB's website at rcsb.org received a monthly average of 448,925 unique visitors and more than 1 million visits. More than 37 TB of data were accessed over the year. Traffic is tracked using AWStats. RCSB.org is visited by more than 1 million unique visitors per year.

Month            Unique Visitors   Visits      Bandwidth (GB)
January 2017     351,773           763,080     1867.92
February 2017    360,900           772,850     2454.42
March 2017       418,335           906,522     2906.16
April 2017       575,554           1,144,078   2597.41
May 2017         543,609           1,120,077   2289.27
June 2017        419,900           969,889     2058.14
July 2017        391,988           1,120,176   2441.16
August 2017      364,457           998,198     4621.23
September 2017   440,210           1,199,015   4957.21
October 2017     486,342           1,274,109   3707.07
November 2017    483,913           1,262,394   4050.32
December 2017    408,597           1,107,545   3528.30
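
The visit and bandwidth figures quoted above can be cross-checked against the monthly table. The following is a minimal sketch in Python; the variable names are illustrative, and it assumes decimal units (1 TB = 1000 GB), which the table itself does not state.

```python
# Monthly figures copied from the 2017 table: (month, visits, bandwidth in GB).
MONTHLY = [
    ("January", 763_080, 1867.92),
    ("February", 772_850, 2454.42),
    ("March", 906_522, 2906.16),
    ("April", 1_144_078, 2597.41),
    ("May", 1_120_077, 2289.27),
    ("June", 969_889, 2058.14),
    ("July", 1_120_176, 2441.16),
    ("August", 998_198, 4621.23),
    ("September", 1_199_015, 4957.21),
    ("October", 1_274_109, 3707.07),
    ("November", 1_262_394, 4050.32),
    ("December", 1_107_545, 3528.30),
]

# Average number of visits per month over the year.
total_visits = sum(visits for _, visits, _ in MONTHLY)
mean_visits = total_visits / len(MONTHLY)

# Total data transferred; assumption: 1 TB = 1000 GB.
total_gb = sum(gb for _, _, gb in MONTHLY)
total_tb = total_gb / 1000

print(f"Mean visits per month: {mean_visits:,.0f}")
print(f"Total bandwidth: {total_tb:.2f} TB")
```

Under that unit assumption, the mean number of visits comes out above 1 million per month and the yearly bandwidth above 37 TB, in line with the summary.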

Simple text searches at rcsb.org are now easier and more accurate. Text searching from the top query bar combines the open-source Apache Solr platform with an index of PDBx/mmCIF data.

Access this new functionality by entering one or more search terms in the top bar of any RCSB PDB page and clicking ‘Go’ or pressing Return. Searches for multiple words (for example, insulin receptor) and queries for adjacent words enclosed in double quotation marks (for example, “insulin receptor”) may return different results. The unquoted search finds results where the words appear anywhere in the entry, whereas the quoted search returns only results where the terms appear next to each other in the order given.
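
The difference between the two kinds of query can be illustrated with a small, self-contained Python sketch. This is not how rcsb.org or Apache Solr implements searching; the function names and the sample titles are hypothetical and serve only to show why the two searches can return different results.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_terms(query, text):
    """True if every query word appears somewhere in the text, in any order."""
    words = set(tokenize(text))
    return all(term in words for term in tokenize(query))


def matches_phrase(phrase, text):
    """True if the phrase words appear adjacent and in the given order."""
    terms = tokenize(phrase)
    words = tokenize(text)
    n = len(terms)
    if n == 0:
        return False
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))


# Hypothetical entry titles, used only for illustration.
titles = [
    "Insulin receptor kinase domain",
    "Receptor-bound insulin analog",
    "Hemoglobin",
]

unquoted_hits = [t for t in titles if matches_all_terms("insulin receptor", t)]
quoted_hits = [t for t in titles if matches_phrase("insulin receptor", t)]

print("Unquoted:", unquoted_hits)
print("Quoted:  ", quoted_hits)
```

In this toy example the unquoted search matches both insulin titles, while the quoted search matches only the title in which "insulin" is immediately followed by "receptor".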

Search results are assigned “Match Scores” to help indicate the relevance of the result and to sort structures from “Higher to Lower” matches and vice versa. The figure below shows a search for the name Perutz.

Search terms that appear in certain PDBx/mmCIF tokens are highlighted (e.g., structure or citation author, citation, entity name or description, keywords, structure title).

Additional information is available at RCSB.org.

RCSB PDB is looking for Scientific Software Developers and Postdoctoral Fellows to join the Development Team at UC San Diego.

We are looking for talented and highly motivated Scientific Software Developers and Postdoctoral Fellows to join our multidisciplinary development team.

The Challenge: Develop innovative analysis, integration, query, and visualization tools for 3D biomolecular structures to help accelerate research and training in biology, medicine, and related disciplines. In these projects, we employ the latest advances in computer science to develop highly interactive features and scalable services and workflows. This is a unique opportunity to engage in the leading-edge research, development, and outreach activities of the RCSB PDB, which have worldwide impact. For more, visit our Careers page or Contact Us with questions.

A snapshot of the PDB archive (ftp://ftp.wwpdb.org) as of January 1, 2018 has been added to ftp://snapshots.wwpdb.org/ and ftp://snapshots.pdbj.org/. Snapshots have been archived annually since January 2005 to provide readily identifiable data sets for research on the PDB archive.

The directory 20180101 includes the 136,472 experimentally determined coordinate files and related experimental data available at that time. Atomic coordinates and related metadata are available in PDBx/mmCIF, PDB, and XML file formats. The date and time stamp of each file indicates the last time the file was modified. The snapshot is 1034 GB.
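
For scripts that refer to annual snapshots, the directory name can be derived from the snapshot date. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each snapshot directory sits directly under the server root and that every snapshot follows the same YYYYMMDD naming as 20180101.

```python
from datetime import date

# Snapshot servers named in the announcement.
SNAPSHOT_HOSTS = (
    "ftp://snapshots.wwpdb.org/",
    "ftp://snapshots.pdbj.org/",
)


def snapshot_directory(snapshot_date):
    """Format a date as a YYYYMMDD directory name (assumed naming scheme)."""
    return snapshot_date.strftime("%Y%m%d")


def snapshot_urls(snapshot_date):
    """Build candidate snapshot URLs, assuming the directory is at the server root."""
    directory = snapshot_directory(snapshot_date)
    return [host + directory + "/" for host in SNAPSHOT_HOSTS]


for url in snapshot_urls(date(2018, 1, 1)):
    print(url)
```

The resulting URLs are candidates only; the actual layout should be checked on the server before use.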

  • The RCSB protein data bank: integrative view of protein, gene and 3D structural information (2017) Nucleic Acids Research 45: D271-D281 doi: 10.1093/nar/gkw1000
  • Impact of genetic variation on three dimensional structure and function of proteins (2017) PLOS ONE 12(3): e0171355 doi: 10.1371/journal.pone.0171355
  • Towards an efficient compression of 3D coordinates of macromolecular structures (2017) PLOS ONE 12(3): e0174846 doi: 10.1371/journal.pone.0174846
  • MMTF - an efficient file format for the transmission, visualization, and analysis of macromolecular structures (2017) bioRxiv doi: 10.1101/122689
  • BioJava-ModFinder: Identification of protein modifications in 3D structures from the Protein Data Bank (2017) Bioinformatics 33(13):2047-2049 doi: 10.1093/bioinformatics/btx101
  • MMTF - an efficient file format for the transmission, visualization, and analysis of macromolecular structures (2017) PLoS Comput Biol 13(6): e1005575 doi: 10.1371/journal.pcbi.1005575
  • NGLview - Interactive molecular graphics for Jupyter notebooks (2017) Bioinformatics btx789 doi: 10.1093/bioinformatics/btx789
  • MDsrv: viewing and sharing molecular dynamics simulations on the web (2017) Nat Methods 14:1123-1124 doi: 10.1038/nmeth.4497
  • Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework (2017) Genome Med 9:113 doi: 10.1186/s13073-017-0509-y