Data Exploration Services

RCSB.org is visited by millions of users each year. Traffic is tracked using internally developed tools and filtered to remove robotic access.


Month           Unique Visitors  Visits     Bandwidth
January 2022    607,078          3,052,321  22.8 TB
February 2022   582,930          2,931,195  23.43 TB
March 2022      666,545          3,191,671  28.98 TB
April 2022      624,354          2,987,213  18.59 TB
May 2022        675,448          3,264,972  28.62 TB
June 2022       621,028          3,007,941  29.69 TB
July 2022       560,727          2,883,781  47.5 TB
August 2022     558,639          2,714,138  33.63 TB
September 2022  670,635          2,990,858  53.48 TB
October 2022    699,200          3,412,918  81.29 TB
November 2022   716,250          3,313,630  77.27 TB
December 2022   650,518          3,051,326  60.24 TB

The new 1D-3D Group Alignment Viewer supports exploration of multiple sequence alignments (MSA) at sequence and structure levels for PDB experimental structures and Computed Structure Models (CSMs). Select proteins and/or residue regions from the MSA to view their 3D structures aligned in Mol*. Options to display (or hide) other polymeric chains and ligands are available.

RCSB.org clusters protein entities (PDB experimental structures and CSMs) by sequence identity threshold and UniProt accession. For each cluster, the MSA is calculated using Clustal Omega and displayed in the 1D-3D Group Alignment Viewer using specific color schemes. PDB protein sequence positions are shown in blue if the residue was experimentally determined, and in gray if it was not. CSMs are colored according to their local pLDDT scores.
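The coloring rules can be illustrated with a short sketch. The blue/gray rule for PDB positions comes from the description above. The pLDDT bands are an assumption: they follow the confidence ranges commonly used for AlphaFold models, and the viewer's actual thresholds and colors are not stated here. The function names are hypothetical.

```python
def pdb_position_color(is_observed: bool) -> str:
    """Color for a PDB sequence position: blue if the residue was
    experimentally determined, gray if it was not."""
    return "blue" if is_observed else "gray"


def plddt_band(score: float) -> str:
    """Map a local pLDDT score (0-100) to a confidence band.

    ASSUMPTION: these thresholds are the AlphaFold confidence ranges,
    not thresholds documented for the 1D-3D Group Alignment Viewer.
    """
    if not 0 <= score <= 100:
        raise ValueError("pLDDT scores range from 0 to 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


if __name__ == "__main__":
    # One observed and one unobserved PDB position, then three CSM residues.
    print([pdb_position_color(flag) for flag in (True, False)])
    print([plddt_band(s) for s in (96.2, 74.0, 41.5)])
```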

To access this feature, select the 1D-3D Alignments option from a UniProt Group Summary or Sequence page (for an example, see the Group Summary for O95786). UniProt Group pages are available from Structure Summary pages (in the Entity Groups table) and from Search Results. Documentation is available.

1D-3D Group Alignment View for O95786.


RCSB.org uses a novel method to quickly detect similarity between protein shapes of any size.

Use the Advanced Search > Structure Similarity option to find proteins with similar 3D shapes using either a PDB ID or a URL to an external structure data file (mmCIF, BinaryCIF, or PDB file format).

Enter a PDB ID or the URL of a publicly available data file (e.g., from AlphaFold, ModelArchive, ESMFold, or a shared drive) to launch a Structure Similarity Search. Launch an example search that uses a Web Link.

The search assesses global 3D-shape similarity using BioZernike descriptors, which capture the global volumetric shape of the protein. It returns structures whose volumes are globally similar to the query structure.

Users can search for:

  1. polymeric chains similar to a given chain
  2. assemblies similar to a given assembly

Two modes of matching are available:

  1. Strict: returns only relevant matches, but may miss more distant ones
  2. Relaxed: returns all similar matches, but may include some false positives

A Structure Match Score is provided in the search results when the option to return data as “Assemblies” is selected.

Advanced Search can be used to combine Structure Similarity searches with other types of queries. Visit the Advanced Search documentation to learn more.
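For programmatic use, the choices described above (chain or assembly query, Strict or Relaxed matching, results returned as assemblies) can be sketched as a small query builder. This is a minimal sketch, not the official client: the JSON field names and operator values (`service`, `strict_shape_match`, `relaxed_shape_match`, `entry_id`, `assembly_id`, `asym_id`, `return_type`) are assumptions modeled on the public RCSB Search API and should be checked against the Advanced Search documentation. The function name `build_structure_query` is hypothetical.

```python
import json

# ASSUMPTION: operator names for the two matching modes, as understood
# from the RCSB Search API; verify against the official documentation.
OPERATORS = {"strict": "strict_shape_match", "relaxed": "relaxed_shape_match"}


def build_structure_query(entry_id, mode="strict", assembly_id=None,
                          asym_id=None, return_type="assembly"):
    """Build a Structure Similarity query payload for a PDB ID.

    Exactly one of assembly_id (assembly query) or asym_id (chain query)
    must be given. Nothing is sent over the network.
    """
    if mode not in OPERATORS:
        raise ValueError(f"mode must be one of {sorted(OPERATORS)}")
    if (assembly_id is None) == (asym_id is None):
        raise ValueError("give exactly one of assembly_id or asym_id")

    value = {"entry_id": entry_id}
    if assembly_id is not None:
        value["assembly_id"] = str(assembly_id)
    else:
        value["asym_id"] = asym_id

    return {
        "query": {
            "type": "terminal",
            "service": "structure",
            "parameters": {"value": value, "operator": OPERATORS[mode]},
        },
        # Returning assemblies is the option that yields a Structure Match Score.
        "return_type": return_type,
    }


if __name__ == "__main__":
    # Relaxed search for assemblies similar to assembly 1 of entry 4HHB.
    payload = build_structure_query("4HHB", mode="relaxed", assembly_id=1)
    print(json.dumps(payload, indent=2))
```

The resulting dictionary would be serialized as JSON and submitted to the search service. The endpoint is not given here, so the sketch stops at building the payload.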

Details about the Structure Similarity search and BioZernike descriptors have been published:
Real time structural search of the Protein Data Bank
Guzenko D, Burley SK, Duarte JM
(2020) PLoS Comput Biol 16(7): e1007970. doi:10.1371/journal.pcbi.1007970

RCSB PDB Team Photo

RCSB PDB team pictured at the January 2020 Cloud Technologies Best Practices hosted by the Institute for Quantitative Biomedicine at Rutgers, The State University of New Jersey.

Join RCSB PDB to design, develop, and deploy modern web and data applications and complex user interfaces. Help accelerate research and training in biology, medicine, and related disciplines. Positions are available at Rutgers, SDSC/UCSD, and UCSF.

For more, visit our Careers page or Contact Us with questions.