RCSB PDB Newsletter

Data Query, Reporting and Access

Website Statistics

Access statistics for the second quarter of 2012 are shown.

Month	Unique Visitors	Visits	Bandwidth
April	258130	617748	778.90 GB
May	244865	615509	794.33 GB
June	206864	525815	624.38 GB

Users can tour the PDB archive by drilling down on significant properties of structures such as Organism and Polymer Type with just a few clicks. This example shows the path to the EC distribution of structures from humans. Clicking on any link returns the structures that match all selected parameters. This feature is available to navigate through all search results and for the entire PDB archive.

Tour the PDB with Drill-down Pie Charts

Standard characteristics of PDB entries (resolution, release date, experimental method, polymer type, organism, and taxonomy) are used to create searchable data distribution summaries.

The Explore Archive widget on the home page provides a quick statistical overview of the PDB. Browse the charts individually, or view them all together by clicking on the "Show all" link. Clicking on a pie chart image will display a more detailed graphic that lists the percentages for the categories shown. Selecting one of the listed results will launch the corresponding structures in the Query Results Browser.

Data Distributions also appear at the top of the Query Results Browser, where they provide a quick statistical overview and can be used to refine the results into subsets of interest. For example, users can "drill down" through these faceted search options to quickly access high-resolution entries from a structure type search, human-related entries from a sequence search, or the most recent entries returned from a chemical component search for a particular ligand. Any combination of categories is possible.
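The drill-down idea can be sketched in a few lines of Python. This is only an illustration of faceted filtering, not the RCSB PDB implementation: the records, their IDs, the field names, and the functions `distribution` and `drill_down` are all hypothetical.

```python
from collections import Counter

# Hypothetical result set; the IDs and values are made up for illustration.
ENTRIES = [
    {"id": "EX01", "organism": "Homo sapiens", "polymer_type": "Protein"},
    {"id": "EX02", "organism": "Homo sapiens", "polymer_type": "Protein"},
    {"id": "EX03", "organism": "Homo sapiens", "polymer_type": "DNA"},
    {"id": "EX04", "organism": "Mus musculus", "polymer_type": "Protein"},
]


def distribution(entries, facet):
    """Count entries per category of one facet (the data behind a pie chart)."""
    return Counter(entry[facet] for entry in entries)


def drill_down(entries, **selected):
    """Keep only the entries that match every selected facet value."""
    return [
        entry
        for entry in entries
        if all(entry[facet] == value for facet, value in selected.items())
    ]


# Overview of the full result set, then refinement by two facets.
by_organism = distribution(ENTRIES, "organism")
human_proteins = drill_down(
    ENTRIES, organism="Homo sapiens", polymer_type="Protein"
)
```

Each additional selection narrows the previous subset, which is why any combination of categories can be applied in any order.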

Users who want to view only the individual entries can hide these charts from the query results.

Data distribution summaries can also be used to explore the latest weekly update of PDB entries.

Domain-based Structural Alignments

The sequence diagram for PDB ID 3BMV shows the corresponding UniProtKB sequence, the SEQRES and ATOM records, and the various annotations that are available.

In the example of the 3D Similarity tab for 3BMV, selecting "view" for the 7th-ranked domain PDP:3DHUAa launches the structure alignment view for the alpha amylase domains of 3BMV and 3DHU. Selecting the link for PDP:3DHUAa under the column Domain 2 returns the 3D Similarity tab for entry 3DHU.

To provide more accurate results, the latest version of the 3D Similarity tab uses domain-based protein structure alignments instead of chain-based alignments.

For an example, see the 3D Similarity tab for glucanotransferase (PDB ID 3BMV).

An image of the sequence highlights how the residues listed in the sequence (SEQRES) and in the atom records (ATOM) map onto the relevant parts of the UniProtKB sequence, along with annotations from DSSP, SCOP, PDP and Pfam.

Domains are also highlighted in a table that displays the most important calculated results and scores. Domains can be selected from the pull-down menu above the table, or by clicking on a domain in the sequence image.

The table can be sorted and filtered, and offers links to a 3D structure alignment in Jmol (from the results column) and to information about similar domains. In the example of the 3D Similarity tab for 3BMV, selecting "view" for the domain PDP:3DHUAa launches the structure alignment view for the alpha amylase domains of 3BMV and 3DHU. Selecting the link for PDP:3DHUAa under the column Domain 2 returns the 3D Similarity tab for entry 3DHU.

The calculation of the domain-split representative is an extension of our sequence clustering approach. To remove redundancy, we start with a 40% sequence identity clustering procedure, and select a representative chain from each sequence cluster. If the representative chain contains multiple domains, each is included. SCOP 1.75 domain assignments¹ are used when available. Otherwise, the assignments are computed using ProteinDomainParser.²
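The selection logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the RCSB PDB code: the `Chain` record, the precomputed `cluster_id` labels (standing in for the 40% sequence identity clustering), the rule of taking the first chain seen as the cluster representative, and the `computed_domains` placeholder (standing in for ProteinDomainParser) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Chain:
    chain_id: str                 # hypothetical identifier, e.g. "1AAA.A"
    cluster_id: int               # assumed label from 40% identity clustering
    scop_domains: list = field(default_factory=list)  # empty if none assigned


def computed_domains(chain):
    """Placeholder for a computed assignment (stand-in for ProteinDomainParser).

    Returns a single whole-chain domain with a made-up label.
    """
    return [f"PDP:{chain.chain_id}a"]


def domain_split_representatives(chains):
    """Return the domains of one representative chain per sequence cluster."""
    # Assumption: the first chain encountered represents its cluster.
    representatives = {}
    for chain in chains:
        representatives.setdefault(chain.cluster_id, chain)

    domains = []
    for rep in representatives.values():
        # Prefer SCOP assignments; otherwise fall back to computed domains.
        domains.extend(rep.scop_domains or computed_domains(rep))
    return domains


chains = [
    Chain("1AAA.A", cluster_id=1, scop_domains=["d1aaaa1", "d1aaaa2"]),
    Chain("1BBB.A", cluster_id=1),  # same cluster, so not a representative
    Chain("1CCC.A", cluster_id=2),  # no SCOP assignment available
]
result = domain_split_representatives(chains)
```

In this sketch a multi-domain representative contributes every one of its domains, so the final set can be larger than the number of sequence clusters.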

Bookmark and Post Webpages

Users can send and store RCSB PDB web pages using the "Share this Page" button. With this service, favorite PDB entries, Molecule of the Month articles, and other features can be emailed to colleagues or posted to Facebook, Twitter, and LinkedIn.

Customized Home Page: Sequence Search

The Search Sequence widget. Enter a PDB ID to select a chain, or enter a sequence.

The RCSB PDB homepage is composed of web widgets that can be moved around, minimized, or hidden so users can create a website that reflects their interests. Frequently used features can be moved to the top, while less popular items can be hidden or collapsed.

The Sequence Search widget can search for a given sequence or a particular chain of any PDB entry using BLAST, FASTA, or PSI-BLAST. Options include specifying the E cut-off value and filtering low complexity.

To add this widget to your home page, select the Customize This Page button from the left menu. Download Files, Structure Comparison, and ADIT Deposition widgets can also be added.

Create a Collage of Structures

Image collage of virus structures.

Different types of reports can be generated for a set of PDB entries, including the option to generate a collage of molecular images.

To create a collage, search for a group of structures, pull down the Generate Reports menu and select Custom Report>Image Collage. The query results will appear as a series of tiled molecular pictures.

Mousing over each image in the collage displays the structure title; clicking on the small image shows a larger version. The PDB ID listed links to the corresponding Structure Summary page.

Image collages can be customized by the size of the images displayed and how many images are shown per page.

Other Generate Reports options include creating customized tables or viewing pre-generated summary reports about structure, sequence, ligand, Structural Genomics Center, primary citation, and biological details.

1. A. G. Murzin, S. E. Brenner, T. Hubbard, C. Chothia (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247: 536-540.
2. N. Alexandrov, I. Shindyalov (2003) PDP: protein domain parser. Bioinformatics 19: 429-430.
