Published quarterly by the Research Collaboratory
for Structural Bioinformatics Protein Data Bank
Spring 2013
Number 57

NEWSLETTER

Data Query, Reporting and Access

Website Statistics

Access statistics for the second quarter of 2013 are shown.

Month    Unique Visitors    Visits     Bandwidth
April    369,516            811,545    1701.94 GB
May      337,920            765,033    1669.93 GB
June     291,409            682,857    1387.67 GB


Beta Test RCSB PDB Mobile for Android

RCSB PDB Mobile for Android is in beta testing. The iOS version of RCSB PDB Mobile has been downloaded over 10,000 times. RCSB PDB users with Android phones or tablets running version 2.3 or higher of the Android OS can download beta version #2 of RCSB PDB Mobile from www.rcsb.org/pdb/static.do?p=mobile/RCSBapp.html.

This free app provides fast, on-the-go access to the RCSB PDB. Options are available to search the entire PDB database, view the latest protein and nucleic acid structures released, access MyPDB, view the entire catalog of Molecule of the Month articles, and more. Structures can be interactively explored using the molecular viewer NDKMol (courtesy of Dr. Takanori Nakane, Kyoto University).

The Android edition of RCSB PDB Mobile is still under development. This beta release will "time out" in 2 months, by which time we anticipate the next generation app will be available from the Google Play Store.

Please send any feedback to quinn@sdsc.edu.


Personalize the RCSB PDB with MyPDB

Saved Query Manager Features. A) After saving a search to MyPDB, rename it by clicking on the words 'User Query' and entering a more descriptive label. B) The manager can be used to change email notification settings, run queries, and delete saved searches.

Access the MyPDB widget in the left-hand menu to create an account, and then log in to access stored settings and functionalities:

  • Saved Query Manager. Use MyPDB to store any type of structure search, such as keyword, sequence, ligand, and any composite query built with Advanced Search. These queries can be run at any time with the click of a button.

    Stored searches can also be set to run with each update. Email alerts (weekly or monthly) will be sent when new PDB entries matching the search are released.

  • Personal Annotations. Users can save personal annotations and notes on the Structure Summary tab of any PDB entry, and can add structures to a "favorites list". The Personal Annotations summary page provides easy access to all of these tagged structures and annotations.

  • User Account. Personal information (name, email address, account password, country, user type) can be updated at any time. All MyPDB account information is kept private and secure.


Browse the Anatomical Therapeutic Chemical Classification System to Find Structures

Use the Anatomical Therapeutic Chemical Classification Browser to find PDB structures.

RCSB PDB's Browse Database feature explores the PDB archive using different hierarchical trees. The Anatomical Therapeutic Chemical (ATC) classification system (www.whocc.no/atc/structure_and_principles), maintained by the WHO Collaborating Centre for Drug Statistics Methodology (www.whocc.no), organizes drugs into five levels according to the organ or system on which they act and/or their therapeutic and chemical characteristics. The RCSB PDB database can be browsed using the ATC system.

Select the ATC tab from the Browse Database interface to navigate through the drug classification hierarchy, view the number of associated PDB structures, and access the related entries.

The browser opens up the top level in the hierarchy. Clicking on the arrow/folder icons expands the respective nodes. Clicking on the name of the node will retrieve all PDB IDs associated with that ATC code. Structures having a particular ATC code (e.g., A12CC05) or name (e.g., Atorvastatin) can be found using the browser search box.
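Because a full ATC code stores its five levels in fixed character positions, the path from the top of the hierarchy down to a given drug can be derived from the code alone. The following is a minimal sketch of that idea; the function name `atc_levels` is illustrative and is not part of the RCSB PDB browser.

```python
import re

# Full ATC code layout: letter, two digits, letter, letter, two digits (e.g., A12CC05).
ATC_PATTERN = re.compile(r"[A-Z]\d{2}[A-Z]{2}\d{2}")


def atc_levels(code: str) -> list:
    """Return the five hierarchy prefixes (level 1 to level 5) of a full ATC code."""
    code = code.strip().upper()
    if not ATC_PATTERN.fullmatch(code):
        raise ValueError(f"not a full five-level ATC code: {code!r}")
    # Prefix lengths 1, 3, 4, 5, and 7 correspond to levels 1 through 5.
    return [code[:n] for n in (1, 3, 4, 5, 7)]


print(atc_levels("A12CC05"))  # ['A', 'A12', 'A12C', 'A12CC', 'A12CC05']
```

Each prefix corresponds to one node that would be expanded in the browser tree on the way to the full code.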

Other Browse options include GO Terms, Enzyme Classification, Transporter Classification, Source Organism, Genome Location, MeSH terms, SCOP, and CATH.


Map PDB Structures to Full-length Protein Sequences

Protein Feature View from the Structure Summary page for 2xv3.

The Protein Feature View visually summarizes how a full-length protein sequence from UniProtKB [1] corresponds to PDB entries. It also loads annotations from external databases (such as Pfam) [2] and homology models from the Protein Model Portal [3]. Annotations visualizing predicted regions of protein disorder (computed with JRONN) [4] and hydrophobic regions (computed using a sliding window approach) are also displayed.
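A sliding window approach to hydrophobicity can be sketched as follows. The newsletter does not say which hydropathy scale, window size, or cutoff the Protein Feature View uses, so the Kyte-Doolittle scale, the default window of 9 residues, and the threshold of 1.6 are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the source does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(sequence, window=9):
    """Mean hydropathy of every full-length window, in sequence order."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def hydrophobic_windows(sequence, window=9, threshold=1.6):
    """0-based start indices of windows whose mean hydropathy exceeds the threshold."""
    means = window_means(sequence, window)
    return [i for i, mean in enumerate(means) if mean > threshold]


# Illustrative sequence, not taken from any PDB entry.
print(hydrophobic_windows("KKKKLLIVVLLAAKKKK", window=5))
```

Larger windows smooth the profile and highlight longer hydrophobic stretches; smaller windows respond to short local patches.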

For individual entries, the Protein Feature View is available from the Molecular Description Widget on Structure Summary pages. The example shown for PDB ID 2vx35 illustrates how the ranges of a protein that have been observed in an experiment (in blue) correspond to the full length UniProtKB sequence (in grey). The secondary structure information from the PDB entry is also shown (helices in red, beta strands in yellow). Various features that are known for the UniProtKB sequence are displayed in green as they correlate with regions in the PDB entry. Protein modifications and active site residues (from UniProtKB and the PDB entry) are also annotated. Moving the mouse cursor over the "lollipops" displays the residue label. Mousing over the images shown in the "Secstruc" row reveals secondary structure information from the entry.

This view can be expanded to map all PDB entries related to a single UniProtKB sequence by selecting the Protein Feature View link shown in this widget. By default, a few representative PDB entries are used to give an overview of which regions of the UniProtKB sequence are covered by PDB entries. Selecting the plus sign or the "Show All" button will expand the view to show all related PDB chains, which can then be sorted by resolution, length, and release date. These Protein Feature View images can be exported as Scalable Vector Graphics (SVG) files.

The PDB to UniProtKB mapping is based on the data provided by the Structure Integration with Function, Taxonomy and Sequences (SIFTS; pdbe.org/sifts) initiative [5].
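Once observed residue ranges have been expressed in UniProtKB numbering, the fraction of the full-length sequence covered by structures can be estimated by merging the ranges. The sketch below assumes the ranges are already available as 1-based inclusive (start, end) pairs; the example ranges and the sequence length are hypothetical and do not come from SIFTS or any PDB entry.

```python
def merge_ranges(ranges):
    """Merge overlapping or adjacent 1-based inclusive (start, end) ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage(ranges, sequence_length):
    """Fraction of the full-length sequence covered by at least one range."""
    covered = sum(end - start + 1 for start, end in merge_ranges(ranges))
    return covered / sequence_length


# Hypothetical UniProtKB residue ranges observed in three PDB chains.
observed = [(10, 120), (100, 200), (250, 300)]
print(merge_ranges(observed))   # [(10, 200), (250, 300)]
print(coverage(observed, 400))  # 0.605
```

Merging first avoids counting residues twice when several PDB chains cover the same region of the sequence.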


Search the PDB Using Drill-down Pie Charts

Standard characteristics of PDB entries (organism, taxonomy, experimental method, X-ray resolution, release date, polymer type, EC, SCOP classification, protein symmetry, and protein stoichiometry) are used to create searchable data distribution summaries.

The Explore Archive widget on the home page provides a quick statistical overview of the PDB. Browse the charts individually, or view them all together by clicking on the "Show all" link. Clicking on a pie chart image will display a more detailed graphic that lists the percentages for the categories shown. Selecting one of the listed results will launch the corresponding structures in the Query Results Browser.

Data distribution drill-downs can also be used to refine search results and to explore the latest weekly update of PDB entries.

Any combination of categories is possible. Users can drill down to quickly access high-resolution entries from a structure type search, human-related entries from a sequence search, or the most recent PDB entries containing a particular ligand.
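Conceptually, a drill-down combines two steps: counting how a result set is distributed over a category, then narrowing the set to one slice and repeating. The sketch below illustrates this with made-up records; the field names, entry labels, and the 2.0 Å cutoff are hypothetical and do not reflect how the RCSB PDB website implements its charts.

```python
from collections import Counter

# Hypothetical result set; labels and values are illustrative only.
ENTRIES = [
    {"id": "ENTRY1", "method": "X-ray", "organism": "Homo sapiens", "resolution": 1.4},
    {"id": "ENTRY2", "method": "X-ray", "organism": "Mus musculus", "resolution": 2.8},
    {"id": "ENTRY3", "method": "NMR", "organism": "Homo sapiens", "resolution": None},
    {"id": "ENTRY4", "method": "X-ray", "organism": "Homo sapiens", "resolution": 3.1},
]


def distribution(records, field):
    """Count records per category, i.e., the data behind one pie chart."""
    return Counter(record[field] for record in records)


def drill_down(records, field, value):
    """Keep only the records that fall in the selected slice."""
    return [record for record in records if record[field] == value]


print(distribution(ENTRIES, "method"))

# Combine categories: human entries, then X-ray entries, then high resolution.
human = drill_down(ENTRIES, "organism", "Homo sapiens")
human_xray = drill_down(human, "method", "X-ray")
high_res = [
    r["id"] for r in human_xray
    if r["resolution"] is not None and r["resolution"] <= 2.0
]
print(high_res)  # ['ENTRY1']
```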

These charts can be hidden from the query results for users who want to only view the individual entries.


Web Services for Accessing PDB Data

Web Services can help software developers build tools that efficiently interact with PDB data. Instead of storing coordinate files and related data locally, Web Services let software tools access the RCSB PDB remotely. Detailed documentation for accessing these Web Services is available at www.rcsb.org/pdb/software/rest.do.

RESTful services exchange XML files in response to URL requests. RESTful search services return a list of IDs for Advanced Search and SMILES-based queries. RESTful fetch services return data when given IDs, including PDB entity descriptions, ligand information, third-party annotations for protein chains, and PDB to UniProtKB mappings.
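The request and response pattern described above (a URL request in, an XML document out) might be handled along the following lines. This is a sketch only: the base URL, the service name `describeEntity`, the `structureId` parameter, and the XML layout in `SAMPLE_RESPONSE` are placeholders invented for illustration, so the actual service names and response formats should be taken from the documentation at www.rcsb.org/pdb/software/rest.do.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder root, not the real service address.
BASE_URL = "https://example.org/rest"


def build_fetch_url(service, **params):
    """Compose a request URL from a service name and query parameters."""
    return f"{BASE_URL}/{service}?{urlencode(params)}"


def fetch_xml(url, timeout=30):
    """Download a URL and parse the body as XML (not called in this sketch)."""
    with urlopen(url, timeout=timeout) as response:
        return ET.fromstring(response.read())


def chain_ids(root):
    """Collect the id attribute of every <chain> element in a parsed response."""
    return [chain.get("id") for chain in root.iter("chain")]


# Hypothetical response body; the real XML layout may differ.
SAMPLE_RESPONSE = """
<description>
  <entry id="XXXX">
    <polymer type="protein">
      <chain id="A"/>
      <chain id="B"/>
    </polymer>
  </entry>
</description>
"""

print(build_fetch_url("describeEntity", structureId="XXXX"))
print(chain_ids(ET.fromstring(SAMPLE_RESPONSE)))  # ['A', 'B']
```

Keeping URL construction, download, and parsing in separate functions allows the parsing logic to be checked against saved responses without a network connection.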

Please let us know at info@rcsb.org if there are website options that you think should be offered as a web service.


References

  1. UniProt Consortium. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res 40: D71-75.
  2. M. Punta, P. C. Coggill, R. Y. Eberhardt, J. Mistry, J. Tate, C. Boursnell, N. Pang, K. Forslund, G. Ceric, J. Clements, A. Heger, L. Holm, E. L. Sonnhammer, S. R. Eddy, A. Bateman, R. D. Finn. (2012) The Pfam protein families database. Nucleic Acids Res 40: D290-301.
  3. K. Arnold, F. Kiefer, J. Kopp, J. N. Battey, M. Podvinec, J. D. Westbrook, H. M. Berman, L. Bordoli, T. Schwede. (2009) The Protein Model Portal. J Struct Funct Genomics 10: 1-8.
  4. Z. R. Yang, R. Thomson, P. McNeil, R. M. Esnouf. (2005) RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins. Bioinformatics 21: 3369-3376.
  5. S. Velankar, J. M. Dana, J. Jacobsen, G. van Ginkel, P. J. Gane, J. Luo, T. J. Oldfield, C. O'Donovan, M. J. Martin, G. J. Kleywegt. (2013) SIFTS: Structure Integration with Function, Taxonomy and Sequences resource. Nucleic Acids Res 41: D483-489.