Newsletter | Spring 2015 ⋅ Number 65

Data Query, Reporting, and Access

RCSB PDB is looking for Web Developers at the UC San Diego site. Descriptions are available on our Careers page; applications should be submitted online. To become part of our team, please contact Dr. Peter Rose (peter.rose@rcsb.org).

Month           Unique Visitors   Visits    Bandwidth
January 2015    306,978           708,766   2883.29 GB
February 2015   317,313           697,954   3299.35 GB
March 2015      354,060           826,464   3491.21 GB

Jmol rendering of biotin in entry 1STP as created using the ligand tab option at rcsb.org.

The RCSB Protein Data Bank: views of structural biology for basic and applied research and education (2015) Peter W. Rose, Andreas Prlić, Chunxiao Bi, Wolfgang F. Bluhm, Cole H. Christie, Shuchismita Dutta, Rachel Kramer Green, David S. Goodsell, John D. Westbrook, Jesse Woo, Jasmine Young, Christine Zardecki, Helen M. Berman, Philip E. Bourne, Stephen K. Burley Nucleic Acids Research 43: D345-356 doi:10.1093/nar/gku1214

In an update to publications in the Nucleic Acids Research Database Issue series, this paper describes how RCSB PDB has focused recent efforts on enabling a deeper understanding of structural biology and providing new structural views of biology that support both basic and applied research and education. It covers recently introduced data annotations, including integration with external biological resources such as gene and drug databases; new visualization tools; and improved support for the mobile web. It also describes access to data files, web services, and open access software components that enable software developers to mine the PDB archive and related annotations more effectively.

Figure 1a from Bliven et al., reprinted with permission from Oxford University Press. CE-CP alignment of the dynamin A GTPase domain (yellow and red, SCOP:d1jwyb_) and the YjeQ protein (blue and green, SCOP:d1u0la2). N- and C-termini are shown with blue and red spheres. Arrows indicate the positions of the circular permutation.

Detection of Circular Permutations within Protein Structures using CE-CP Spencer E. Bliven, Philip E. Bourne, Andreas Prlić (2014) Bioinformatics doi:10.1093/bioinformatics/btu823

This publication describes the Combinatorial Extension for Circular Permutations algorithm (CE-CP), which allows structural comparison of circularly permuted proteins. Pairwise alignments can be visualized either in a desktop application or on the web using Jmol, and can be exported to other programs in a variety of formats. CE-CP can be accessed through the RCSB PDB website. Source code is available on GitHub as part of BioJava.

BioJava is an open source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines, parsers for common file formats, reference implementations of popular algorithms, and support for manipulating sequences and 3D structures.

Through BioJava, RCSB PDB releases algorithms and file parsers used at rcsb.org, including algorithms used in the Protein Comparison Tool, some of the tracks of the Protein Feature View, and the algorithm for detecting symmetry in biological assemblies.

BioJava also contains a reference implementation for parsing and processing PDBx/mmCIF format data files.

A BioJava tutorial is available to help facilitate rapid application development for bioinformatics.

BioJava: an open-source framework for bioinformatics in 2012. Andreas Prlić, Andrew Yates, Spencer E. Bliven, Peter W. Rose, Julius Jacobsen, Peter V. Troshin, Mark Chapman, Jianjiong Gao, Chuan Hock Koh, Sylvain Foisy, Richard Holland, Gediminas Rimša, Michael L. Heuer, H. Brandstätter–Müller, Philip E. Bourne, Scooter Willis Bioinformatics (2012) 28: 2693-2695 doi:10.1093/bioinformatics/bts494


RCSB PDB Mobile is a universal app that enables the general public, researchers and scholars to search the PDB and visualize protein structures using mobile devices.

Freely available for the iPhone/iPod/iPad and Android (2.3.3 and above), RCSB PDB Mobile can be used to search the entire PDB database, view the latest released structures, access MyPDB accounts, and view the entire catalog of Molecule of the Month articles.

An app update has been released on the Apple App Store and Google Play. It includes the latest Molecule of the Month article and bug fixes for recent indexing issues.

Known issues will be listed on the RCSB PDB Mobile support page. Currently, the molecular viewer used in the app (NDKmol) cannot support large structures with more than 62 chains and/or 99,999 atoms, and cannot be used on Android 5.0 (Lollipop). The app will be updated as new versions are made available.
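As a rough illustration of the size limits quoted above, the check below tests whether a structure falls within them. It is a minimal sketch, not NDKmol's own code: the function name and constants are hypothetical, and the Android 5.0 restriction is not modeled.

```python
# Limits quoted in this newsletter for the viewer used in RCSB PDB Mobile.
# The names below are illustrative only; they are not part of NDKmol or the app.
MAX_CHAINS = 62
MAX_ATOMS = 99_999


def within_viewer_limits(chain_count: int, atom_count: int) -> bool:
    """Return True if a structure exceeds neither the chain nor the atom limit."""
    return chain_count <= MAX_CHAINS and atom_count <= MAX_ATOMS


if __name__ == "__main__":
    # Hypothetical structure sizes, chosen only to exercise both limits.
    for chains, atoms in [(4, 4_500), (62, 99_999), (63, 50_000), (10, 100_000)]:
        print(chains, atoms, within_viewer_limits(chains, atoms))
```

A structure is treated as too large when either limit is exceeded, which matches the "and/or" wording above.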

Effective April 10, 2015, the weekly public release of data from the PDB archive will be divided into two phases to better serve the needs of methods developers focused on protein structure prediction and protein-ligand docking. Going forward, these developer communities will have ~4 days each week during which they can make blind predictions of protein or nucleic acid structure from polymer sequence, and of ligand docking pose from polymer sequence and the InChI string of the bound ligand.

Phase I: Every Saturday by 3:00 UTC, for every new entry, the following will be provided from the wwPDB website: sequence(s) (amino acid or nucleotide) for each distinct polymer and, where appropriate, the InChI string(s) for each distinct ligand.

Phase II: Every Wednesday by 00:00 UTC, all new and modified data entries will be updated at each of the wwPDB FTP sites.

This change is being made with the advice and concurrence of the Advisory Committee to the Worldwide Protein Data Bank.

For more information, visit wwPDB News.
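The two weekly deadlines described above determine the length of the blind-prediction window. The sketch below works out that window for a given week, assuming the Phase I and Phase II deadlines (Saturday 03:00 UTC and Wednesday 00:00 UTC) are treated as the actual release times. The function names are hypothetical and are not part of any wwPDB service.

```python
from datetime import datetime, timedelta, timezone

# datetime.weekday() numbering: Monday is 0, so Wednesday is 2 and Saturday is 5.
WEDNESDAY, SATURDAY = 2, 5


def next_weekly(now: datetime, weekday: int, hour: int) -> datetime:
    """First occurrence of `weekday` at `hour`:00 that is at or after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate < now:
        candidate += timedelta(days=7)
    return candidate


def release_window(now: datetime):
    """Return (Phase I time, Phase II time, time between them)."""
    phase1 = next_weekly(now, SATURDAY, 3)      # sequences and InChI strings
    phase2 = next_weekly(phase1, WEDNESDAY, 0)  # full entries on the FTP sites
    return phase1, phase2, phase2 - phase1


if __name__ == "__main__":
    start = datetime(2015, 4, 10, 12, 0, tzinfo=timezone.utc)
    phase1, phase2, window = release_window(start)
    print(phase1.isoformat(), phase2.isoformat(), window)
```

Under these assumptions the window is 3 days and 21 hours (93 hours), consistent with the "~4 days" figure given above.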

A snapshot of the PDB archive (ftp://ftp.wwpdb.org) as of January 2, 2015 has been added to ftp://snapshots.wwpdb.org/. Snapshots have been archived annually since January 2005 to provide readily identifiable data sets for research on the PDB archive.

The directory 20150102 includes the 105,465 experimentally determined coordinate files and related experimental data that were available at that time. Coordinate data are available in PDB, mmCIF, and XML formats.

The date and time stamp of each file indicates the last time the file was modified. The snapshot is 438 GB.

The script at ftp://snapshots.wwpdb.org/rsyncSnapshots.sh may be used to make a local copy of a snapshot or sections of the snapshot.
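Snapshot directories are named for the snapshot date, as in 20150102 above. The sketch below derives the directory name and its top-level FTP location from a date. The helper names are hypothetical, the layout inside each directory is not described here and is not assumed, and the code does not check that a snapshot exists for a given date.

```python
from datetime import date

# Base location given in this newsletter.
SNAPSHOT_BASE = "ftp://snapshots.wwpdb.org/"


def snapshot_directory(snapshot_date: date) -> str:
    """Format a date as a YYYYMMDD directory name, e.g. 20150102."""
    return snapshot_date.strftime("%Y%m%d")


def snapshot_url(snapshot_date: date) -> str:
    """Top-level URL of the snapshot directory for the given date."""
    return SNAPSHOT_BASE + snapshot_directory(snapshot_date)


if __name__ == "__main__":
    print(snapshot_url(date(2015, 1, 2)))
```

For mirroring the data itself, the rsyncSnapshots.sh script mentioned above remains the documented route.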