PDB EDUCATION CORNER: A short history of visualizing structures in the PDB by Judith Voet, J. H. Hammons Professor, Swarthmore College
The Protein Data Bank is now 34 years old. The last Education Corner article highlighted a mural depicting some of the first protein structures to be determined. The study of protein structure and function is filled with such images, and the use of computer graphics software has become vital for their visualization and analysis. Before the advent of computer graphics, building a 3D model of a protein was such an intensive effort that it would often take a class the whole semester to complete. The use of computers sophisticated enough for such visualization and analysis by the general educational community is a relatively recent phenomenon, although it is now pervasive. Education Corner articles have described their use in courses ranging from high school to graduate school. But it wasn't always that way. To get an idea of just how recent this phenomenon is, I thought I'd reminisce a bit on my own introduction to the world of molecular graphics and the PDB.
I have been surrounded by computers for my whole professional life, but only succeeded in feeling comfortable with one when the Macintosh, with its miraculous mouse, arrived at Swarthmore College (and on my desk at home) in 1983. I was never able to remember the keyboard commands required for almost any action before the advent of the menu from which to select my choice. The first draft of the first edition of my textbook Biochemistry, co-authored with Donald Voet, was hand-written and then typed into a mainframe computer by an assistant using a dumb terminal (word processing programs had yet to be invented). When I got my first Macintosh, the whole world opened up for me. Well, almost. Writing by hand with its difficult revision process was now a thing of the past, but I was still unable to visualize macromolecules on my computer. The molecular graphics software and color monitors necessary for the visualization of a PDB file still only operated on specialized and expensive graphics computers. At this time, there were around 200 structures available. This small number of structures, combined with the difficulty and expense of obtaining a molecular graphics computer, kept the PDB out of the hands of the general education community. By 1990, when the first edition of Biochemistry was published, structure visualization still required more sophisticated and expensive computers than were available for purely educational purposes. If we wanted to use in the textbook a figure of a molecular structure that we had seen in a journal article, we most often obtained it from the author. I remember our excitement when we had one structure generated for us by a colleague with a molecular graphics computer.
In 1993, while working with Joel Sussman at the Weizmann Institute in Rehovot, Israel, I had access to a molecular graphics computer for the first time and began to see the power of the PDB, not just as a storage facility for 3D coordinates, but as an interactive tool for search and discovery. At about the same time, Michael Levitt created a piece of software, MacImdad®, that allowed PDB files to be displayed and manipulated on a Macintosh computer. As it happened, at that time Levitt had a joint appointment at Stanford University and at the Weizmann Institute. For most of the time I was in Israel, he was in the United States, and I was assigned his desk and his Mac. The connection was too much to be ignored. I became a disciple of MacImdad. In 1994, on my return from Israel, I obtained the program for Swarthmore College and installed it on several Macs in the biochemistry laboratory, and my students began to use computers to visualize protein structures and reaction mechanisms for the first time. Still, it was not easy to obtain the original 3D coordinates from the PDB. Believe it or not, there was not yet an easily accessible internet! Transfer of PDB files required access to an FTP site, still the province only of mainframe computers. The PDB would provide compact disks containing their data on request, but there was an energy barrier to obtaining this data. If you had a specific protein you wanted to study, most likely you wrote to the researcher who had determined the structure and asked him/her to send you the coordinates. MacImdad was wonderful because it came with all the data contained in the PDB at that time, in a compressed form accessible by the program.
Also in the same timeframe, Jane and David Richardson developed another Macintosh-friendly format for visualizing molecular structures: the Kinemage (short for Kinetic Image). Kinemages are displayed using the freely available program MAGE that is usable on most computer platforms. The Richardsons created many Kinemages that were extremely valuable for educational use in visualizing and understanding macromolecular structure and function. I availed myself of several Kinemages and of the MAGE program for use in the biochemistry laboratory.
Due to the generosity of the Richardsons, John Wiley & Sons produced and distributed a supplementary disk to accompany the second edition of Biochemistry (1995) that contained many Kinemages that we generated to help visualize and interpret the molecular structures contained in the text. The creation of each Kinemage required the use of PDB files of the 3D coordinates of the macromolecule under study.
A third addition to my accessible molecular visualization software list was RasMol®, developed by Roger Sayle at MDL® (www.mdl.com) and donated to the scientific community. By 1998, I was back in Joel Sussman's laboratory and working on Fundamentals of Biochemistry (D. Voet, J.G. Voet and C. Pratt), the internet had made its phenomenal appearance, and we found a great web-based tool for macromolecular visualization: MDL Chime. This plug-in was based on RasMol, but ran directly on a web browser. Eric Martz created a website and many support documents to guide educators in its use for the development of visualization exercises for students, and also developed the very powerful molecular visualization tool Protein Explorer (www.umass.edu/microbio/rasmol). At this writing, MDL is no longer supporting Chime, but a Java-based, platform-independent successor, Jmol (jmol.sourceforge.net), promises to be even better. Kinemages can now be manipulated directly on the Web using KiNG (kinemage.biochem.duke.edu).
The ability to visualize and manipulate macromolecules using relatively low-end computers is now pervasive and included in most biochemistry classes as a matter of course. Most textbooks come with ancillary materials that allow students to examine and manipulate molecules using resources available at the publishers' websites. This growth in visualization capability would not be nearly as useful without the parallel growth in the ease of use and availability of the PDB. Hand in hand, these tools and resources have developed a whole new way of teaching and learning about molecular structure that was only a dream 25 years ago.