IBM Visualization
A class of drugs that target HIV-1 reverse transcriptase, called non-nucleoside inhibitors, all bind to a specific pocket on the enzyme. Shown is a closeup of the binding of the drug TIBO-R86183 inside this pocket. The electrostatic characteristics of reverse transcriptase in the vicinity of the non-nucleoside inhibitor binding site are also shown. The electrostatic potential is mapped onto a solvent-accessible surface, along with some representative electric field lines.
Though it is a science that endeavors to explain phenomena in terms of fundamental entities too small to be seen by the naked eye, chemistry has become one of the most visually dependent sciences. Perhaps since the days when August Kekulé envisaged the molecular structure of benzene as a snake swallowing its tail, chemistry and its allied science and engineering disciplines have sought to represent the structures and properties of chemical systems by simplified models which are meaningful to the viewer yet retain some of the essential physics.
Today, no textbook, publication, or seminar is complete without the Corey-Pauling-Koltun (CPK) model, licorice model, ball-and-stick structure, Connolly surface, beta sheet, polypeptide ribbon, dumbbell p-orbital, ORTEP diagram, phase diagram, Walsh diagram, electron density contour, or reaction surface. In each case, visualization is the means by which significant information is conveyed, and in a manner that promotes comprehension.
As our understanding of the chemical sciences advances, we become even more reliant on visualization as a scientific tool. Scientific images are more than just pictorials; they are the embodiment of properly collected data. As such, exploratory scientific visualization takes its place beside spectrometry, synthesis, theory, and computation as a valid and useful tool in the chemical sciences.
This brief details some examples of exploratory and pedagogical visualizations in various areas of chemistry, as well as in the closely allied fields of materials science, molecular physics, biochemistry, and pharmacology. All the examples included in this brief are derived from research conducted at the IBM T.J. Watson Research Center and have been generated with the general purpose, data flow, visualization software from IBM -- Visualization Data Explorer.
The hierarchical nature of the data structure used by Data Explorer accommodates a wide range of chemical applications. Chemical data can be scalar, vector, tensor, quaternion, or string, and of surprisingly varied dimensionality. Each of these, however, is readily handled within Data Explorer. Representing an arbitrary dataset involves successive bundling. That is to say, numerical values are bundled together as objects, objects can be bundled together as fields, fields as groups, groups as "groups of groups", and so on. Of these structures, the field is generally the minimal structure required to create a visualization, and each of the objects of which it is composed constitutes a component of the field. Two ever-present component objects of fields are the data component and the positions component, where each datum value (of arbitrary tensorial rank) is associated with a unique position (of any dimensionality). The name positions is used in a generalized sense and does not necessarily refer to spatial coordinates. Rather, the relationship between data and positions is best characterized as a mathematical mapping,
    f: positions -------> data
For example, a surface of free energy, G(T,P), would have values of G as data and pairs of T and P values as positions.
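The mapping from positions to data can be sketched in a few lines of Python. This is only an illustration of the idea, not Data Explorer's own API: the `Field` class, its `as_mapping` method, and the `toy_free_energy` function are all hypothetical names, and the free energy used is the ideal-gas expression G(T,P) - G(T,P0) = RT ln(P/P0), chosen simply because it is easy to evaluate.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

R = 8.314  # gas constant in J/(mol K)


@dataclass
class Field:
    """Hypothetical stand-in for a field: a bundle of named components."""
    components: Dict[str, list]

    def as_mapping(self) -> Dict[Tuple[float, ...], float]:
        """Return the mapping f: positions -> data as a dictionary."""
        positions = self.components["positions"]
        data = self.components["data"]
        if len(positions) != len(data):
            raise ValueError("each datum needs exactly one position")
        return {tuple(p): d for p, d in zip(positions, data)}


def toy_free_energy(T: float, P: float, P0: float = 1.0) -> float:
    """Ideal-gas free energy relative to the reference pressure P0 (assumed model)."""
    return R * T * math.log(P / P0)


# Positions are (T, P) pairs, not spatial coordinates.
temperatures = [300.0, 400.0]   # K
pressures = [1.0, 2.0, 4.0]     # units of P0
positions = [(T, P) for T in temperatures for P in pressures]
data = [toy_free_energy(T, P) for T, P in positions]

surface = Field({"positions": positions, "data": data})
g = surface.as_mapping()
```

Looking up `g[(400.0, 2.0)]` then returns the value of G stored at that (T, P) position, which is the sense in which the positions component is generalized beyond spatial coordinates.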
Datasets are also classified based upon the organization of the positions component, specifically: regular grids, deformed regular grids, and irregular grids. Chemical datasets span the entire gamut, from completely regular datasets to wholly irregular datasets.
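The difference between the three classes comes down to how much of the positions component has to be stored explicitly. The Python sketch below illustrates this under simple assumptions; the function names and the sample values are hypothetical and are not taken from Data Explorer.

```python
from itertools import product


def regular_positions(origin, deltas, counts):
    """Regular grid: every position follows from origin + index * delta."""
    return [
        tuple(o + i * d for o, i, d in zip(origin, index, deltas))
        for index in product(*(range(n) for n in counts))
    ]


def deform(positions, warp):
    """Deformed regular grid: same implicit ordering, explicitly moved points."""
    return [tuple(warp(p)) for p in positions]


# Regular: an origin, a delta, and a count per axis describe all six positions.
regular = regular_positions(origin=(0.0, 0.0), deltas=(1.0, 0.5), counts=(2, 3))

# Deformed regular: the grid ordering is unchanged, but positions must be stored.
deformed = deform(regular, lambda p: (p[0] + 0.1 * p[1], p[1]))

# Irregular: positions and connections are both listed explicitly,
# as for a molecule (hypothetical three-atom example).
irregular = {
    "positions": [(0.0, 0.0), (0.96, 0.0), (-0.24, 0.93)],
    "connections": [(0, 1), (0, 2)],
}
```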
The three-dimensional representation of a molecule as a ball-and-stick structure is an example of what could otherwise be a problematic dataset: irregular data. Yet molecular structures lend themselves quite naturally to description with this data structure, with chemical bonds represented by the connections component, which otherwise is used to assign interpolative elements between positions. The description of the molecular structure of methane, for example, possibly containing additional information such as point charges, might take the following format:
    #Data Set for Methane
    #Definition of the field
    OBJECT "CH4" CLASS field
    COMPONENT "data" VALUE 1
    COMPONENT "positions" VALUE 2
    COMPONENT "connections" VALUE 3
    COMPONENT "charges" VALUE 4

    #Data Component: Atomic Numbers
    OBJECT 1 CLASS array TYPE float RANK 0 ITEMS 5 DATA FOLLOWS
    6.0 1.0 1.0 1.0 1.0

    #Positions Component: Coords
    OBJECT 2 CLASS array TYPE float RANK 1 SHAPE 3 ITEMS 5 DATA FOLLOWS
    -.00371   .00259  -.00473
    1.05193   .21342   .17034
    -.35787  -.71558   .73548
    -.12978  -.41363 -1.00361
    -.57703   .92500   .08118

    #Connections Component: Bonds
    OBJECT 3 CLASS array TYPE int RANK 1 SHAPE 2 ITEMS 4 DATA FOLLOWS
    0 1
    0 2
    0 3
    0 4

    #Data-Specific Comp: Pt Charges
    OBJECT 4 CLASS array TYPE float RANK 0 ITEMS 5 DATA FOLLOWS
    -0.2 0.2 0.2 0.2 0.2
    END
Using the atomic numbers as the data component, it is then possible within the visual program to index the atoms into maps of elemental colors and van der Waals radii so as to distinguish the atoms by color and size.
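The indexing step can be pictured as a pair of lookup tables keyed by atomic number. The Python sketch below is an illustration only and does not reproduce the visual program itself: the table names and the `style_atoms` function are hypothetical, the radii are the commonly tabulated Bondi van der Waals values in angstroms, and the RGB colors are an assumed rendering of the conventional CPK scheme.

```python
# Assumed lookup tables keyed by atomic number.
VDW_RADIUS = {1: 1.20, 6: 1.70, 7: 1.55, 8: 1.52}   # angstroms (Bondi values)
CPK_COLOR = {
    1: (1.0, 1.0, 1.0),   # hydrogen: white
    6: (0.2, 0.2, 0.2),   # carbon: dark grey
    7: (0.0, 0.0, 1.0),   # nitrogen: blue
    8: (1.0, 0.0, 0.0),   # oxygen: red
}


def style_atoms(atomic_numbers):
    """Map each atomic number (stored as a float datum) to a radius and color."""
    styled = []
    for z in atomic_numbers:
        key = int(round(z))  # the data component stores atomic numbers as floats
        styled.append({"Z": key, "radius": VDW_RADIUS[key], "color": CPK_COLOR[key]})
    return styled


# Data component of the methane field: one carbon followed by four hydrogens.
methane_data = [6.0, 1.0, 1.0, 1.0, 1.0]
styled = style_atoms(methane_data)
```

Each entry of `styled` then carries the size and color that a glyph for that atom would use.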
Having discussed Data Explorer's robustness for chemical applications, the remainder of this brief is devoted to exemplifying how each of the chemical disciplines can benefit from visualization.
Considerable effort goes into the determination of atomic positions in the elucidation of crystal structures. Since structure is often a telling indicator of physical and chemical properties, it is valuable to faithfully depict 3-D representations of crystal structures.
The image shown represents the results of a single-crystal X-ray diffraction structure determination of a novel material, a member of an entire class of conductors called "tunable perovskites." [1] It is an organic-based, layered, tin-iodide material which is electrically conducting within the prominent planes of octahedra (pink) and shares many structural similarities with high-temperature superconductors.
Polyhedral representations, such as the image above, are common in structural chemistry as they serve to elucidate the coordination around metal atoms and to highlight important global structural features. The octahedra were created using two user-defined "glyphs", which represented the two independent coordination environments of the tin atoms. The central layers were intentionally made partially transparent so as to emphasize the bonding arrangement of the iodine atoms (green) to the tin atoms (white).
The goal of materials science is to engineer materials for specific technological applications based upon an understanding of the microscopic behavior of condensed matter that gives rise to various physico-chemical properties, such as optical, rheologic, tribologic, electronic and magnetic properties. While single crystal materials continue to be relevant, research increasingly has turned its attention to less ordered systems such as doped crystals, liquid crystals, composites, amorphous materials, and granular materials. The image (right) is of a granular material and is derived from backscattered Kikuchi diffraction (BKD) data. It is an example of 2-D regular data with missing values. The BKD technique identifies single crystal grains among a population of grains in a sample, and provides indexing of the crystallographic orientation of grains which can be useful in the correlation of local microstructure with quality assessment measures. [2]
While it could be considered a part of materials research, surface science has developed into its own area of specialization. It is not surprising that the first images of individual atoms were obtained by surface scientists, given the extensive progress in the field, due largely to the refinement of ultrahigh vacuum methodology and the development of a large assortment of surface analysis techniques -- LEED, RHEED, EELS, SEXAFS, STM, AFM, SEM, PhD, to name a few. This progress has been driven in large measure by the lure of advanced technological surface applications such as electronic devices, data storage devices, heterogeneous catalysts, lubricants, and non-corrosive finishes.
The rich chemistry which occurs on surfaces arises because the chemical constituents that comprise the surface are comparatively less coordinated by neighbors than their counterparts in the interior bulk. The elucidation and manipulation of surface structures is an important component of surface research that is made all the more effective by 3-D visualization.
This is evident in images derived from Scanning Tunneling Microscopy (STM), a technique which has significantly enhanced our understanding and perception of surfaces. Invented by the 1986 Nobel laureates Gerd Binnig and Heinrich Rohrer of IBM Research, STM images surfaces by probing the local density of electronic states under a scanning metal tip.
The image (left) shows steps on a gold surface as imaged with STM. In this implementation of STM, the highest peaks correspond to the greatest overlap between the wavefunctions of the surface and those of the scanning tip, not necessarily to the locations of atoms. The raw data used to create the visualization are an example of two-dimensional, regular data that have been byte normalized. The other image (right) is a schematic of the electronic states of electrons confined between these steps. It illustrates the quantum confinement of the electrons across a step and the free conductivity down the length of the step, creating what has been called a "quantum wire." [3]
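Byte normalization of a two-dimensional regular dataset can be sketched as follows. The text does not say how the STM data were normalized, so this Python example assumes a plain linear rescaling of the minimum and maximum to 0 and 255; the function name and the sample heights are hypothetical.

```python
def byte_normalize(grid):
    """Linearly rescale a 2-D grid of values to integers in the range 0..255."""
    flat = [value for row in grid for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A perfectly flat scan carries no contrast; map everything to 0.
        return [[0 for _ in row] for row in grid]
    return [
        [int(round(255 * (value - lowest) / span)) for value in row]
        for row in grid
    ]


# Hypothetical tip heights on a 2 x 2 regular grid (arbitrary units).
heights = [[0.0, 1.0],
           [3.0, 5.0]]
bytes_2d = byte_normalize(heights)
```

After normalization only relative heights survive, so any absolute calibration has to be carried alongside the byte data if it is needed later.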
From the abstract space of all chemical compounds, both known and unknown, one seeks to find specifically shaped molecules (potential drugs) which optimally bind to the molecular surfaces, folds, or crevices of larger biological targets so as to produce some desired regulatory effect. This use of small molecule xenobiotics to control some pathochemistry of larger biomolecular systems is the goal of pharmaceutical drug design.
Visualization can be a valuable adjunct to the overall process of rational drug design. It can be used to flesh out correlations in statistical data, to monitor the progress of molecular data-mining queries, to give meaning to the structure-activity relationships in QSAR training sets, and to explore the nature of the physical interaction between drug and biological target.
Even if a potential drug has the correct shape for binding to a biomolecular target, it still requires the appropriate interactions to bring it to the target and hold it there. The image shown is a representation of the electrostatic behavior of a potential drug, using streamlines to depict the lines of force. A unique point in the charge distribution of the molecule, called the center of dipole (the pink sphere located eccentrically within the phenyl ring), is the appropriate origin to use to characterize this distribution [4]. By varying the spherical surface used to generate the streamlines, it is possible to explore the distance dependence of the electrostatic field. It can be seen that at 10 angstroms the interactions appear very nearly dipolar, whereas at distances of around 5 angstroms (not shown) the "near-field" streamlines would depart markedly from a dipole.
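The distance dependence can be illustrated numerically with a small set of point charges. The Python sketch below rests on several assumptions: the three charges are hypothetical and do not represent the drug molecule, the electrostatic potential is used as a simpler proxy for the field lines, the multipole expansion is taken about the coordinate origin rather than the center of dipole of [4], and the constant 1/(4πε0) is dropped. It compares the exact potential with its dipole term on spheres of radius 10 and 5 angstroms.

```python
import math

# Hypothetical neutral charge distribution: (charge in e, position in angstroms).
CHARGES = [
    (0.4, (1.2, 0.0, 0.0)),
    (-0.6, (0.0, 0.0, 0.0)),
    (0.2, (-0.8, 0.9, 0.0)),
]


def dipole_moment(charges):
    """Dipole moment p = sum of q_i * r_i."""
    return tuple(sum(q * pos[k] for q, pos in charges) for k in range(3))


def exact_potential(charges, point):
    """Sum of the Coulomb potentials q_i / |r - r_i| of all point charges."""
    return sum(q / math.dist(point, pos) for q, pos in charges)


def dipole_potential(p, point):
    """Dipole term of the multipole expansion: p . r / |r|^3."""
    r = math.hypot(*point)
    return sum(pk * xk for pk, xk in zip(p, point)) / r**3


def sphere_points(radius, n=200):
    """Roughly uniform sample points on a sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        rho = math.sqrt(1.0 - z * z)
        phi = golden_angle * k
        points.append((radius * rho * math.cos(phi),
                       radius * rho * math.sin(phi),
                       radius * z))
    return points


def dipole_mismatch(charges, radius):
    """RMS deviation from the dipole term, relative to the RMS exact potential."""
    p = dipole_moment(charges)
    points = sphere_points(radius)
    diff2 = sum((exact_potential(charges, x) - dipole_potential(p, x)) ** 2
                for x in points)
    ref2 = sum(exact_potential(charges, x) ** 2 for x in points)
    return math.sqrt(diff2 / ref2)


mismatch_10 = dipole_mismatch(CHARGES, 10.0)
mismatch_5 = dipole_mismatch(CHARGES, 5.0)
```

For this toy distribution the mismatch is larger on the 5 angstrom sphere than on the 10 angstrom sphere, in line with the near-field departure from dipolar behavior described above.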