Exploring the History of Cheminformatics


There are always new platforms, systems, and tools to keep up with in cheminformatics. On November 16, ACS hosted a webinar in which Dr. Wendy A. Warr looked at technological advances that have changed chemical research, as well as what the future might hold. Warr is a Chartered Chemist, a Fellow of the Royal Society of Chemistry, a Fellow of the Chartered Institute of Library and Information Professionals, and is active in the ACS Chemical Information Division.

Since all scientists build on the work of those who came before them, a look back at the history of cheminformatics is important, she said.

Cheminformatics & Big Data

Work on CAS REGISTRY began in 1965 under a three-year contract with the National Science Foundation. By 1969, there were more than 1 million chemical structures in the database, and it now holds 135 million unique structures. Numerous public databases, such as ChEMBL, MEDLINE, and PubChem, are also available for anyone to use, Warr said.

Big data is a big deal in cheminformatics today. For example, a screening library in a pharmaceutical company could contain 250,000 to 1 million compounds, and a corporate compound collection could have around 3 million. SureChEMBL has 16 million, Enamine REAL has 170 million, and GDB-17 boasts up to 164 billion molecules.

Cheminformatics & Hardware

In the 1960s and 1970s, mainframes were the way to go in computing, Warr said. These computers were big, taking up entire rooms. They were also expensive and generated a lot of heat. These were replaced by minicomputers, which were smaller and faster. The 1984 VAX 11/750 had a clock speed of 6 MHz, 2 MB memory, a 134 MB fixed disk, two 67 MB exchangeable disk drives, and shared peripherals.

The advent of microcomputers allowed for front-end online searching via STN Express, Warr said. This generation of computers was defined by windows, icons, menus, and pointers, or WIMP. The earliest example of a microcomputer was the IBM PC 5150, introduced in August 1981. It ran IBM BASIC and PC DOS, and had a 4.77 MHz central processing unit and 16 to 256 KB of memory.

Though not strictly hardware-related, chemists now also have the option of cloud computing, instead of hosting computational power in-house. Warr noted that the cloud allows smaller organizations that don’t have IT departments to compute using online tools.

Cheminformatics & Structure Editing

“Once your hardware gets better, or cloud computing appears, you can do better software,” said Warr. In the 1970s, chemists entered structures using the NIH/EPA Chemical Information System (CIS) Structure and Nomenclature Search System (SANSS) and CAS Online text structure entry. In the 1980s came chemical word processing, with software such as Wisconsin Interactive Molecular Processor, Molecular Presentation Graphics, Professional Structure Image Database On Microcomputers, and MDL’s Chemists’ Personal Software Series. ChemDraw was demonstrated at the ACS meeting in fall 1985. Today, ChemDoodle uses HTML5, which makes it fast and platform independent, she said.

Cheminformatics & Searching

Storing all this data created a need to search through it. The first structure search systems emerged in the 1960s, including Computerized Retrieval of Organic Structures Based on Wiswesser (CROSSBOW), which featured structure display. Other search tools included SANSS, HTSS, DARC, and CAS Online, which became available in 1980. Later came RS3, S4, CAS REGISTRY MVSS, SciFinder, SciFinder-n, and ChemAxon JChem.

We can draw some conclusions from looking at cheminformatics over the past 50 years, Warr suggested. Today it’s really not necessary to build your own system. It’s far easier and less expensive to buy one and customize it to suit your cheminformatics needs. She also pointed out that technology comes in waves, and new leaders emerge when a new wave comes in. And yet some leaders from the 1960s remain strong today, Warr said. Chemical Abstracts was a leader with CAS REGISTRY back in 1965, and then continued to adapt and lead with CAS Online, SciFinder, and today’s SciFinder-n, she said.
