This August, I was one of 22 scientists to attend the SciFinder® Future Leaders program. We spent a week visiting the Columbus, Ohio, offices of Chemical Abstracts Service (CAS), a division of the American Chemical Society (ACS), before heading to the 254th ACS National Meeting & Exposition in Washington, D.C., for another week of scholarly (and fun) activities.
This post is the first of a three-part series on my experiences with the SciFinder Future Leaders program. In part one, I will share with you what I learned while touring the databases of CAS.
Behind the Scenes: Archiving
With a few clicks on SciFinder or other CAS solutions, you can access a rich body of literature on synthesis methods, chemical structures, and patents. But have you ever wondered how this is achieved?
During the Future Leaders program, I learned that CAS receives ~17 million articles from different journals annually (a number that shocked a Vice President at CAS too!) but adds only about 1 million articles to its database each year. This is because many of the incoming items are duplicates of the same article, such as ‘Before Prints’ and ‘As Soon As Publishable’ versions.
To start off, incoming articles are sent to their respective departments, such as polymers and metallo-organics. Next, they are read quickly for novel chemical structures or synthesis pathways, which are then added to the database. With the rise of interdisciplinary research, an article may fit several categories at once. In this case, CAS staff quickly confer with one another to assign the article to the most suitable category.
As a former librarian, I was not surprised by this process. However, what puzzled me was that the entire cycle of archiving is still performed manually, despite advances in machine learning and data mining. I later learned that because journals and authors annotate chemical structures and names in different ways, manual archiving is necessary (at least until machines are intelligent enough to be on par with CAS staff). I was also told that authors, peer reviewers, or editors might miss something in a paper because of the pressure to publish quickly. This is where CAS staff can help by flagging the article for another round of edits.
Behind the Scenes: Data Storage
After the presentations on databases and archiving, we were brought into the control room for SciFinder and the many other CAS solutions. We were also shown CAS’s servers, the tremendously powerful computers that allow SciFinder users to access CAS data remotely. Because SciFinder is so popular, a few million dollars are committed annually to maintaining the computing infrastructure, excluding the cost of an efficient cooling system that accounts for 52% of the electricity bills. CAS also has its own backup generator in case of power failures. We were told that when hurricane-force winds struck Ohio a few years back, the remote access services provided by CAS were not affected at all. This is how committed the company is to serving its users.
I have always enjoyed visiting companies and seeing what they do behind the scenes to make things possible. What I appreciated most about visiting the Columbus office was how openly its staff shared their work, their thoughts, and the future directions of the company. I believe it is only through such a relationship that CAS can receive the most honest feedback on its products and services during beta-testing too.
The next time you click through any CAS solution, appreciate the painstaking efforts of the CAS staff who make it possible!
In my next article, I will share more about how the Future Leaders program educates its participants in soft skills, such as entrepreneurship, peer review, and profile marketing. Stay tuned!