Reflecting on Data at the ACS National Meeting

As Science Faculty Librarian at the University of Bath, U.K., I have built up a wealth of knowledge of chemical information sources and experience in supporting chemistry undergraduates and researchers.  Much of my knowledge has been gained through effective networking with my peers, academic chemists, and publishers.  This year, the ACS offered an unprecedented opportunity for chemistry librarians outside of the U.S. to travel to the 256th ACS National Meeting & Exposition in Boston, with a travel grant. There couldn’t be a better opportunity than this for networking and learning from my peers.  I feel very privileged to have been the first U.K. librarian to benefit from this generous support.

My main aim in attending the conference was to meet my fellow chemistry librarians, to discover which areas of work they are engaged in, to pick up helpful tips to make my practice more effective, and to identify any potential areas for collaboration.  I also hoped that the conference would help to bring me up to date in the rapidly developing field of open chemistry research data.

I was completely overwhelmed by the size of the event: I’m told there were 14,000 attendees!  The program was vast with numerous parallel sessions. Fortunately, the ACS provided a useful app which helped to identify and schedule sessions.  Each day was full, with an 8:30 AM start!  I decided to focus on attending the CINF (ACS Division of Chemical Information) sessions.

The first session I attended was “Chemistry librarians of the future.” Rather than dealing with the traditional aspects of the role such as collection management, teaching, and literature searching, there was a focus on newly evolving areas of work.  Nicolas Ruhs (Florida State University) has been looking for unique opportunities to contribute and has created a resource for discovering and citing laboratory equipment.  He’d also identified a gap in support for chemistry software, as I have, and we are both working on LibGuides for software. I look forward to comparing our results.

Judith Currano (University of Pennsylvania) spends much of her time teaching researchers, so she has the opportunity to engage with them closely.  Through these conversations, she identifies what she calls ‘pain points,’ and considers where she can provide support that isn’t otherwise available.  Through this, she has become involved in teaching research ethics.  But she gave a word of caution: When thinking about what you can offer, also think about what you should offer. It’s easy to become overstretched.  Mary Schlembach (University of Illinois) had an interesting suggestion for making good use of empty library space during the summer: she hosts the International Symposium on Molecular Spectroscopy.  The circulation desk becomes a registration desk, and she creates a ‘conversation pit’ and exhibition space.  The library also archives the presentations and mints DOIs for them.

The subsequent discussion focused on what we need for the future.  Everyone agreed it was more money and more staff, but we were unlikely to get either!  The best advice was to be strategic in how you spend your time. Sometimes you have to let things go to do something new.  It was also suggested that we could make more use of undergraduate or postgraduate students to support our work.

The second session I attended was on “Publishing chemical data.”  Over the past few years in the U.K. there has been a huge growth in library support for open access and research data management.  This is clearly also the case in the U.S., and I was extremely impressed with the level of engagement of the chemistry librarian community in the area of open chemistry data.  For example, Leah McEwen (Cornell University) is working with IUPAC on developing data standards.

In the U.K., research funders have been developing data policies, and in general, it’s expected that data which underpins a publication will be made available.  Much of this data will be placed in the publisher’s supporting information, in an institutional data repository, or in a generic repository such as FigShare.  Although this is a step in the right direction, there is a long way to go before we solve the problem of making experimental chemical data both reproducible and reusable.

ACS has been making progress towards ensuring that data published in their journals is reproducible. Angela Hunter (data analyst with Organic Letters) has been reviewing the Supporting Information (SI).  Among the problems encountered are missing pieces of data such as experimental procedures, data linked to the wrong compound, and data in the SI being inconsistent with that in the manuscript.  There is now an SI preparation checklist for authors and improved guidelines for reviewers.  Perhaps we have a role to play in making authors and reviewers aware of this guidance, as an example of current best practice, even if they are submitting work to other journals.  I aim to take this message back to our new postgraduate students: For your data to be reproducible, you need to archive your raw data, work on a copy, and document how you do your data analysis.

Certain sections of the community have been archiving and re-using data for many years. The Cambridge Crystallographic Data Centre is over 50 years old!  But there is a real need for sharing data in other areas, in particular, bioactivity data to facilitate drug discovery, and spectroscopic data to support the validation of compounds.

NMR data are essential for the correct identification of organic molecules. At present, most NMR data are published in tabular form in article text and edited to remove signals from impurities. Raw data are rarely available.  Christoph Steinbeck (Friedrich Schiller University) summed up what was needed as follows:

  • a minimum information standard
  • software for handling the submission of raw data and metadata
  • an interface which can process, search and visualize the data for users

All of these needs can be met relatively easily. The major obstacle is community buy-in. Although there would be long-term benefits for the community, there is no instant benefit to the depositor.  I think that chemistry librarians have a key role to play here, once the infrastructure is in place. We can use our influence to explain what needs to be done and why, and hopefully make it as simple as possible, perhaps even having some involvement in the deposition process.  Publishers could certainly help by mandating deposit of raw NMR data, but of course no one wants to be the first in case authors are put off by the extra work!

Many small data repositories have sprung up, and they are extremely valuable in helping to develop metadata standards and technical infrastructure.  But in my opinion the sheer number of projects discourages community buy-in: it’s too difficult to identify the ‘right place’ to deposit data.  Richard Kidd (Royal Society of Chemistry) illustrated this beautifully with a picture of the Cambrian period. There was a blossoming of new life, not all of which would evolve into successful forms for the future.  Sustainability (long-term funding) is an issue for data repositories.  I came away with a long list of websites relating to open chemical data, and I hope to raise awareness of some of these initiatives with my library colleagues and chemistry faculty back in the U.K.

I’d like to thank the ACS for this amazing opportunity and in particular Michael Qui for all the support and advice he gave to me and my fellow grant-awardees, Lieselot, Kelsey and Alyssa.  It was great to meet you all!  Boston was a wonderful city to visit, and I found it very welcoming, even though the major tourist attractions all celebrated the eviction of the English from the continent!

If you have comments or questions for the author of this post, please e-mail: Axial@acs.org.