Structure Elucidation of Unknown Metabolites (1) – Problem Description

In preparation for a talk about structure elucidation of unknown chemical compounds at AISBM in Paris in October, I am laying out a bit of the work that others as well as my group have done over the past 40 years in this area. The topic of this AISBM meeting is “Challenges and advances in the annotation and de novo identification of small molecules of biological origin”. I am going to address the problem of computer-assisted structure elucidation (CASE) of unknown compounds in organic chemistry in general and refer to the sub-problem of natural products and metabolites when needed.

Generally speaking, we are talking about the problem where you have evidence that there is a compound in a biological system or in your flask, but you don’t know its structure. By structure, I mean ideally the fully defined stereoisomer, but at least the fully defined constitutional isomer.

I have reviewed the problem of computer-assisted structure elucidation (CASE) a few times in the context of Natural Products Structure Elucidation (Steinbeck 2001, Steinbeck 2004).

Assume, for example, that you are working on a newly discovered medicinal plant and want to discover the compound or set of compounds responsible for the activity.

Willow Tree (Courtesy of Wikipedia). The bark contains Salicylic Acid and was used for a long time as a pain reliever and fever reliever.

How could one find out that the structure of the active ingredient is as follows?

Constitutional formula of salicylic acid (Courtesy of Wikipedia)

The evidence you’ve got could come from a chromatographic experiment where you have become interested in a particular peak that shows a biological activity.

HPLC chromatogram of a perfume mixture (courtesy of Wikipedia). Each of those peaks is at least one compound. What is the structure of the compound under the leftmost signal?

In our context, the evidence for determining the structure comes from spectroscopic methods: NMR and/or mass spectrometry (MS). To acquire these spectra, we need to isolate the compound using separation techniques such as HPLC, or use hyphenated techniques.
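To give a feeling for the kind of constraint such data provides, here is a minimal sketch, assuming that high-resolution MS has already given us the molecular formula of salicylic acid, C7H6O3. The function names are my own, hypothetical choices for illustration, and the element table is deliberately limited to C, H, N and O. The sketch computes the monoisotopic mass and the number of double bond equivalents (rings plus double bonds), two values that narrow down the candidate structures before any NMR data is considered.

```python
import re

# Monoisotopic masses (in u) of the most abundant isotopes.
# Only C, H, N and O are covered in this sketch.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}


def parse_formula(formula):
    """Turn a simple formula such as 'C7H6O3' into {'C': 7, 'H': 6, 'O': 3}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in parse_formula(formula).items())


def double_bond_equivalents(formula):
    """Rings plus double bonds for a neutral C/H/N/O formula: (2C + 2 + N - H) / 2."""
    c = parse_formula(formula)
    return (2 * c.get("C", 0) + 2 + c.get("N", 0) - c.get("H", 0)) / 2


if __name__ == "__main__":
    print(round(monoisotopic_mass("C7H6O3"), 4))
    print(double_bond_equivalents("C7H6O3"))
```

For salicylic acid, the five double bond equivalents are consistent with one benzene ring (four) plus one carbonyl group (one). Many other constitutional isomers share the same formula, which is exactly why further spectroscopic evidence is needed.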

In the next post, I will elaborate on some of the scenarios that we might face in structure elucidation.


Steinbeck, C. “The Automation of Natural Product Structure Elucidation.” Current Opinion in Drug Discovery and Development 4.3 (2001): 338–342. Print.

Steinbeck, C. “Recent Developments in Automated Structure Elucidation of Natural Products.” Natural Product Reports 21.4 (2004): 512–518. Print.

DIY Bookscanner Assembly

I have written about the arrival and packaging of my bookscanner kit in a previous post.

Here I document the assembly, which was really easy thanks to the excellent videos by Daniel Reetz on YouTube. I recommend watching them all before even starting to do anything with the kit. Regarding tools needed for the assembly: an electric drill and an electric screwdriver are really handy. You’ll need a Phillips and a hex tip screwdriver. A rubber hammer is useful, but a normal hammer and a piece of scrap wood will do. I am not going to repeat what Daniel has documented so perfectly but just show a series of pictures that document the assembly. It can essentially be done in an afternoon.

The Base

The Scanner Base

The Cradle Base

Lever Arms with Skateboard Bearings

Base with Cradle Base and Lever Mechanism assembled

Cradle (the part that holds the actual book)

Lighting Module with LED Light

The Imaging Module, which will hold the glass and the cameras

The whole BookScanner put together

DIY Bookscanner Kit arrived

I love books. Real books, made of paper, ideally hard cover and properly bound with needle and thread. And I have assembled a nice library over time.

On the other hand, due to peculiar choices of where to live and where to work, and due to the nature of my job, I essentially live on the road and in the air. That is incompatible with carrying around heavy, beautiful and sensitive books.

Over the years I have steadily reduced the weight of my backpack, and recent improvements in offline reading apps for the magazines I read (C’t, Make Magazine, Nature) mean that I can now read all of them in the air and offline on my iPad without carrying around the paper and its weight. At any given time, however, I am also reading a book from my tall FIFO stack of books. Buying them as ebooks is not an option for me, due to my love for physical books and concerns about DRM and loss of access.

Through my other passions, making, open source software and open hardware, I came across the fantastic DIYBookScanner project. After Daniel Reetz, a key figure of this movement, released the plans for an open hardware standard bookscanner kit, I knew it was time. Because of time constraints, I decided to buy pre-cut parts. I hope to build one from scratch soon at MakeSpace Cambridge.

But for now I would like to report on my progress with the book scanner kit. I decided to buy the base kit and get the cameras, foot pedal and USB hub from Amazon.

The kit arrived in a nice, light and compact package. The purchase includes video chat support for the assembly, which I didn’t need due to the extensive online documentation.

DIY Bookscanner parcel

Parts were each wrapped in shrink wrap.

I will report on putting this together in a following post. For now I would like to congratulate the folks behind the kit on their excellent product.

Bioinformatics PhD opening for UK national

Towards a better understanding of lipid metabolism through studies of Drosophila Lipidomics

Christoph Steinbeck, Julian Griffin, Steve Russell

Fruit fly (Drosophila melanogaster, male), Courtesy of Max Westby

This project will bring together Drosophila experts at the University of Cambridge (Prof. Steve Russell, Dept of Genetics and Cambridge Systems Biology Centre), the Lipidomics lab led by Julian Griffin at the MRC Unit for Human Nutrition Research/Biochemistry, University of Cambridge and the Computational Metabolomics Group led by Christoph Steinbeck at the European Bioinformatics Institute in Hinxton to extensively study the lipidomics of Drosophila. The student will grow up a number of Drosophila mutant models related to lipid metabolism, isolate fat pads and perform analysis using high resolution mass spectrometry. The data will be modelled within the Computational Metabolomics Group to identify differences between the mutants, and explore how the Drosophila lipidome is regulated.

For this BBSRC-funded project, only candidates with a UK passport are eligible. Please email if you are interested.

In Memory of Open Science Pioneer Jean-Claude Bradley

Jean-Claude Bradley receiving the Blue Obelisk award presented by Egon Willighagen in an improvised ceremony during the ACS meeting in Chicago 2007 (Chris Steinbeck behind the camera). Picture licensed CC-BY.

As announced today by Drexel University, Open Science pioneer and recipient of the Blue Obelisk award Jean-Claude Bradley died yesterday. He was a member of the editorial board of the Journal of Cheminformatics and Editor-in-Chief of the Chemistry Central Journal. But most importantly, he was an inspiring scientist and evangelist of the open science and open access movement.


Open PhD position in Steinbeck group (Cheminformatics and Metabolism)

Where we live: EBI Main Building on Genome Campus

We have an opening for a Ph.D. position in Chris Steinbeck’s cheminformatics and metabolism team at the European Bioinformatics Institute (EBI) in Cambridge, UK.

Ph.D. topics are available in a wide range of areas such as analysis of metabolomics experiments, metabolism, computational natural product biochemistry, elucidation of natural products, text mining and image processing, and more. Applications should be submitted through the EMBL International Ph.D. programme. Ph.D. students who successfully complete their projects will receive their Ph.D. from the University of Cambridge, UK.

The group leaders at the EBI recruiting in this round are Sarah Teichmann, Alex Bateman, John Overington and Chris Steinbeck. Candidates invited to the final assessment have the opportunity to consider all of the recruiting groups. The registration and submission deadlines are approaching quickly.


More personal

This blog has been mostly about my work so far and was labelled “A molecular-informatics weblog”. For a while, however, my group has also been running its own blog, and I will use this group blog from now on for postings about bioinformatics, cheminformatics, metabolism, etc.

So from now on, this blog will be about personal stuff and will be more diverse. I will make sure that the blog is no longer listed in, e.g., the chemical blog space.

Hope that some of you will still be interested in the posts 🙂

Two new thematic series in the Journal of Cheminformatics

As reported in a previous post, the Journal of Cheminformatics has received its first Impact Factor of 3.42. Editors-in-Chief David Wild (Indiana University, US) and Christoph Steinbeck (European Bioinformatics Institute, UK) commented: “Our aim when we started Journal of Cheminformatics in 2009, was to impact a wide audience with high quality, interesting and relevant cheminformatics research. Three years on, we are delighted at our progress, and we believe our Impact Factor of 3.42 (very high for a first Impact Factor for a new journal) demonstrates the difference our journal is making in the field of cheminformatics and beyond, into other related disciplines of science. Being Open Access, our papers can be read by a wide range of researchers, scientists in industry and independent practitioners.” We also believe that the papers in the following two new thematic series will receive considerable attention from our readers and will contribute to the growing success of the journal in the future.

Announcing Two New Themed Series of Papers, to Be Published in 2012

Semantic Physical Sciences – Fall 2012

Guest Editors:
Peter Murray-Rust, University of Cambridge
Henry Rzepa, Imperial College London
2012 Herman Skolnik Award winners


  • A series of papers arising from an invited workshop and symposium to investigate and formalize the use of semantics in physical sciences
  • Applying primary technologies based on chemical mark-up language (CML) and MathML to create fully semantic declarative scientific objects
  • Represents output from groups such as CSIRO, PNNL, STFC, Kitware, IUCr, the Blue Obelisk, and the Unilever Centre for Molecular Informatics

The InChI Project – Winter 2012

Guest Editor:
Antony Williams, Royal Society of Chemistry


  • A series of papers describing the applications and utility of the IUPAC International Chemical Identifier (InChI)
  • Reviews the need for a standard identifier in chemistry, the development of InChI, and its applications, limitations and future developments
  • Publications will report on the perspectives and research of academia, government labs, publishers, pharmaceutical companies and others

Journal of Cheminformatics has previously published themed series of papers covering:

Journal of Cheminformatics receives Impact Factor of 3.42

We just wanted to update you with some good news from the Journal of Cheminformatics. The Thomson / ISI 2011 Journal Impact Factors were just released, and the Journal of Cheminformatics received an Impact Factor of 3.42 – the first year we have been given an impact factor. This is very high for a new journal, and reflects our commitment to publishing interesting, important and high quality research with a wide scope of application in cheminformatics and related fields, with maximum accessibility. The impact factor compares very favourably with other much more established journals, both open access and subscription (e.g. BMC Bioinformatics = 2.75; PLoS ONE = 4.092).

Of course, Impact Factors are not everything, and there has been much discussion recently about article-level metrics being more informative than proprietary journal-level metrics. We’re thus pleased that the Journal of Cheminformatics is now supplying a variety of article-level statistics in the “about this article” section, including the increasingly discussed altmetric score.

So thanks to all of you who made the Journal of Cheminformatics such a success, and we look forward to a bright future for the journal!

David Wild & Chris Steinbeck
Editors-in-Chief, Journal of Cheminformatics