Open Position for Bioinformatician/Ontologist

We are seeking to recruit an experienced Bioinformatician/Ontologist to work on the eNanoMapper EU project. This project is building a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs) based on open standards, ontologies and an interoperable design to enable a more effective, integrated approach to European research in nanotechnology. You will join the Cheminformatics & Metabolism team at the European Bioinformatics Institute (EMBL-EBI) located at the Wellcome Genome Campus near Cambridge in the UK.

Our group is leading work package 2 of this project, which is developing an ontology for the full domain of nanosafety research, based on existing ontologies and using the standard Semantic Web ontology language OWL. You will work with the eNanoMapper partners on ontology software development and editing, addressing the requirements outlined by the consortium.

The EBI is part of the European Molecular Biology Laboratory (EMBL) and is a world-leading bioinformatics centre providing biological data to the scientific community, with expertise in data storage, analysis and representation. EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry.

Qualifications and Experience

You will be a capable software engineer, able to work comfortably in Java, and be familiar with the OWL language and associated APIs such as the OWL API. You furthermore need to be familiar with version control (GitHub) and continuous integration systems (Jenkins). You will also need to be able to use the Protégé ontology editing tool to create ontology content according to user requirements.

In addition to your technical expertise some familiarity with the biomedical domain is required; domain expertise in an area closely related to nanosafety would be a bonus.

You will also have experience with:
• Maven
• Batch/Script/Shell programming

You will have a strong interest in ontologies, solid technical skills, and an outgoing, collaborative personality. You must be able to work as part of a team but at the same time be self-driven, trustworthy and able to make progress towards objectives independently. You must also be meticulous and careful, with good communication skills, and willing to travel to European project meetings.


EMBL is an inclusive, equal opportunity employer offering attractive conditions and benefits appropriate to an international research organisation. The remuneration package comprises a competitive salary, a comprehensive pension scheme and health insurance, educational and other family related benefits where applicable, as well as financial support for relocation and installation.

We provide a dynamic, international working environment and have close ties with both the University of Cambridge and the Wellcome Trust Sanger Institute.
EMBL-EBI staff also enjoy excellent sports facilities, a free shuttle bus to Cambridge and other nearby centres, an active sports and social club and an attractive working environment set in 55 acres of parkland.
The initial contract is for a period of 1 year and 4 months with the possibility of a fixed-term extension.

Application Instructions

We welcome applications irrespective of gender and appointment will be based on merit alone. Applications are welcome from all nationalities – visa information will be discussed in more depth with applicants selected for interview.
Please apply online through

Open Positions in PhenoMeNal project (Metabolomics e-Infrastructure)

European Bioinformatics Institute – South Building

We are seeking to recruit a number of experienced people for our PhenoMeNal H2020 grant, to create an international e-infrastructure for large-scale computing and analysis of medical metabolomics data. You will join the Cheminformatics and Metabolism team at the European Bioinformatics Institute (EMBL-EBI) located at the Wellcome Genome Campus near Cambridge in the UK.

We are looking for a full-time project manager, a senior bioinformatician and a software engineer.

This is a project with a budget of €8M, distributed over 13 partners with about 830 person-months / 65 deliverables scheduled over a 3-year period. The PhenoMeNal project will develop and deploy an integrated, secure, permanent, on-demand service-driven, privacy-compliant and sustainable e-infrastructure for the processing, analysis and information-mining of the massive amount of medical molecular phenotyping and genotyping data that will be generated by metabolomics applications now entering research and clinic.

The EBI is part of the European Molecular Biology Laboratory (EMBL) and is a world-leading bioinformatics centre providing biological data to the scientific community, with expertise in data storage, analysis and representation. EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry.

Please apply via the EMBL online recruiting system following the individual links for the full-time project manager, the senior bioinformatician and the software engineer.


Job offer: Bioinformatician in Metaspace project for untargeted spatial metabolomics

We are looking to recruit a talented bioinformatician to work within the MetaboLights team at the European Bioinformatics Institute (EMBL-EBI) located on the Wellcome Trust Genome Campus near Cambridge in the UK. You will work closely with a consortium of 8 European partners on the METASPACE project.
METASPACE will enable untargeted spatial metabolomics for translational research and clinical applications by providing novel bioinformatics tools, and will demonstrate their potential using several case studies relating to personalised health, precision medicine and quality of life in chronic afflictions.
You will work with other bioinformaticians, domain experts and software engineers on the development of novel database-driven spectral and spatial algorithms, machine learning approaches for multiple-mass fingerprinting, the development of web services and more. You will also help coordinate outreach and training, both online and face-to-face, in collaboration with EMBL-EBI’s professional outreach and training team.
The EBI is part of the European Molecular Biology Laboratory (EMBL), Europe’s flagship laboratory for the life sciences, and is a world-leading bioinformatics centre providing biological data to the scientific community, with expertise in data storage, analysis and representation. EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry.
Please submit your application via the EMBL job site.

nmrML: A vendor-neutral open exchange format for NMR-based metabolomics

Metabolomics is a growing field and the number of organisms being studied is constantly increasing as is the number of metabolites being discovered. With this growth comes a steady increase in the amount of metabolomics data and the need to ensure that we capture this information in persistent open formats and databases.

The COSMOS project (COordination Of Standards In MetabOlomicS) has been created to improve the adoption of open standards for metabolomics data, annotation with agreed metadata, and support by open source data management and capturing tools. COSMOS delivers an ecosystem of formats, tools, and resources such as MetaboLights, a database for capturing information obtained in metabolomics experiments.

In a spotlight in the December 2014 issue of MetaboNews we introduce recent developments in nmrML, a vendor-neutral open exchange and storage format to describe NMR-based metabolomics data.

Structure Elucidation of Unknown Metabolites (2) – De Novo vs Look-Up

There has always been a bit of confusion in the terminology around the subject of computer-assisted structure elucidation (CASE), so let’s define some terms:

  1. Structure Elucidation – Determining the structure of truly unknown compounds de novo from spectroscopic (or earlier by hard core chemistry :)) experiments
  2. Structure Identification – Recovering the structure of already known compounds from databases or printed reference material (including experimental sections of the primary literature)
  3. Dereplication: Same as 2.

The more we discover, the more likely it will be that we are able to de-replicate by database lookup. This of course requires well curated and developed open access databases that cover many chemical compounds/metabolites.

In organic chemistry, spectroscopic databases for structure identification were published quite early, albeit as closed-access, commercial systems. The most widely used example is probably the SpecInfo database, which now seems to be marketed by Wiley, followed by the more recently (considering the 40-year horizon of the topic :)) published ACD/Labs spectral libraries and management system. Wolfgang Robien in Vienna has been developing NMR spectral databases and prediction tools for a long time.

The general way of searching in such databases would be to measure an NMR spectrum of your isolated unknown compound, perform a peak picking and search the database using this peak picking (a feature vector, if you wish).
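The peak-list search described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how SpecInfo, ACD/Labs or NMRShiftDB actually rank their hits: the function names, the greedy one-to-one matching, the 1.0 ppm tolerance and the two toy entries (with made-up shift values) are all hypothetical.

```python
from typing import Dict, List, Tuple


def match_score(query: List[float], reference: List[float],
                tolerance: float = 1.0) -> float:
    """Fraction of query peaks that find an unused reference peak
    within `tolerance` ppm (greedy one-to-one matching, an assumption)."""
    if not query:
        return 0.0
    remaining = sorted(reference)
    matched = 0
    for shift in sorted(query):
        # Closest reference peak not yet assigned to another query peak.
        best = min(remaining, key=lambda r: abs(r - shift), default=None)
        if best is not None and abs(best - shift) <= tolerance:
            matched += 1
            remaining.remove(best)
    return matched / len(query)


def search(query: List[float], database: Dict[str, List[float]],
           tolerance: float = 1.0) -> List[Tuple[str, float]]:
    """Rank every database entry by its match score, best first."""
    scores = [(name, match_score(query, peaks, tolerance))
              for name, peaks in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical entries with made-up carbon-13 shifts (ppm), for illustration only.
toy_database = {
    "entry_A": [20.0, 35.0, 70.0, 110.0, 150.0],
    "entry_B": [15.0, 25.0, 60.0, 128.0, 170.0],
}
toy_query = [20.3, 34.6, 70.2, 110.5, 149.1]

ranked = search(toy_query, toy_database)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

A real system would need a more careful similarity measure (for instance one that tolerates missing or extra peaks and weights deviations), but the principle of comparing a picked peak list against stored peak lists is the same.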

In the early 2000s, Stephan Kuhn in my group developed the NMRShiftDB database, which was the first open access, open source, open submission, web-based NMR database, and where you can now test how this all works without running into paywalls. Stephan has left the lab and now runs version 2 of this database in collaboration with the NMR lab at the department of chemistry at the University of Cologne.

One caveat: It is much easier to search for carbon-13 NMR spectra or mass spectra than for proton NMR spectra. The latter has rarely been addressed, not least because of the lack of full-spectrum proton data to which you could match a real-life proton spectrum. Peak-picking proton NMR spectra is often problematic due to overlap and complex coupling patterns.

Take for example the carbon-13 spectrum of pinocarveol from the metabolomics section of BioMagResBank (BMRB). Using your NMR software’s peak picking method, you would end up with this list of NMR signals. If you have a decent browser, such as Firefox, you can use the CMD (on Mac) or CTRL key and select the chemical shifts in the table linked above. If not, here they are:






Structure Elucidation of Unknown Metabolites (1) – Problem Description

In preparation for a talk about structure elucidation of unknown chemical compounds at AISBM in Paris in October, I am laying out a bit of the work that others as well as my group have done over the past 40 years in this area. The topic of this AISBM meeting is “Challenges and advances in the annotation and de novo identification of small molecules of biological origin”. I am going to address the problem of computer-assisted structure elucidation (CASE) of unknown compounds in organic chemistry in general and refer to the sub-problem of natural products and metabolites when needed.

Generally speaking, we are talking about the problem where you have evidence that there is a compound in a biological system or your flask but you don’t know its structure. By structure, I mean ideally the fully defined stereoisomer but at least the fully defined constitutional isomer.

I have reviewed the problem of computer-assisted structure elucidation (CASE) a few times in the context of Natural Products Structure Elucidation (Steinbeck 2001, Steinbeck 2004).

Assume, for example, that you are working on a newly discovered medicinal plant and want to discover the compound or set of compounds responsible for the activity.

Willow Tree (Courtesy of Wikipedia). The bark contains Salicylic Acid and was used for a long time as a pain reliever and fever reliever.

How could one find out that the structure of the active ingredient is as follows?

Constitutional formula of salicylic acid (Courtesy of Wikipedia)

The evidence you’ve got could come from a chromatographic experiment where you have become interested in a particular peak that shows a biological activity.

HPLC chromatogram of a perfume mixture (courtesy of Wikipedia). Each of those peaks is at least one compound. What is the structure of the compound under the leftmost signal?

In our context, the information for determining the structure comes from spectroscopy: NMR and/or mass spectrometry (MS). In order to obtain it, we need to isolate the compound using separation techniques such as HPLC, or use hyphenated techniques.

In the next post, I will elaborate on some of the case scenarios that we might be facing in structure elucidation.


Steinbeck, C. “The Automation of Natural Product Structure Elucidation.” Current Opinion in Drug Discovery and Development 4.3 (2001): 338–342. Print.

Steinbeck, C. “Recent Developments in Automated Structure Elucidation of Natural Products.” Natural Product Reports 21.4 (2004): 512–518. Print.

DIY Bookscanner Assembly

I have written about the arrival and packaging of my bookscanner kit from in a previous post.

Here I document the assembly, which was really easy thanks to the excellent videos by Daniel Reetz on YouTube. I recommend watching them all before even starting to do anything with the kit. Regarding tools needed for the assembly: An electric drill and an electric screwdriver are really handy. You’ll need a Phillips and a hex tip screwdriver. A rubber hammer is useful but a normal hammer and a piece of scrap wood will do. I am not going to repeat what Daniel has documented so perfectly but just show a series of pictures that document the assembly. This is essentially done in an afternoon.

The Base

The Scanner Base










The cradle base












Lever Arms with Skateboard Bearings













Base with Cradle Base and Lever Mechanism assembled














Cradle (the part that holds the actual book)











Lighting Module with LED Light











The Imaging Module, which will hold the glass and the cameras












The whole BookScanner put together







DIY Bookscanner Kit arrived

I love books. Real books, made of paper, ideally hard cover and properly bound with needle and thread. And I have assembled a nice library over time.

On the other hand, due to peculiar choices of where to live and where to work, and due to the nature of my job, I essentially live on the road and in the air. That is incompatible with carrying around heavy, beautiful and sensitive books.

Over the years I have steadily reduced the weight of my backpack, and recent improvements in offline reading apps for the magazines I read (C’t, Make Magazine, Nature) mean that I can now read all of them in the air and offline on my iPad without carrying around the paper and its weight. At any given time, however, I am also reading a book from my tall FIFO stack of books. Buying them as ebooks is not an option for me, due to my love for physical books and other concerns about DRM and loss of access.

Due to my other passions, making, open source software and open hardware, I came across the fantastic DIYBookScanner project. After Daniel Reetz, a key figure of this movement, released the plans for an open hardware standard bookscanner kit, I knew it was time. Because of time constraints, I decided to buy pre-cut parts from I hope to build one from scratch soon at MakeSpace Cambridge.

But for now I would like to report on my progress with the book scanner kit from I decided to buy the base kit and get the cameras, foot pedal and USB hub from Amazon.

The kit arrived in a nice, light and compact package. The purchase includes video chat support for the assembly, which I didn’t need due to the extensive online documentation.

DIYBookscanner parcel








Parts were each wrapped in shrink wrap.





















I will report on putting this together in a following post. For now I would like to congratulate the folks at for their excellent product.

Bioinformatics PhD opening for UK national

Towards a better understanding of lipid metabolism through studies of Drosophila Lipidomics

Christoph Steinbeck, Julian Griffin, Steve Russell

Fruit fly (Drosophila melanogaster, male), Courtesy of Max Westby

This project will bring together Drosophila experts at the University of Cambridge (Prof. Steve Russell, Dept of Genetics and Cambridge Systems Biology Centre), the Lipidomics lab led by Julian Griffin at the MRC Unit for Human Nutrition Research/Biochemistry, University of Cambridge and the Computational Metabolomics Group led by Christoph Steinbeck at the European Bioinformatics Institute in Hinxton to extensively study the lipidomics of Drosophila. The student will grow up a number of Drosophila mutant models related to lipid metabolism, isolate fat pads and perform analysis using high resolution mass spectrometry. The data will be modelled within the Computational Metabolomics Group to identify differences between the mutants, and explore how the Drosophila lipidome is regulated.

For this BBSRC-funded project, only candidates with a UK passport are eligible. Please email if you are interested.