Postdoc and PhD positions in Cheminformatics at Jena University, Germany

One postdoc position and three PhD positions are available in my newly founded research group at Jena University, Germany.

I am currently moving from my previous position as Head of Cheminformatics and Metabolism at the European Bioinformatics Institute (EBI) to the Institute for Analytical Chemistry as Professor for Analytical Chemistry, Cheminformatics and Chemometrics at Jena University. The successful candidates will help form the nucleus of the new research group and work in an exciting network of local and international collaborations, such as the PhenoMeNal project funded by the European Commission in its Horizon 2020 framework programme.

Open Positions:

  1. Postdoc: We are looking for a talented cheminformatician, bioinformatician or someone with comparable skills to work on the development of cloud-based methods for computational metabolomics. The successful candidate will work closely with the H2020 e-infrastructure project PhenoMeNal, a European consortium of 14 partners. This position requires excellent skills in at least one modern, object-oriented programming language. A strong interest in metabolomics and cloud computing as well as the ability to work in a distributed team will be advantageous. The postdoc will also have the opportunity to participate in the day-to-day management of the group as well as in the organisation of seminars and practical courses for our students.
  2. PhD student, biomedical information mining: In this PhD project the candidate will combine methods of text mining, image mining and cheminformatics to extract information about metabolites and natural products from the published primary literature. This includes opportunities to work with the OpenMinTed consortium, where we have been leading the biomedical use case for the last 1.5 years, as well as with the ContentMine team.
  3. PhD student, cheminformatic prediction of natural product structures: Depending on the skills and interests of the successful candidate, this project can approach the structure prediction of natural products and metabolites from either of two sides: starting from the spectroscopic information available about an unknown natural product, or starting from the genome of a natural-product-producing organism. Two positions are available in this area.

All PhD positions require a strong interest in molecular informatics and current IT technologies, programming skills in a modern object-oriented programming language and the ability to work in geographically distributed teams.

Please send applications in PDF format by email. We will accept applications until the positions are filled.

Background information:

The Friedrich Schiller University Jena (FSU Jena), founded in 1558, is one of the oldest universities in Europe and a member of the Coimbra Group, a network of prestigious, traditional European universities. The University of Jena has a distinguished record of innovations and resulting educational strengths in major fields such as optics, photonics and optical technologies, innovative materials and related technologies, dynamics of complex biological systems and humans in changing social environments. It has more than 18,000 students. The university’s friendly and stimulating atmosphere and state-of-the-art facilities boost academic careers and enable excellence in learning, teaching and research. Assistance with proposing and inaugurating new research projects and with establishing public-private partnerships is considered a crucial point.

About Christoph Steinbeck


Postdoc in Bioinformatics for Anti-Obesity Strategy

Courtesy of Frank Genten

The Steinbeck Group at the European Bioinformatics Institute (EMBL-EBI, Cambridge, UK), together with the lab of Tony Vidal-Puig (U. Cambridge/WT Sanger), is excited to announce a joint opening for a postdoc position to work on a multi-omics project to identify new players in human brown and beige adipocyte recruitment as an anti-obesity strategy. The project involves the analysis of multi-omics data sets (metabolomics, transcriptomics, proteomics, among others) and the integration of that data into different mathematical modelling frameworks (discrete logical models, kinetic ODE-based, FBA). With these models and data, the fellow will identify novel pharmaceutical strategies to induce BAT generation/WAT browning. The models will be used to evaluate in silico the potential effect of drugs on adipocytes. Finally, the best candidate molecules will be applied to human pluripotent stem cell models to confirm their capacity to induce brown/beige adipogenesis in vitro. Experiments will be performed with the support of experts at WTSI.

Experience in applying mathematical modelling techniques is desirable, as is previous exposure to large data sets. The candidate is of course not expected to have expertise in all of the areas listed above; training will be given in those where the applicant has less experience.

The EMBL-EBI is part of the European Molecular Biology Laboratory (EMBL) and it is a world-leading bioinformatics centre providing biological data to the scientific community with expertise in data storage, analysis and representation. EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in the academic and industrial sectors.

EMBL-EBI and Wellcome Trust Sanger Institute share the Wellcome Genome Campus. This proximity fosters close collaborations and contributes to an international and vibrant campus environment. Researchers are supported by easy access to scientific expertise, well-equipped facilities and an active seminar programme.

The EMBL-EBI–Sanger Postdoctoral (ESPOD) Programme builds on the strong collaborative relationship between the two institutes, offering projects which combine experimental (wet lab) and computational approaches.

Please apply here:

Internship in Bioinformatics/Cheminformatics

We are looking for a candidate for an internship/trainee position in bioinformatics/cheminformatics at the European Bioinformatics Institute (EMBL-EBI) to work on a high-performance generator for chemical structures. This position requires strong programming skills in Java, a reasonable working knowledge of chemical structures and graph theory as well as an interest in learning about Apache Hadoop and related technologies.
The initial contract will be for 6 months with a monthly internship salary of £800.

Please send your application to

Open Position for Bioinformatician/Ontologist

We are seeking to recruit an experienced Bioinformatician/Ontologist to work on the eNanoMapper EU project. This project is building a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs) based on open standards, ontologies and an interoperable design to enable a more effective, integrated approach to European research in nanotechnology. You will join the Cheminformatics & Metabolism team at the European Bioinformatics Institute (EMBL-EBI) located at the Wellcome Genome Campus near Cambridge in the UK.

Our group is leading work package 2 of this project, which is developing an ontology for the full domain of nanosafety research, based on existing ontologies and using the standard Semantic Web ontology language OWL. You will work with the eNanoMapper partners on ontology software development and editing, addressing the requirements outlined by the consortium.

The EBI is part of the European Molecular Biology Laboratory (EMBL) and it is a world-leading bioinformatics centre providing biological data to the scientific community with expertise in data storage, analysis and representation. EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry.

Qualifications and Experience

You will be a capable software engineer, able to work comfortably in Java, and be familiar with the OWL language and associated APIs such as the OWL API. You furthermore need to be familiar with version control (GitHub) and continuous integration systems (Jenkins). You will also need to be able to use the Protégé ontology editing tool to create ontology content according to user requirements.

In addition to your technical expertise some familiarity with the biomedical domain is required; domain expertise in an area closely related to nanosafety would be a bonus.

You will also have experience with:
• Maven
• Batch/Script/Shell programming

You will have a strong interest in ontologies, solid technical skills, and an outgoing, collaborative personality. You must be able to work as part of a team but at the same time be self-driven, trustworthy and able to make progress towards objectives independently. You must also be meticulous and careful, with good communication skills, and willing to travel to European project meetings.


EMBL is an inclusive, equal opportunity employer offering attractive conditions and benefits appropriate to an international research organisation. The remuneration package comprises a competitive salary, a comprehensive pension scheme and health insurance, educational and other family related benefits where applicable, as well as financial support for relocation and installation.

We provide a dynamic, international working environment and have close ties with both the University of Cambridge and the Wellcome Trust Sanger Institute.
EMBL-EBI staff also enjoy excellent sports facilities, a free shuttle bus to Cambridge and other nearby centres, an active sports and social club and an attractive working environment set in 55 acres of parkland.
The initial contract is for a period of 1 year and 4 months with the possibility of a fixed-term extension.

Application Instructions

We welcome applications irrespective of gender and appointment will be based on merit alone. Applications are welcome from all nationalities – visa information will be discussed in more depth with applicants selected for interview.
Please apply online through

Open Positions in PhenoMeNal project (Metabolomics e-Infrastructure)

European Bioinformatics Institute – South Building

We are seeking to recruit a number of experienced people for our PhenoMeNal H2020 grant, to create an international e-infrastructure for large-scale computing and analysis of medical metabolomics data. You will join the Cheminformatics and Metabolism team at the European Bioinformatics Institute (EMBL-EBI) located at the Wellcome Genome Campus near Cambridge in the UK.

We are looking for a full-time project manager, a senior bioinformatician and a software engineer.

This is a project with a budget of €8M, distributed over 13 partners with about 830 person-months / 65 deliverables scheduled over a 3-year period. The PhenoMeNal project will develop and deploy an integrated, secure, permanent, on-demand, service-driven, privacy-compliant and sustainable e-infrastructure for the processing, analysis and information-mining of the massive amount of medical molecular phenotyping and genotyping data that will be generated by metabolomics applications now entering research and the clinic.

The EBI is part of the European Molecular Biology Laboratory (EMBL) and it is a world-leading bioinformatics centre providing biological data to the scientific community with expertise in data storage, analysis and representation. EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry.

Please apply via the EMBL online recruiting system following the individual links for the full-time project manager, the senior bioinformatician and the software engineer.


Job offer: Bioinformatician in Metaspace project for untargeted spatial metabolomics

We are looking to recruit a talented bioinformatician to work within the MetaboLights team at the European Bioinformatics Institute (EMBL-EBI) located on the Wellcome Trust Genome Campus near Cambridge in the UK. You will work closely with a consortium of 8 European partners on the METASPACE project.
METASPACE will enable untargeted spatial metabolomics for translational research and clinical applications by providing novel bioinformatics tools, and will demonstrate their potential using several case studies relating to personalised health, precision medicine and quality of life in chronic afflictions.
You will work with other bioinformaticians, domain experts and software engineers on the development of novel database-driven spectral and spatial algorithms, machine learning approaches for multiple-mass fingerprinting, development of web services and more. You will also help coordinate outreach and training, both online and face-to-face, in collaboration with EMBL-EBI’s professional outreach and training team.
The EBI is part of the European Molecular Biology Laboratory (EMBL) and it is a world-leading bioinformatics centre providing biological data to the scientific community with expertise in data storage, analysis and representation. EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry. We are part of EMBL, Europe’s flagship laboratory for the life sciences.
Please submit your application via the EMBL job site.

nmrML: A vendor-neutral open exchange format for NMR-based metabolomics

Metabolomics is a growing field and the number of organisms being studied is constantly increasing as is the number of metabolites being discovered. With this growth comes a steady increase in the amount of metabolomics data and the need to ensure that we capture this information in persistent open formats and databases.

The COSMOS project (COordination Of Standards In MetabOlomicS) has been created to improve the adoption of open standards for metabolomics data, annotation with agreed metadata, and support by open source data management and capturing tools. COSMOS delivers an ecosystem of formats, tools, and resources such as MetaboLights, a database for capturing information obtained in metabolomics experiments.

In a spotlight in the December 2014 issue of MetaboNews we introduce recent developments in nmrML, a vendor-neutral open exchange and storage format to describe NMR-based metabolomics data.

Structure Elucidation of Unknown Metabolites (2) – De Novo vs Look-Up

There has always been a bit of confusion in the terminology around the subject of computer-assisted structure elucidation (CASE), so let’s define some terms:

  1. Structure Elucidation – Determining the structure of truly unknown compounds de novo from spectroscopic (or earlier by hard core chemistry :)) experiments
  2. Structure Identification – Recovering the structure of already known compounds from databases or printed reference material (including experimental sections of the primary literature)
  3. Dereplication – Same as 2.

The more we discover, the more likely it is that we will be able to dereplicate by database lookup. This of course requires well-curated and well-developed open access databases that cover many chemical compounds/metabolites.

In organic chemistry, spectroscopic databases for structure identification were published quite early, albeit as closed-access, commercial systems. The most widely used example is probably the SpecInfo database, which now seems to be marketed by Wiley, along with the more recently (considering the 40-year horizon of the topic :)) published ACD/Labs spectral libraries and management system. Wolfgang Robien in Vienna has been developing NMR spectral databases and prediction tools for a long time.

The general way of searching in such databases would be to measure an NMR spectrum of your isolated unknown compound, perform peak picking and search the database using the resulting peak list (a feature vector, if you wish).
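To make the idea of searching with a peak list a bit more concrete, here is a minimal sketch in Python. It is not how SpecInfo, ACD/Labs or any other real system works internally; it simply assumes that each database entry is a list of carbon-13 shifts in ppm and scores an entry by the fraction of query peaks that find a partner within a tolerance. The compound names, the shift values and the 1.0 ppm tolerance are all made up for illustration.

```python
from typing import Dict, List, Tuple


def match_score(query: List[float], reference: List[float], tol: float = 1.0) -> float:
    """Fraction of query shifts (ppm) matched by an unused reference shift within tol."""
    remaining = sorted(reference)
    hits = 0
    for shift in sorted(query):
        # Greedy choice: the closest reference shift that has not been used yet.
        best = min(remaining, key=lambda r: abs(r - shift), default=None)
        if best is not None and abs(best - shift) <= tol:
            hits += 1
            remaining.remove(best)
    return hits / len(query) if query else 0.0


def search(query: List[float], library: Dict[str, List[float]],
           tol: float = 1.0) -> List[Tuple[str, float]]:
    """Rank all library entries by their match score, best first."""
    ranked = [(name, match_score(query, shifts, tol)) for name, shifts in library.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical library and query: these are NOT measured spectra.
LIBRARY = {
    "compound_A": [14.1, 22.7, 31.9, 128.5, 172.0],
    "compound_B": [20.5, 55.3, 114.0, 130.2, 159.8],
}
QUERY = [14.3, 22.5, 32.2, 128.9, 171.6]

if __name__ == "__main__":
    for name, score in search(QUERY, LIBRARY):
        print(f"{name}: {score:.2f}")
```

The greedy one-to-one assignment keeps the sketch short. A real search would also have to cope with missing or extra peaks, solvent effects and multiplicity information, which this toy scoring ignores.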

In the early 2000s, Stephan Kuhn in my group developed the NMRShiftDB database, which was the first open access, open source, open submission, web-based NMR database, and where you can now test how this all works without running into paywalls. Stephan has left the lab and now runs version 2 of this database in collaboration with the NMR lab at the department of chemistry at the University of Cologne.

One caveat: It is much easier to search for carbon-13 NMR spectra or mass spectra than for proton NMR spectra. The latter has rarely been addressed, not least because of the lack of full-spectrum proton data to which you could match a real-life proton spectrum. Peak picking in proton NMR spectra is often problematic due to overlap and complex coupling patterns.

Take, for example, the carbon-13 spectrum of pinocarveol from the metabolomics section of BioMagResBank (BMRB). Using your NMR software’s peak picking method, you would end up with this list of NMR signals. If you have a decent browser, such as Firefox, you can hold the CMD (on Mac) or CTRL key and select the chemical shifts in the table linked above. If not, here they are:






Structure Elucidation of Unknown Metabolites (1) – Problem Description

In preparation for a talk about structure elucidation of unknown chemical compounds at AISBM in Paris in October, I am laying out a bit of the work that others as well as my group have done over the past 40 years in this area. The topic of this AISBM meeting is “Challenges and advances in the annotation and de novo identification of small molecules of biological origin”. I am going to address the problem of computer-assisted structure elucidation (CASE) of unknown compounds in organic chemistry in general and refer to the sub-problem of natural products and metabolites when needed.

Generally speaking, we are talking about the problem where you have evidence that there is a compound in a biological system or in your flask, but you don’t know its structure. By structure, I mean ideally the fully defined stereoisomer, but at least the fully defined constitutional isomer.

I have reviewed the problem of computer-assisted structure elucidation (CASE) a few times in the context of Natural Products Structure Elucidation (Steinbeck 2001, Steinbeck 2004).

Assume, for example, that you are working on a newly discovered medicinal plant and want to discover the compound or set of compounds responsible for the activity.

Willow Tree (Courtesy of Wikipedia). The bark contains Salicylic Acid and was used for a long time as a pain reliever and fever reliever.

How could one find out that the structure of the active ingredient is as follows?

Constitutional formula of salicylic acid (Courtesy of Wikipedia)

The evidence you’ve got could come from a chromatographic experiment where you have become interested in a particular peak that shows a biological activity.

HPLC chromatogram of a perfume mixture (courtesy of Wikipedia). Each of those peaks is at least one compound. What is the structure of the compound under the leftmost signal?

In our context, the evidence for determining this structure comes from spectroscopy – NMR and/or Mass Spectrometry (MS). In order to obtain these spectra, we need to isolate the compound using separation techniques such as HPLC or use hyphenated techniques.

In the next post, I will elaborate on some of the scenarios that we might be facing in structure elucidation.


Steinbeck, C. “The Automation of Natural Product Structure Elucidation.” Current Opinion in Drug Discovery and Development 4.3 (2001): 338–342. Print.

Steinbeck, C. “Recent Developments in Automated Structure Elucidation of Natural Products.” Natural Product Reports 21.4 (2004): 512–518. Print.