Breaking News: Open access to large-scale drug discovery data at EBI

Very exciting things have just happened here at EBI in the area of chemoinformatics and drug discovery:

The Wellcome Trust has awarded £4.7 million to the European Bioinformatics Institute (EBI) to support the transfer of a large collection of information on the properties and activities of drugs and a large set of drug-like small molecules from BioFocus DPI, part of the publicly listed company Galapagos, to the public domain.

Here are the press releases of EBI and Galapagos.

The databases will be incorporated into EBI’s collection of open-access data resources for biomedical research and will be maintained by a newly established team of scientists at the EMBL-EBI. These data lie at the heart of translating information from the human genome into successful new drugs in the clinic.

The databases to be brought into the public domain include DrugStore™ (database of known drugs), StARLITe™ (database of known compounds and their effects), Strudle™ (binding site drugability), and Kinase SARfari™ and GPCR SARfari™ (informatics systems for the most widely used target classes in drug discovery).
The main database alone, StARLITe, which covers drug-target interactions, holds hundreds of thousands of interaction data points manually curated from the medicinal chemistry literature.

A new team leader will be appointed to support the new resource, and my group will provide the chemoinformatics expertise to move the underlying analysis software into the open source world. The Chemistry Development Kit (CDK) will play a major role both in freeing the QSAR code and in providing (sub)structure and similarity searching for the database.

The transfer will empower academia to participate in the first stages of drug discovery for all therapeutic areas, including major diseases of the developing world. In future it could also result in improved prediction of drug side-effects and spark all kinds of new academic research directions.

We are thrilled 🙂

Categorised as: Open Science


  1. Rich Apodaca says:

    Chris, this is great news. What do you think will be your group’s top two or three (cheminformatics) challenges?

  2. Rajarshi Guha says:

    Fantastic news!

  3. Rich, in the context of this grant, I think it will be sophisticated and fast structure searching in all its facets (fingerprinting, Markush, similarity).

  4. Duncan Hull says:

    Chris, there’s some response to this news over at friendfeed, you might be interested. Lots of people “liked” it of course, great news!

  5. Michael Kuhn says:

    Cool! Do you have a time-line for when all of this data will become available?

  6. Christoph, that’s really great news and I am looking forward to working with the data! It’s amazing that very soon people in academia will have access to this really well-curated database, too. Please keep us posted on new developments! Best wishes – Andreas

  7. I was pointed to this article as a result of my blogposting… I thought that Galapagos had “contributed” the data. I didn’t realize that there was a payment of 1.8 million EUR.

  8. Antony, yes, they didn’t do it for a bag of thin air 🙂
    Cheers, C.

  9. Michael, the time line is not clear – too many unknowns – but we’ll go for ASAP 🙂
