We now have an opening for a postdoc position in my group in collaboration with the group of Dietrich Rebholz-Schuhmann at the EBI. The position is funded for three years by the EIPOD scheme at EMBL.
The proposed work combines methods from image recognition (OSRA, Filippov2009), cheminformatics (CDK, Steinbeck2003), chemometrics (Gkoutos2003) and text-mining (OSCAR3, Corbett2006, Grego2009) to extract information relevant to small molecules from the primary literature. The project will deliver methods to discover information about chemical entities linked to their chemical structures and their assigned spectra. The research focuses on cross-validating the extracted information against cheminformatics prediction methods, both to compensate for error propagation and to benchmark prediction methods on published data.
You’ll find a one-page project description on the EIPOD page, together with information on how to apply. If you are interested and have questions, feel free to contact me at steinbeck [at] ebi.ac.uk.
Corbett and Murray-Rust. High-throughput identification of chemistry in life science texts. LNCS (2006), 1611-3349
Filippov and Nicklaus. Optical Structure Recognition Software (OSRA). J. Chem. Inf. Model (2009), 49(3), 740–743
Gkoutos GV et al. Chemical Machine Vision. J Chem Inf Comput Sci (2003) 43:1342–1355
Grego T et al. Identification of Chemical Entities in Patent Documents. In: LNCS (2009) 5518:942-949
Guha et al. The Blue Obelisk – Interoperability in Chemical Informatics. J Chem Inf Model (2006) 46(3):991-998
Steinbeck et al. The Chemistry Development Kit (CDK). J Chem Inf Comput Sci (2003) 43(2):493-500