<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Faster Fingerprints for the CDK</title>
	<atom:link href="http://www.steinbeck-molecular.de/steinblog/index.php/2008/10/07/faster-fingerprints-for-the-cdk/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/10/07/faster-fingerprints-for-the-cdk/</link>
	<description>A molecular informatics weblog</description>
	<lastBuildDate>Thu, 02 Sep 2010 04:39:41 +0200</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Mark Rijnbeek</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/10/07/faster-fingerprints-for-the-cdk/comment-page-1/#comment-1838</link>
		<dc:creator>Mark Rijnbeek</dc:creator>
		<pubDate>Thu, 09 Oct 2008 11:31:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=58#comment-1838</guid>
		<description>To be clear, that 187s was for the MACCSFingerprinter</description>
		<content:encoded><![CDATA[<p>To be clear, that 187s was for the MACCSFingerprinter</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark Rijnbeek</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/10/07/faster-fingerprints-for-the-cdk/comment-page-1/#comment-1837</link>
		<dc:creator>Mark Rijnbeek</dc:creator>
		<pubDate>Thu, 09 Oct 2008 11:28:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=58#comment-1837</guid>
		<description>First of all, sorry about my mistake in the information regarding the timing - times in ms are absolute, not relative to start time.  So the calculation of fingerprints was 64s for the old code, 9.9s for the new. That&#039;s about 6 times faster.
I re-ran the class today and got the same factor out, although timings were different (overall slower) than above. 

There doesn&#039;t really seem to be a performance difference between NNMolecule and Molecule. On my computer it takes around 1.3-1.5 seconds to build a list of 1000 for both classes.

The benchmark was already done using the extended fingerprinter.  I now also measured the MACCSFingerprinter, but only had that class available in the old code; the cdk104 jar file does not seem to contain it.

It took 187 seconds to calculate 1000 fingerprints, so pretty slow compared to the extended fingerprinter in the original post.
0 - Start benchmark 1000 compounds.
111 - Fingerprinter set up
341 - Connected to database
124 - Resultset opened 
817 - Molfile strings retrieved from database, stored in list 
1597 - Molecule objects list built 
187210 - Fingerprints calculated</description>
		<content:encoded><![CDATA[<p>First of all, sorry about my mistake in the information regarding the timing &#8211; times in ms are absolute, not relative to start time.  So the calculation of fingerprints was 64s for the old code, 9.9s for the new. That&#8217;s about 6 times faster.<br />
I re-ran the class today and got the same factor out, although timings were different (overall slower) than above. </p>
<p>There doesn&#8217;t really seem to be a performance difference between NNMolecule and Molecule. On my computer it takes around 1.3-1.5 seconds to build a list of 1000 for both classes.</p>
<p>The benchmark was already done using the extended fingerprinter.  I now also measured the MACCSFingerprinter, but only had that class available in the old code; the cdk104 jar file does not seem to contain it.</p>
<p>It took 187 seconds to calculate 1000 fingerprints, so pretty slow compared to the extended fingerprinter in the original post.<br />
0 &#8211; Start benchmark 1000 compounds.<br />
111 &#8211; Fingerprinter set up<br />
341 &#8211; Connected to database<br />
124 &#8211; Resultset opened<br />
817 &#8211; Molfile strings retrieved from database, stored in list<br />
1597 &#8211; Molecule objects list built<br />
187210 &#8211; Fingerprints calculated</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Egon Willighagen</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/10/07/faster-fingerprints-for-the-cdk/comment-page-1/#comment-1827</link>
		<dc:creator>Egon Willighagen</dc:creator>
		<pubDate>Wed, 08 Oct 2008 08:46:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=58#comment-1827</guid>
		<description>Oh, and rather interesting, of course... I&#039;d welcome some statistics on the filtering success of the fingerprints on the database... Maybe compare the path-based-only fingerprint with Stefan&#039;s extended fingerprint... and, rather interesting, with Rajarshi&#039;s cdk.fingerprint.MACCSFingerprinter...</description>
		<content:encoded><![CDATA[<p>Oh, and rather interesting, of course&#8230; I&#8217;d welcome some statistics on the filtering success of the fingerprints on the database&#8230; Maybe compare the path-based-only fingerprint with Stefan&#8217;s extended fingerprint&#8230; and, rather interesting, with Rajarshi&#8217;s cdk.fingerprint.MACCSFingerprinter&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Egon Willighagen</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/10/07/faster-fingerprints-for-the-cdk/comment-page-1/#comment-1826</link>
		<dc:creator>Egon Willighagen</dc:creator>
		<pubDate>Wed, 08 Oct 2008 08:22:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=58#comment-1826</guid>
		<description>Mark, could you please also compare using Molecule and NNMolecule? I would expect the difference between the old code and CDK 1.0.4 to be smaller when using NNMolecule. The latter does not use IChemObjectListeners, which you do not need anyway.

Even if the ratio old/new stays the same, I guess the 7 seconds still goes down to 5 or 6 using NNMolecule.</description>
		<content:encoded><![CDATA[<p>Mark, could you please also compare using Molecule and NNMolecule? I would expect the difference between the old code and CDK 1.0.4 to be smaller when using NNMolecule. The latter does not use IChemObjectListeners, which you do not need anyway.</p>
<p>Even if the ratio old/new stays the same, I guess the 7 seconds still goes down to 5 or 6 using NNMolecule.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
