<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SteinBlog &#187; Open Standards</title>
	<atom:link href="http://www.steinbeck-molecular.de/steinblog/index.php/category/open-standards/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.steinbeck-molecular.de/steinblog</link>
	<description>A molecular informatics weblog</description>
	<lastBuildDate>Mon, 28 Jun 2010 13:42:20 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>CDK-Taverna paper published</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2010/03/29/cdk-taverna-paper-published/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2010/03/29/cdk-taverna-paper-published/#comments</comments>
		<pubDate>Mon, 29 Mar 2010 07:31:36 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Chemistry Development Kit]]></category>
		<category><![CDATA[Chemoinformatics]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[CDK-Taverna]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=320</guid>
		<description><![CDATA[We are glad to announce that our article about CDK-Taverna, an open workflow solution for cheminformatics, is now online on BMC Bioinformatics. CDK-Taverna, which lives at http://www.cdk-taverna.de/, features more than 160 workers for various tasks in molecular informatics.
The workflow paradigm allows scientists to flexibly create generic workflows using different kinds of data sources, filters and [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_321" class="wp-caption alignleft" style="width: 240px"><img class="size-medium wp-image-321" title="Screen shot 2010-03-29 at 09.27.10" src="http://www.steinbeck-molecular.de/steinblog/wp-content/uploads/2010/03/Screen-shot-2010-03-29-at-09.27.10-230x299.png" alt="CDK-Taverna workflow" width="230" height="299" /><p class="wp-caption-text">CDK-Taverna workflow</p></div>
<p>We are glad to announce that our article about <a href="http://www.cdk-taverna.de/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.cdk-taverna.de');" target="_blank">CDK-Taverna</a>, an open workflow solution for cheminformatics, is now <a href="http://www.biomedcentral.com/1471-2105/11/159/abstract" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.biomedcentral.com');" target="_blank">online on BMC Bioinformatics</a>. <a href="http://www.cdk-taverna.de/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.cdk-taverna.de');" target="_blank">CDK-Taverna</a>, which lives at <a href="http://www.cdk-taverna.de/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.cdk-taverna.de');" target="_blank">http://www.cdk-taverna.de/</a>, features more than 160 workers for various tasks in molecular informatics.</p>
<p>The workflow paradigm allows scientists to flexibly create generic workflows using different kinds of data sources, filters and algorithms, which can later be adapted to changing needs. In order to achieve this, library methods are encapsulated in Lego(TM)-like building blocks which can be manipulated with a mouse or any pointing device in a graphical environment, relieving the scientist from the need to learn a programming language. Building blocks, so-called <em>workers</em>, are connected by data pipelines to enable data flow between them, which is why the term pipelining is often used interchangeably with workflow.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2010/03/29/cdk-taverna-paper-published/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>ChEBI chemistry ontology development funded by BBSRC</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2009/05/19/chebi-chemistry-ontology-development-funded-by-bbsrc/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2009/05/19/chebi-chemistry-ontology-development-funded-by-bbsrc/#comments</comments>
		<pubDate>Tue, 19 May 2009 11:30:45 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[ChEBI]]></category>
		<category><![CDATA[Chemoinformatics]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[Ontologies]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/index.php/2009/05/19/chebi-chemistry-ontology-development-funded-by-bbsrc/</guid>
		<description><![CDATA[We received our official award letter from BBSRC Tools and Resources Fund today for the ChEBI ontology development grant. Needless to say, we are thrilled. We are now going to work together with Michael Ashburner&#8217;s group at the University of Cambridge to align ChEBI with other OBO Foundry ontologies by adoption of the Basic Formal [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://farm2.static.flickr.com/1230/1357630710_37fa30c2be_m.jpg" onclick="javascript:pageTracker._trackPageview('/outbound/article/farm2.static.flickr.com');"><img class="alignleft" src="http://farm2.static.flickr.com/1230/1357630710_37fa30c2be_m.jpg" alt="" width="150" height="150" /></a>We received our official award letter from BBSRC Tools and Resources Fund today for the ChEBI ontology development grant. Needless to say, we are thrilled. We are now going to work together with Michael Ashburner&#8217;s group at the University of Cambridge to align ChEBI with other OBO Foundry ontologies by adoption of the Basic Formal Ontology and the Relationship Types Ontology.<br />
This will include extensive annotation of the ChEBI ontology required after adoption of BFO and RO. The adoption of the BFO will require a major reorganisation of the upper levels of the ChEBI ontology in order to allow it to align to the BFO. This<br />
reorganisation can only be achieved by manual annotation, although some semi-automatic means will be employed to aid the curator. In addition to the reorganisation of the upper levels, new relationships will be introduced semi-automatically, but as the ChEBI ethos requires that all data is manually checked to maintain ChEBI&#8217;s high standards of data quality, we expect a major annotation task. The project is funded for three years. Stay tuned. We&#8217;ll report on our progress on a regular basis.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2009/05/19/chebi-chemistry-ontology-development-funded-by-bbsrc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ChEBI behind the scenes</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2009/05/08/chebi-behind-the-scenes/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2009/05/08/chebi-behind-the-scenes/#comments</comments>
		<pubDate>Fri, 08 May 2009 08:12:37 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[ChEBI]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[Curation]]></category>
		<category><![CDATA[Ontologies]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=179</guid>
		<description><![CDATA[With ChEBI release 56 behind us, I thought I&#8217;d share some insight into how ChEBI is created and what we do to prepare a release. In the last years, the ChEBI team on average consisted of two software engineers maintaining and improving the software and two to three curators doing the data entry and curation. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://farm4.static.flickr.com/3097/2755976719_3a4161cd15_m.jpg" onclick="javascript:pageTracker._trackPageview('/outbound/article/farm4.static.flickr.com');"><img class="alignleft" title="Preparation" src="http://farm4.static.flickr.com/3097/2755976719_3a4161cd15_m.jpg" alt="" width="240" height="240" /></a>With <a href="http://www.steinbeck-molecular.de/steinblog/index.php/2009/04/30/chebi-release-56-now-with-sd-file/"  target="_blank">ChEBI release 56</a> behind us, I thought I&#8217;d share some insight into how ChEBI is created and what we do to prepare a release. In the last years, the ChEBI team on average consisted of two software engineers maintaining and improving the software and two to three curators doing the data entry and curation. It is remarkable that, by now, the question of which chemical compounds make it into ChEBI is completely community-driven. Requests to enter compounds are submitted by users and other database maintainers via the <a href="https://sourceforge.net/tracker/?group_id=125463&amp;atid=703818" onclick="javascript:pageTracker._trackPageview('/outbound/article/sourceforge.net');" target="_blank">ChEBI curator request tracker on SourceForge</a>. Besides increasing the public knowledge of mankind, the biggest benefit and driving force for submitters is the assignment of a stable ChEBI identifier, which can then be cited and linked to from other resources.</p>
<p>With <a href="http://www.ebi.ac.uk/chebi/newsForward.do#ChEBI%20Release%2055" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ebi.ac.uk');" target="_blank">ChEBI release 55</a> we introduced the new submission tool, which now allows our submitters to create ChEBI datasets themselves. This a) gives our users more control over what they want to see in ChEBI and b) saves our curators some duplicate work.</p>
<p>In preparation for a release, here is what the ChEBI team does.</p>
<ul>
<li>Create automatic cross-references to PubChem, UniProt, IntEnz, BRENDA, SABIO-RK, ArrayExpress, IntAct, Patents etc. These are all run a week before the release and are based on ChEBI identifier matching or text matching.</li>
<li>Annotation of entity of the month</li>
<li>Submissions deposited directly into the database by users are processed by our annotators.</li>
</ul>
<p>On the release day:</p>
<ul>
<li>Data is exported overnight into multiple formats: OBO format, SDF, Oracle data dumps and PostgreSQL/MySQL dumps.</li>
<li>Public web site updated with the entity of the month.</li>
<li><a href="http://www.ebi.ac.uk/chebi/statisticsForward.do" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ebi.ac.uk');" target="_blank">Statistics</a> generated and stored.</li>
<li>Sitemaps are generated to be used by search engines like Google for indexing.</li>
<li>Finally data is deposited into PubChem and the EB-eye search engine is updated.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2009/05/08/chebi-behind-the-scenes/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>3rd International Biocuration Conference in Berlin</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2009/04/17/3rd-international-biocuration-conference-in-berlin/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2009/04/17/3rd-international-biocuration-conference-in-berlin/#comments</comments>
		<pubDate>Fri, 17 Apr 2009 07:24:52 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Conferences and Meetings]]></category>
		<category><![CDATA[Life of Chris]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[People]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Scientific Culture]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=123</guid>
		<description><![CDATA[I&#8217;m attending the 3rd International Biocuration Conference in Berlin, which looks like a pretty successful meeting in terms of numbers of participants. Seems like somewhere between 100 and 200 participants. It looks like the time for recognition for biocuration and curated biological resources has come. The International Society for Biocuration was inaugurated yesterday. People [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_124" class="wp-caption alignright" style="width: 160px"><a href="http://www.flickr.com/photos/richardneilson/243850051/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.flickr.com');"><img class="size-thumbnail wp-image-124" title="243850051_a8d53388ee_m" src="http://www.steinbeck-molecular.de/steinblog/wp-content/uploads/2009/04/243850051_a8d53388ee_m-150x150.jpg" alt="Berlin Dahlem-Dorf tube station" width="150" height="150" /></a><p class="wp-caption-text">Berlin Dahlem-Dorf tube station</p></div>
<p>I&#8217;m attending the <a href="http://projects.eml.org/sdbv/events/BiocurationMeeting/index.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/projects.eml.org');" target="_blank">3rd International Biocuration Conference</a> in Berlin, which looks like a pretty successful meeting in terms of numbers of participants. Seems like somewhere between 100 and 200 participants. It looks like the time for recognition for biocuration and curated biological resources has come. The <a href="http://biocurator.org/" onclick="javascript:pageTracker._trackPageview('/outbound/article/biocurator.org');" target="_blank">International Society for Biocuration</a> was inaugurated yesterday. People from publishing companies such as Nature are attending.</p>
<p><a href="http://www.ebi.ac.uk/Thornton/group_members.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ebi.ac.uk');" target="_blank">Janet Thornton, director of EBI</a>, gave the opening keynote yesterday evening, rehearsing some of the history of biocuration and looking into the future of securing funding for biocuration through the <a href="http://www.elixir-europe.org" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.elixir-europe.org');" target="_blank">Elixir project</a>.</p>
<p>I&#8217;m now listening to <a href="http://www.sdsc.edu/pb/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.sdsc.edu');" target="_blank">Philip Bourne</a> talking about &#8220;Changes in Scholarly Communication and the Potential Impact on Biocuration&#8221;. He talks, among many other things, about the author embedding semantic information into the original manuscript and introduces part of his own work with Microsoft on a plug-in for Word to do this enrichment.</p>
<p>There is nothing overly particular about this meeting but it strengthens my feeling that we are at the point where the time has finally come for the idea of preserving the information in the first place, in the scientific document. Both <a href="http://www.ebi.ac.uk/Rebholz/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ebi.ac.uk');" target="_blank">Dietrich&#8217;s</a> <a href="http://www.ebi.ac.uk/Rebholz-srv/SESL/sesl.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ebi.ac.uk');" target="_blank">semantic enrichment conference</a> and this one were well attended by publishers &#8211; Elsevier and Nature were at both. This scientific document can then become both a scientific article and one or many database entries.</p>
<p>Another notion that has come up a couple of times is the question of reward for authors to make and submit semantically rich documents. One of the ideas is fast-tracking those documents &#8211; publishing them faster.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2009/04/17/3rd-international-biocuration-conference-in-berlin/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ChEBI at the Fall 2009 ACS meeting in Washington</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2009/03/27/chebi-at-the-fall-2009-acs-meeting-in-washington/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2009/03/27/chebi-at-the-fall-2009-acs-meeting-in-washington/#comments</comments>
		<pubDate>Fri, 27 Mar 2009 15:05:23 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Conferences and Meetings]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Life of Chris]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Scientific Culture]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=102</guid>
		<description><![CDATA[I&#8217;ve been invited to present our ChEBI ontology at the 2009 Fall Meeting of the American Chemical Society. Here is our abstract:
ChEBI &#8211; An open ontology for Chemical Entities of Biological Interest
Paula de Matos (1), Kirill Degtyarenko (2), Marcus Ennis (1), Janna
Hastings (1), Inma Spiteri (1) and Christoph Steinbeck (1)
(1) European Bioinformatics Institute, Hinxton, Cambridge, [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been invited to present our <a href="http://www.ebi.ac.uk/chebi" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ebi.ac.uk');" target="_blank">ChEBI ontology</a> at the <a href="http://portal.acs.org/portal/PublicWebSite/meetings/national/fall2009/index.htm" onclick="javascript:pageTracker._trackPageview('/outbound/article/portal.acs.org');" target="_blank">2009 Fall Meeting of the American Chemical Society</a>. Here is our abstract:</p>
<p>ChEBI &#8211; An open ontology for Chemical Entities of Biological Interest</p>
<p>Paula de Matos (1), Kirill Degtyarenko (2), Marcus Ennis (1), Janna<br />
Hastings (1), Inma Spiteri (1) and Christoph Steinbeck (1)</p>
<p>(1) European Bioinformatics Institute, Hinxton, Cambridge, UK<br />
(2) European Patent Office, The Hague, The Netherlands</p>
<p>Chemical Entities of Biological Interest (ChEBI) is a freely available, manually annotated resource providing data such as chemical nomenclature, an ontology and chemical structures. The ChEBI ontology imposes meaning onto the data according to four subontologies: molecular structure, application, biological role and subatomic particle. As a cheminformatics resource it provides chemical substructure and similarity searching using the Chemistry Development Kit (CDK). ChEBI annotates structures with various properties such as charge and mass and names including brand names and International Nonproprietary Name (INN). This extended coverage is complemented by manually annotated names appearing in Patents and Patent identifiers. In addition names can now appear in French, German, Latin and Spanish. Acting as a chemoinformatics portal to other bioinformatics resources, ChEBI has introduced automatically generated links to resources such as UniProtKB, IntAct, ArrayExpress, SABIO-RK or PubChem. ChEBI lives at http://www.ebi.ac.uk/chebi/, where it is also available for download in<br />
a variety of formats and accessible via webservices.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2009/03/27/chebi-at-the-fall-2009-acs-meeting-in-washington/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Industry-funded medical research will double your impact factor</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2009/02/16/industry-funded-medical-research-will-double-your-impact-factor/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2009/02/16/industry-funded-medical-research-will-double-your-impact-factor/#comments</comments>
		<pubDate>Mon, 16 Feb 2009 09:00:52 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Open Access]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Scientific Culture]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=70</guid>
		<description><![CDATA[The Guardian has a nice piece by Ben Goldacre reporting about a study published by the British Medical Journal entitled &#8220;Relation of study quality, concordance, take home message, funding, and impact in studies of influenza vaccines: systematic review&#8221;. Both the newspaper article and the study are worth reading and seem to be open. Besides many [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.guardian.co.uk/commentisfree/2009/feb/14/bad-science-medical-research" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.guardian.co.uk');" target="_blank">The Guardian has a nice piece by Ben Goldacre</a> reporting about <a href="http://www.bmj.com/cgi/content/abstract/338/feb12_2/b354" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.bmj.com');" target="_blank">a study published by the British Medical Journal</a> entitled &#8220;Relation of study quality, concordance, take home message, funding, and impact in studies of influenza vaccines: systematic review&#8221;. Both the newspaper article and the study are worth reading and seem to be open. Besides many other interesting findings, the BMJ article finds that the <a href="http://en.wikipedia.org/wiki/Journal_impact_factor" onclick="javascript:pageTracker._trackPageview('/outbound/article/en.wikipedia.org');" target="_blank">journal impact factor</a> for industry-funded studies of influenza vaccines (both Ben and I find it quite likely that this is not limited to the study of influenza vaccines <img src='http://www.steinbeck-molecular.de/steinblog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> ) is on average more than twice as high as that for purely academic studies (impact factor 8.78 vs 3.74). Judge for yourself.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2009/02/16/industry-funded-medical-research-will-double-your-impact-factor/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating and Reviewing Patches in the Chemistry Development Kit (CDK)</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/09/01/creating-and-reviewing-patches-in-the-chemistry-development-kit-cdk/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2008/09/01/creating-and-reviewing-patches-in-the-chemistry-development-kit-cdk/#comments</comments>
		<pubDate>Mon, 01 Sep 2008 17:13:34 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Chemistry Development Kit]]></category>
		<category><![CDATA[Life of Chris]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[Scientific Culture]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=56</guid>
		<description><![CDATA[In order to prevent major turbulences in the main source code development line of the Chemistry Development Kit (CDK), we decided a while ago to have separate branches in our subversion source code management system for each developer and each of his subprojects. Once a project has been finalized by a developer in her branch, [...]]]></description>
			<content:encoded><![CDATA[<p>In order to prevent major turbulences in the main source code development line of the Chemistry Development Kit (CDK), we decided a while ago to have separate branches in our subversion source code management system for each developer and each of his subprojects. Once a project has been finalized by a developer in her branch, she would then publish a patch in the <a href="https://sourceforge.net/tracker/?group_id=20024&amp;atid=320024" onclick="javascript:pageTracker._trackPageview('/outbound/article/sourceforge.net');" target="_blank">CDK patch tracker system</a> and ask for it to be reviewed by posting to the <a href="https://sourceforge.net/mailarchive/forum.php?forum_name=cdk-devel" onclick="javascript:pageTracker._trackPageview('/outbound/article/sourceforge.net');" target="_blank">CDK developers mailing list.</a> A CDK senior developer would then assign the patch to himself or another senior developer.</p>
<p>I have just been assigned the task of reviewing the recent Iterator/Iterable patch for CDK and will document the process here for future reference. The patch was <a href="https://sourceforge.net/tracker/index.php?func=detail&amp;aid=2040231&amp;group_id=20024&amp;atid=320024" onclick="javascript:pageTracker._trackPageview('/outbound/article/sourceforge.net');" target="_blank">published on the CDK patch tracker.</a></p>
<p>The executive summary of the reviewing task goes like:</p>
<ol>
<li>browse the code</li>
<li>mark up code you think is buggy</li>
<li>note missing unit tests</li>
<li>note missing JavaDoc</li>
<li>warn for subjected PMD warnings</li>
<li>optionally note other problems</li>
<li>optionally any other comment you have</li>
</ol>
<p>So, let&#8217;s see how it went:</p>
<p><strong>Browse the Code</strong></p>
<p>I got the gzipped archive with Egon&#8217;s patch and looked at the code. A large part of the changes involve</p>
<p>removing       <code>public Iterator&lt;IIsotope&gt; isotopes() {</code><br />
and adding    <code>public Iterable&lt;IIsotope&gt; isotopes() {<br />
</code><br />
to enable things like</p>
<p><code>double overallCharge = 0.0;<br />
for (IAtom atom : molecule.atoms()) {<br />
overallCharge += atom.getCharge();<br />
}</code></p>
<p>In order to implement Iterable, one needs to have methods returning an Iterator, so a lot of code essentially implements those.</p>
<p>Remove:       <code>public java.util.Iterator atoms() {</code><br />
and add:      <code>public Iterable&lt;IAtom&gt; atoms() {<br />
logger.debug("Getting atoms iterator");<br />
return super.atoms();<br />
}</code></p>
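<p>To make the Iterable contract concrete, here is a minimal, self-contained sketch. The class <code>ChargeBag</code> is a hypothetical stand-in invented for illustration; it is not part of the CDK API. It only shows that implementing <code>iterator()</code> is enough for the enhanced for loop to work.</p>

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a CDK container; not part of the CDK API.
class ChargeBag implements Iterable<Double> {
    private final List<Double> charges = new ArrayList<>();

    void add(double charge) {
        charges.add(charge);
    }

    // The single method Iterable requires: hand out a fresh Iterator.
    @Override
    public Iterator<Double> iterator() {
        return charges.iterator();
    }

    // Sums all values with the enhanced for loop, which calls iterator() behind the scenes.
    static double total(Iterable<Double> source) {
        double sum = 0.0;
        for (double c : source) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        ChargeBag bag = new ChargeBag();
        bag.add(1.0);
        bag.add(-0.5);
        System.out.println(total(bag));
    }
}
```

<p>Because <code>total</code> accepts any <code>Iterable</code>, callers never have to handle the <code>Iterator</code> themselves, which is the convenience the patch is after.</p>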
<p>And then there is code actually using those iterators and all of these instances had to be adapted too (I&#8217;m just giving the patch syntax):</p>
<p><code>for(IReactionScheme rm : scheme.reactionSchemes()){<br />
-                       for(Iterator&lt;IAtomContainer&gt; iter = getAllMolecules(rm, molSet).atomContainers(); iter.hasNext(); ){<br />
-                       IAtomContainer ac = iter.next();<br />
-                       boolean contain = false;<br />
-                       for(Iterator&lt;IAtomContainer&gt; it2 = molSet.molecules();it2.hasNext();){<br />
-                               if(it2.next().equals(ac)){<br />
-                               contain = true;<br />
-                               break;<br />
-                       }<br />
-                       }<br />
-                       if(!contain)<br />
-                               molSet.addMolecule((IMolecule)(ac));<br />
-                       }<br />
+                for (IAtomContainer ac : getAllMolecules(rm, molSet).atomContainers()) {<br />
+                    boolean contain = false;<br />
+                    for (IAtomContainer atomContainer : molSet.molecules()) {<br />
+                        if (atomContainer.equals(ac)) {<br />
+                            contain = true;<br />
+                            break;<br />
+                        }<br />
+                    }<br />
+                    if (!contain)<br />
+                        molSet.addMolecule((IMolecule) (ac));<br />
+                }<br />
</code></p>
<p>Overall, the patch affected 288 classes including test classes, with almost 2000 lines of code changed.</p>
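<p>As an aside, the flag-and-break pattern in the rewritten loop is a containment check. The following sketch shows the same logic with plain Java collections; the class and method names are made up for illustration and nothing here is taken from the patch itself.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative helper only; the names are hypothetical and not from CDK.
class AddIfAbsent {
    // Returns true if any element of items equals candidate (the inner loop of the patch).
    static <T> boolean containsEqual(Iterable<T> items, T candidate) {
        for (T item : items) {
            if (item.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    // Appends every element of source that target does not already contain (the outer loop).
    static <T> void addMissing(List<T> target, Iterable<T> source) {
        for (T candidate : source) {
            if (!containsEqual(target, candidate)) {
                target.add(candidate);
            }
        }
    }

    public static void main(String[] args) {
        List<String> molSet = new ArrayList<>(Arrays.asList("water", "ethanol"));
        addMissing(molSet, Arrays.asList("ethanol", "benzene"));
        System.out.println(molSet);
    }
}
```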
<p><strong>Mark up code you think is buggy</strong></p>
<p>This is impossible for me to do for such a large set of changes, so here one must rely on the unit tests.</p>
<p><strong>Note missing unit tests</strong></p>
<p><a href="http://chem-bla-ics.blogspot.com/2007/11/comparing-junit-test-results-between.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/chem-bla-ics.blogspot.com');" target="_blank">Egon had posted some notes about comparing failing and passing between unit tests</a> earlier but we also need an automatic check for unit test coverage. And yes, of course, there are limits to what such an automated coverage tool can do.</p>
<p>With regard to failing unit tests, the &#8220;iterable&#8221; branch did not have any more failures and errors than the head branch.</p>
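<p>One simple way to compare two branches is to take the set of failing tests from each run and look at the difference. The sketch below assumes the failing test names have already been collected into sets; how they are extracted from the JUnit reports is left out, and the test names in <code>main</code> are invented.</p>

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Illustrative helper only; not part of the CDK build.
class TestRegression {
    // Returns the tests that fail on the branch but not on the baseline, in sorted order.
    static Set<String> newFailures(Set<String> baselineFailures, Set<String> branchFailures) {
        Set<String> result = new TreeSet<>(branchFailures);
        result.removeAll(baselineFailures);
        return result;
    }

    public static void main(String[] args) {
        // Test names are made up for illustration.
        Set<String> head = new TreeSet<>(Arrays.asList("AtomTest.testCharge", "RingTest.testRings"));
        Set<String> branch = new TreeSet<>(Arrays.asList("AtomTest.testCharge"));
        // An empty result means the branch introduced no new failures.
        System.out.println(newFailures(head, branch).isEmpty());
    }
}
```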
<p><strong>Note missing JavaDoc</strong></p>
<p>We&#8217;ve got DocCheck results on our CDK nightly pages, but nothing tells you whether a patched method is missing necessary JavaDoc. Presumably, we could &#8220;grep&#8221; the patched class names into a DocCheck input file and get customized info about it.</p>
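<p>The first half of that idea, pulling the patched class names out of a patch, could look roughly like this. The sketch assumes a unified diff whose <code>+++</code> header lines carry the file path, with no spaces in the path; the class name and the example path are hypothetical, and the DocCheck input format is deliberately not covered.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative helper only; assumes unified diff headers and paths without spaces.
class PatchedClasses {
    // Collects simple class names from the "+++ " header lines of a unified diff.
    static Set<String> fromDiff(List<String> diffLines) {
        Set<String> names = new TreeSet<>();
        for (String line : diffLines) {
            if (!line.startsWith("+++ ")) {
                continue;
            }
            // Header looks like "+++ path/Name.java<TAB>(working copy)"; keep only the path.
            String path = line.substring(4).split("\\s+")[0];
            if (!path.endsWith(".java")) {
                continue;
            }
            String file = path.substring(path.lastIndexOf('/') + 1);
            names.add(file.substring(0, file.length() - ".java".length()));
        }
        return names;
    }

    public static void main(String[] args) {
        // The path below is made up for illustration.
        List<String> diff = Arrays.asList(
            "--- src/example/AtomContainer.java\t(revision 1)",
            "+++ src/example/AtomContainer.java\t(working copy)",
            "+    public Iterable<IAtom> atoms() {");
        System.out.println(fromDiff(diff));
    }
}
```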
<p><strong>Warn for subjected PMD warnings</strong></p>
<p>PMD is a tool for checking code with respect to adherence to certain coding standards. Again, the <a href="http://cheminfo.informatics.indiana.edu/~rguha/code/java/nightly/" onclick="javascript:pageTracker._trackPageview('/outbound/article/cheminfo.informatics.indiana.edu');" target="_blank">CDK nightly page</a> contains all PMD reports on the CDK code, generated in nightly runs. The same can be achieved for each branch with an &#8220;ant -f pmd.xml&#8221; on your local copy of the branch.</p>
<p><strong>Optionally note other problems</strong></p>
<p>I love optional things and tend to let them be optional.</p>
<p><strong>Optionally any other comment you have</strong></p>
<p>Ditto.</p>
<p>So, overall I would like to conclude that, to the best of my knowledge, the Iterable patch should be safe and can be applied to the HEAD branch.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2008/09/01/creating-and-reviewing-patches-in-the-chemistry-development-kit-cdk/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Linus on GIT on Google TechTalks</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/08/26/linus-on-git-on-google-techtalks/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2008/08/26/linus-on-git-on-google-techtalks/#comments</comments>
		<pubDate>Tue, 26 Aug 2008 10:31:58 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Blue Obelisk]]></category>
		<category><![CDATA[Chemistry Development Kit]]></category>
		<category><![CDATA[Informatics]]></category>
		<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Scientific Culture]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/index.php/2008/08/26/linus-on-git-on-google-techtalks/</guid>
		<description><![CDATA[I&#8217;m a big fan of Google TechTalks and watch a lot of them during flights. This week I enjoyed the recording of Linus Torvalds insulting all kinds of people including the whole SVN development team while introducing his distributed source code management system GIT. Egon had pointed me to GIT quite a while ago but [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m a big fan of Google TechTalks and watch a lot of them during flights. This week I enjoyed the recording of <a href="http://video.google.com/videoplay?docid=-2199332044603874737" onclick="javascript:pageTracker._trackPageview('/outbound/article/video.google.com');">Linus Torvalds insulting all kinds of people including the whole SVN development team while introducing his distributed source code management system GIT</a>. <a href="http://chem-bla-ics.blogspot.com/" onclick="javascript:pageTracker._trackPageview('/outbound/article/chem-bla-ics.blogspot.com');" target="_blank">Egon</a> had pointed me to <a href="http://git.or.cz/" onclick="javascript:pageTracker._trackPageview('/outbound/article/git.or.cz');" target="_blank">GIT</a> quite a while ago, but seeing Linus himself discuss the issue made a difference. While CDK is still considerably smaller than the Linux kernel, I can see a lot of commonalities, and given our current practice of having our fellow co-admins review important patches and branches, GIT sounds like a much easier way to do it.</p>
<p>In GIT the source code is distributed &#8211; there is no concept of a central source repository. Developers commit their changes to their local GIT repositories, with all the advantages of versioning and source code history. Other developers pull code from you if they think that the changes you&#8217;ve advertised via your favourite communication channels are interesting. In theory, this allows for a very democratic and evolutionary code development. In addition to being distributed, GIT seems to be very fast when it comes to merging. Linus reports that he does hundreds of full merges per day and nothing takes longer than 5 seconds.</p>
<p>In practice, as Linus points out in his talk, there will always be one or very few repositories that people pull from &#8211; for the Linux kernel it will be Linus&#8217;s machine. In CDK it will very likely be <a href="http://chem-bla-ics.blogspot.com/" onclick="javascript:pageTracker._trackPageview('/outbound/article/chem-bla-ics.blogspot.com');" target="_blank">Egon</a>&#8217;s. Sorry Egon, you&#8217;ve got to be online all day <img src='http://www.steinbeck-molecular.de/steinblog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>The last sentence already brings me to the point. I wonder if we should give GIT a try for CDK development. The advantages do sound enormous. OK, there are disadvantages too, such as losing the central web browsing of the SVN repository on SF. There may be ways around this, as <a href="http://chem-bla-ics.blogspot.com/2007/10/offline-cdk-development-using-git-svn.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/chem-bla-ics.blogspot.com');" target="_blank">Egon described here</a>, but this seems like not using the real thing.</p>
<p>This is a brief impression dump after watching Linus&#8217; talk today and I&#8217;m happy to hear your opinions <img src='http://www.steinbeck-molecular.de/steinblog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2008/08/26/linus-on-git-on-google-techtalks/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>PAC on Graphical Representation Standards for Chemical Structure Diagrams</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2008/03/17/pac-on-graphical-representation-standards-for-chemical-structure-diagrams/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2008/03/17/pac-on-graphical-representation-standards-for-chemical-structure-diagrams/#comments</comments>
		<pubDate>Mon, 17 Mar 2008 13:52:05 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Blue Obelisk]]></category>
		<category><![CDATA[Chemistry Development Kit]]></category>
		<category><![CDATA[Chemoinformatics]]></category>
		<category><![CDATA[IUPAC CPEP]]></category>
		<category><![CDATA[Open Standards]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/index.php/2008/03/17/pac-on-graphical-representation-standards-for-chemical-structure-diagrams/</guid>
		<description><![CDATA[The February issue (V80, No 2, P 277-410) of IUPAC&#8217;s Pure and Applied Chemistry Journal has 133 pages of IUPAC recommendations for the Graphical Representation Standards for Chemical Structure Diagrams. In the Blue Obelisk context, this material is both valuable for the development of our structure representation and editing tools (JChemPaint, JCPViewer) as well [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.iupac.org/publications/pac/80/2/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.iupac.org');" target="_blank">February issue (V80, No 2, P 277-410) of IUPAC&#8217;s Pure and Applied Chemistry Journal</a> has 133 pages of IUPAC recommendations for the Graphical Representation Standards for Chemical Structure Diagrams. In the Blue Obelisk context, this material is valuable both for the development of our structure representation and editing tools (JChemPaint, JCPViewer) and for the Structure Diagram Generator (SDG) in the Chemistry Development Kit (CDK).</p>
<p>While the details discussed there are not particularly surprising to the educated chemist, the value of the material lies in its being a quite complete collection of cases to take into consideration. It would be a nice document to base the next JCP-rendering-engine-related summer of code on (not sure <a href="http://progz-jchem.blogspot.com/" onclick="javascript:pageTracker._trackPageview('/outbound/article/progz-jchem.blogspot.com');" target="_blank">how far Egon and Niels got last time</a>), and there are certainly some things which could very quickly improve the quality of output from CDK SDG, the simplest example being a horizontal alignment of the molecule&#8217;s longest axis.</p>
<p>And if you think, yes, this would be a nice project for me to help those Blue Obelisk projects, <a href="mailto:steinbeck@ebi.ac.uk">please let me know</a> <img src='http://www.steinbeck-molecular.de/steinblog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2008/03/17/pac-on-graphical-representation-standards-for-chemical-structure-diagrams/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interactive Open Access and Collaborative Peer Review</title>
		<link>http://www.steinbeck-molecular.de/steinblog/index.php/2007/06/03/interactive-open-access-and-collaborative-peer-review/</link>
		<comments>http://www.steinbeck-molecular.de/steinblog/index.php/2007/06/03/interactive-open-access-and-collaborative-peer-review/#comments</comments>
		<pubDate>Sun, 03 Jun 2007 13:09:00 +0000</pubDate>
		<dc:creator>Christoph Steinbeck</dc:creator>
				<category><![CDATA[Open Standards]]></category>
		<category><![CDATA[Publishing]]></category>

		<guid isPermaLink="false">http://www.steinbeck-molecular.de/steinblog/?p=20</guid>
		<description><![CDATA[An interesting article by Ulrich Poeschl on Interactive Open Access Publishing and Collaborative Peer Review in the latest issue of &#8220;Forschung &#38; Lehre&#8221; describes a publication process practiced for example by &#8220;Atmospheric Chemistry and Physics&#8221; and sister publications. In a first step, a submission is immediately published as a &#8220;discussion paper&#8221; in the online discussion forums [...]]]></description>
			<content:encoded><![CDATA[<p>An interesting article by <a href="http://www.mpch-mainz.mpg.de/~poeschl/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.mpch-mainz.mpg.de');">Ulrich Poeschl</a> on Interactive Open Access Publishing and Collaborative Peer Review in the latest issue of &#8220;Forschung &amp; Lehre&#8221; describes a publication process practiced for example by &#8220;Atmospheric Chemistry and Physics&#8221; and sister publications. In a first step, a submission is immediately published as a &#8220;discussion paper&#8221; in the online discussion forums of the journal &#8230;</p>
<p><span id="more-20"></span>&#8230; The comments of interested peers and those of the official peer reviewers (who may opt to be anonymous) immediately become publicly available together with the discussion paper. The forum is ISSN-registered and all comments are individually citable. If the submission is accepted, the final paper and the whole discussion leading to its publication become openly and permanently accessible.<br />
Poeschl points out a number of advantages of this collaborative peer review process. a) The discussion papers allow for very rapid publication of results. b) Peer reviewers are motivated to provide high-quality reviews, since those are openly accessible to the whole scientific community. c) For the reader, the review discussion may be as enlightening as the article itself. d) The open review process may deter authors from submitting suboptimal manuscripts in the hope of taking advantage of reviewers&#8217; work capacity.<br />
<a href="http://www.mpch-mainz.mpg.de/~poeschl/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.mpch-mainz.mpg.de');">Ulrich Poeschl</a> states that the above leads to a lower need for corrections and a lower rejection rate.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.steinbeck-molecular.de/steinblog/index.php/2007/06/03/interactive-open-access-and-collaborative-peer-review/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
