BioVeL publication

"A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control"

Cherian Mathew, Anton Güntsch, Matthias Obst, Saverio Vicario, Robert Haines, Alan R. Williams, Yde de Jong, Carole Goble


The compilation and cleaning of data needed for analyses and prediction of species distributions is a time-consuming process that requires a solid understanding of the data formats and service APIs provided by biodiversity informatics infrastructures. We designed and implemented a Taverna-based Data Refinement Workflow which integrates taxonomic data retrieval, data cleaning, and data selection into a consistent, standards-based, and effective system that hides the complexity of the underlying service infrastructures. The workflow can be used freely, both locally and through a web portal that requires no additional software installation by users.


Keywords: biodiversity informatics, web services, workflows, service-oriented architecture, data cleaning, e-Science


19 February 2015

At the final review of the project by the EC, one of the reviewers said: “Incredible work done with a community that is not unified. Remarkable work. It opens for new development in a near future. Hope for success. Good project. Happy that you have been financed three plus years ago.”

The project and its results are described in full in the Project Final Report; a shorter Executive Summary is also available.