When to Use Workflows

There are several situations in which a workflow approach is worth considering. Here we give some suggestions and hints.

For replication and repeatability

Often, you want to be able to repeat your processing or to have it replicated. In both cases workflows are appropriate tools. They are especially handy when the tools for running workflows record what is being done, leaving a "provenance record" that assists replication by others.
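To make the idea of a provenance record concrete, here is a minimal Python sketch. It is not how any particular workflow system records provenance; the function names (run_with_provenance, digest) and the fields of the record are assumptions chosen for illustration. Each step is logged with its name, start time, and a short fingerprint of its input and output, so that someone else can check that a re-run produced the same intermediate results.

```python
import hashlib
import json
from datetime import datetime, timezone


def digest(value):
    """Short fingerprint of a JSON-serialisable value (illustrative helper)."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_with_provenance(steps, data):
    """Run (name, function) steps in order and record what was done."""
    provenance = []
    for name, func in steps:
        started = datetime.now(timezone.utc).isoformat()
        result = func(data)
        provenance.append({
            "step": name,
            "started": started,
            "input_digest": digest(data),
            "output_digest": digest(result),
        })
        data = result  # the output of one step feeds the next
    return data, provenance


# Hypothetical two-step workflow: drop missing values, then double the rest.
steps = [
    ("drop_missing", lambda rows: [r for r in rows if r is not None]),
    ("double", lambda rows: [2 * r for r in rows]),
]
result, record = run_with_provenance(steps, [1, None, 3])
for entry in record:
    print(entry["step"], entry["input_digest"], "->", entry["output_digest"])
```

Because the digests depend only on the data, two runs over the same input should produce matching fingerprints at every step, which is the property that makes the record useful for replication.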

When you have standard operating procedures / protocols

Many scientific analysis and processing tasks become the standard way of doing something: species distribution modelling, for example, or the processing of data from Ocean Sampling Day water samples. In these cases, adopting a workflow as the standard operating procedure makes it easier for everyone who needs to perform the procedure.

Variable analysis pipelines with known start and end points 

If you have analysis pipelines whose steps vary between the start and the end, i.e., the steps you want to perform are not precisely known at the beginning but the starting and ending points are, then workflows are appropriate. You can keep many steps the same whilst altering those that vary. There are several ways this can be done: for example, dynamically, with conditional tests in the workflow, or more statically, by substituting one component step for another.
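The two kinds of variation can be sketched in a few lines of Python. This is an illustrative toy, not taken from any workflow system: the step names (clean, mean, median) and the run_pipeline function are assumptions. The start point (raw records) and end point (a single summary number) are fixed; a conditional test decides at run time whether the cleaning step is needed, and the summary step can be swapped for another component.

```python
def clean(records):
    """Optional step: drop missing values."""
    return [r for r in records if r is not None]


def mean(records):
    return sum(records) / len(records)


def median(records):
    ordered = sorted(records)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def run_pipeline(records, summarise=mean):
    """Fixed start (raw records) and end (one number); the middle varies."""
    # Dynamic variation: a conditional test decides whether to run a step.
    if any(r is None for r in records):
        records = clean(records)
    # Static variation: the summary step is a substitutable component.
    return summarise(records)


raw = [1, None, 2, 30]
default_result = run_pipeline(raw)                    # uses mean
swapped_result = run_pipeline(raw, summarise=median)  # median substituted
print(default_result, swapped_result)
```

Everything outside the varying steps stays untouched when a component is swapped, which is what keeps the two variants comparable.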

Representing an underlying analytical process or procedure

If there is an underlying scientific or analytical process you are trying to represent, or an entity (such as a dataset) that needs to be pushed through an analytical pipeline, then workflows can help. They provide structure for the pipeline and allow what happened to be recorded.

Part-automated, part-manual pipelines

Often, you may have a processing pipeline that is part-automated, part-manual. That is to say, between the automated, computer-controlled steps there are some manual steps to be performed by a human. For example, in the digitisation of specimens, human help is needed to position a specimen in some kind of frame where it will be digitised. At some point in the procedure it may be necessary to turn the specimen around or over. Workflows processing the digitised data can pause and alert the operator to do this, receive a confirmation that it has been done, and then continue.
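The pause-and-confirm pattern described above might look like the following Python sketch. All names here (digitise, ask_operator, fake_capture) are hypothetical and the image capture is simulated; a real workflow system would provide its own interaction mechanism. The confirmation is passed in as a function so that the same pipeline can prompt a person at the console or be driven automatically in a test.

```python
def ask_operator(message):
    """Default confirmation: wait for the operator to answer at the console."""
    return input(f"{message} [y when done] ").strip().lower() == "y"


def digitise(specimen_id, capture, confirm=ask_operator):
    """Capture the front, pause for a manual step, then capture the back."""
    images = [capture(specimen_id, "front")]
    # Manual step: the workflow waits here until the operator confirms.
    while not confirm(f"Please turn specimen {specimen_id} over"):
        pass
    images.append(capture(specimen_id, "back"))
    return images


def fake_capture(specimen_id, side):
    """Stand-in for a camera or scanner; returns an image file name."""
    return f"{specimen_id}-{side}.png"


# Automated confirmation so the example runs without a person present.
images = digitise("S1", fake_capture, confirm=lambda message: True)
print(images)
```

Calling digitise("S1", fake_capture) without the confirm argument would instead block at the prompt until the operator types y.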

Performing parameter or data sweeps and managing the results

Sometimes you will want to run the same analytical or processing task many times, either over a different set of data each time (data sweeping) or over the same set of data but with different parameter settings for the process (parameter sweeping). Workflows are an efficient mechanism in these cases when combined with the BioVeL Portal to control the sweeping and to manage the multiple sets of results that will be produced.
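The difference between the two kinds of sweep can be shown with a small, self-contained Python sketch. It does not use the BioVeL Portal or any workflow engine; the task (model) and the helper names (parameter_sweep, data_sweep) are assumptions made for illustration. The point is that every result is stored alongside the settings or dataset that produced it, which is the bookkeeping that sweep management has to take care of.

```python
from itertools import product


def model(data, threshold, scale):
    """Toy task: sum the scaled values at or above a threshold."""
    return sum(x * scale for x in data if x >= threshold)


def parameter_sweep(task, data, grid):
    """Same data, every combination of the parameter values in the grid."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append({"params": params, "result": task(data, **params)})
    return results


def data_sweep(task, datasets, **params):
    """Same parameters, a different dataset each time."""
    return {name: task(data, **params) for name, data in datasets.items()}


sweep = parameter_sweep(
    model, [1, 5, 10], {"threshold": [0, 5], "scale": [1, 2]}
)
by_dataset = data_sweep(
    model, {"site_a": [1, 5, 10], "site_b": [2, 3]}, threshold=2, scale=1
)
for run in sweep:
    print(run["params"], run["result"])
print(by_dataset)
```

With two values for each of two parameters the sweep produces four runs; the number of runs grows multiplicatively with each added parameter, which is why automated management of the result sets matters.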

When is it not appropriate to use workflows?

When you want to do a one-off piece of analysis or processing, it may be better to consider writing a simple computer program, e.g., in R. In our experience, though, one-offs are rarely that: you often need to run them multiple times to get them to work properly, you find that you want to use them again with different data, or you find that you want a record of the steps. If any of these are true, then workflows can offer efficiencies.

Even though a research team may be technically competent, it may possess limited proficiency in writing computer programs. Although programs can be constructed, they may be difficult to use, or even unusable, outside the team, or even beyond the person in the team who created them. That doesn't matter if such programs are more or less for personal use, but if you think wider use can be expected, it's worth pausing before you start to think about how that goal can most easily be achieved. Workflows are one option where less proficiency may be needed, although like all tools they have their own learning curve.




19 February 2015
