Building a workflow on Galaxy

A tutorial for making workflows on the Galaxy bioinformatics platform.
© COG-Train © Galaxy Training Network

By now, you are becoming familiar with the concepts of loading up an input file into Galaxy, performing a task with that file, and saving the output of this task. You will generally use one input file when learning about input file types and expected outputs. Once you are familiar with the flow of your system, you will need to introduce techniques to increase the “throughput” of your activities.

Throughput is a measurement of how many units of information a system can process in a given amount of time. A person selecting each subsequent step or operation will always have limited throughput. Computers, however, are excellent at repeating steps once these are configured. It is possible to scale up to massive computer clusters and data centres that can process terabytes of information in minutes.

Many data science tasks require the same processing or operations to be applied to different input files. We will use the “workflow” technique to trace the path of inputs and outputs through a series of tasks.
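To make the idea of repeating one operation over many inputs concrete, here is a minimal Python sketch. It is not how Galaxy works internally. The file names, their contents, and the `count_lines` task are hypothetical stand-ins chosen only for illustration.

```python
from typing import Callable, Dict


def count_lines(text: str) -> int:
    """One 'task': count the non-empty lines in a text input."""
    return sum(1 for line in text.splitlines() if line.strip())


def run_on_all(inputs: Dict[str, str], task: Callable[[str], int]) -> Dict[str, int]:
    """Apply the same task to every named input, one after another."""
    return {name: task(text) for name, text in inputs.items()}


# Hypothetical inputs standing in for separately uploaded files.
inputs = {
    "sample_a.txt": "Bread\nCheese\nBread\n",
    "sample_b.txt": "Bread\nOnions\nCheese\nBurger Patty\nBread\n",
}

results = run_on_all(inputs, count_lines)
print(results)
```

Once the task is written, adding a third or a thousandth input costs no extra manual steps, which is where the gain in throughput comes from.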

Galaxy has a workflow tutorial for building a workflow on the platform, and this was used to formulate the basis of this article. For more information, please explore the following resources:

Creating, Editing and Importing Galaxy Workflows (Galaxy Training Materials)

Community-Driven Data Analysis Training for Biology (Cell Systems)

To create a workflow, click “Workflow” in the top panel of the Galaxy homepage, find the “Create and Import” buttons at the top right, and click “Create”. Give your workflow a name; something memorable and related to the task you wish to perform is best.

A screenshot of the steps used to create a new workflow on Galaxy.

In this example, we’ll make a workflow to flip a “digital burger”:

Bread

Onions

Cheese

Burger Patty

Bread

The annotation naming of a Galaxy workflow

After creating your workflow, you will see a blank canvas.

The blank canvas of a Galaxy workflow

You can add some annotations on the right-hand side panel to describe what this workflow does.

On the left-hand side panel, click “Input Dataset”, and you will see a marker for this task on your canvas. You can then click on this marker to add some annotation data on the right-hand side panel.

Add the input dataset tool

Then add the tool “tac reverse a file” by searching for it on the left-hand side panel.

Add the reverse a file tool and link to the input dataset

You can add annotations for this too. Now link the two steps by dragging the small arrow “>” on the load burger step onto the arrow on the tac step (labelled flip burger).

Now click on the triangle in the top right corner to “run workflow”. You will see a page with the workflow name and an upload space for loading the “burger data”. Click the vertical arrow on the right-hand side to initiate a file upload.

The file upload menu

The file upload menu on the paste input section

Select “Paste/fetch data”, then fill in the details as in the screenshots. You must add the name of the file and its type (a text file has the format txt), then paste the lines of the “digital burger” as shown above. Once filled in, click “Start”.

The run workflow window after upload set

Make sure the input file specified is the burger file you made (select it from the dropdown menu), then click “Start workflow”. Once the job is complete, you will see the output available via the history panel on the right.

the workflow complete screen

Our burger has been flipped! So now the order is:

Bread

Burger Patty

Cheese

Onions

Bread
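For readers who want to check the result outside Galaxy, the flip can be sketched in a few lines of Python. This assumes the Galaxy tool behaves like the Unix `tac` command (reversing the order of lines); the function name `tac` here is our own and is not part of Galaxy.

```python
def tac(text: str) -> str:
    """Return the lines of `text` in reverse order, like the Unix tac command."""
    return "\n".join(reversed(text.splitlines()))


# The "digital burger" from the start of this article, one ingredient per line.
burger = "Bread\nOnions\nCheese\nBurger Patty\nBread"

flipped = tac(burger)
print(flipped)
```

Only the order of the lines changes; the text within each line is left untouched.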

Congratulations on making your first workflow! Now try to expand it: restore the original order of the “burger”, add some ingredients to your input file, or try another tool to disassemble it! Share what you try out in the comments below.
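One way to think about expanding a workflow is as a chain of steps in which each step receives the previous step's output. The sketch below is an illustration under that assumption, not Galaxy's own implementation; `flip` and `run_workflow` are hypothetical names. It chains two flip steps and keeps every intermediate result, much as the history panel keeps each output.

```python
from typing import Callable, List

Lines = List[str]


def flip(lines: Lines) -> Lines:
    """Reverse the order of the lines (a stand-in for the tac step)."""
    return list(reversed(lines))


def run_workflow(lines: Lines, steps: List[Callable[[Lines], Lines]]) -> List[Lines]:
    """Feed each step's output into the next and keep every intermediate result."""
    history = [lines]
    for step in steps:
        history.append(step(history[-1]))
    return history


burger = ["Bread", "Onions", "Cheese", "Burger Patty", "Bread"]

# Two flip steps in a row: the second undoes the first.
history = run_workflow(burger, [flip, flip])
restored = history[-1]
print(restored == burger)
```

In the Galaxy editor, the equivalent would be adding a second tac step to the canvas and linking it to the output of the first.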

Be sure to try the Galaxy tutorial as well to learn a few more ways to use or embed workflows.

This article is from the free online course Making sense of genomic data: COVID-19 web-based bioinformatics.