Uploading and Formatting count data in ResistoXplorer

In this video Achal Dhariwal demonstrates how to format and upload the resistome data in ResistoXplorer.
Hello. I’m Achal Dhariwal, a PhD researcher at the Institute of Oral Biology at the University of Oslo. In this video, I’m going to familiarise you with how to format and upload your resistome count data in ResistoXplorer. So let’s quickly open ResistoXplorer: open any browser and type its address into the search bar or address bar to reach the homepage. This is what the main page of ResistoXplorer looks like. As we can see, there are three buttons, which represent the three analysis modules supported in ResistoXplorer. For most of this course, we are going to focus on the second module, which is the ARG Table module. So let’s quickly click and get into it.
This brings us to the first page, which is the Data Upload page. As you can see, this page is divided into two panels. The first one is for uploading the resistome count data, while the second panel contains publicly available example data sets for testing. Now let’s come to the first panel. As you can see, three input files are required, in tab- or comma-delimited plain text format. The first one is the ARG abundance table, which contains the read count or relative abundance information for all the ARGs identified across the samples. This is how an example ARG abundance table should look. As you can see, the first line should start with #NAME.
The ARGs are arranged in rows and the samples in columns, and each value represents the read count or abundance of a specific ARG in a specific sample. The second one is the metadata file, which contains the group information or metadata for all the samples present in the count abundance table. As you can see, this is an example of how the metadata file should look. In the first column, we have the sample names, and from the second column onward we can put the experimental factors containing the group assignment for each sample.
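The abundance table layout described above can be sketched in Python. This is only a minimal illustration, assuming a tab-delimited file whose first line starts with #NAME; the gene names, sample names, counts and the helper `parse_abundance_table` are hypothetical and are not part of ResistoXplorer.

```python
import csv
import io

# Hypothetical example table: ARGs in rows, samples in columns.
# Gene names, sample names and counts are made up for illustration.
EXAMPLE_TABLE = (
    "#NAME\tSample1\tSample2\tSample3\n"
    "geneA\t10\t0\t3\n"
    "geneB\t0\t5\t7\n"
)


def parse_abundance_table(text, delimiter="\t"):
    """Return (sample_names, counts), where counts maps ARG name -> list of values."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header = rows[0]
    if not header or not header[0].startswith("#NAME"):
        raise ValueError("first line should start with #NAME")
    samples = header[1:]
    counts = {}
    for row in rows[1:]:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"row for {row[0]!r} has the wrong number of columns")
        counts[row[0]] = [float(value) for value in row[1:]]
    return samples, counts


samples, counts = parse_abundance_table(EXAMPLE_TABLE)
print(samples)
print(counts["geneA"])
```

Passing `delimiter=","` would handle the comma-delimited variant of the same layout.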
As you can see here, the sample names in this file should match the sample names present in the count abundance table. And also, as you can see here, we have only one experimental factor, called treatment, which has two groups: amox and placebo. The third and last one is the ARG annotation table. As you can see here, we have two options: either upload the annotation information for all the ARGs present in the count table as a separate table, or directly select the database that was used during the upstream analysis of the resistome sequencing data.
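The sample-name check mentioned above can be tried locally before uploading. The following is a rough sketch, not ResistoXplorer's own validation: the metadata content, its `#NAME` header label, the sample names and the helpers `read_metadata` and `compare_sample_names` are all assumptions made for illustration. Only the factor name (treatment) and its two groups (amox, placebo) come from the video.

```python
import csv
import io

# Hypothetical metadata file: sample names in the first column,
# one experimental factor (Treatment) in the second column.
METADATA = (
    "#NAME\tTreatment\n"
    "Sample1\tamox\n"
    "Sample2\tplacebo\n"
    "Sample3\tamox\n"
)


def read_metadata(text, delimiter="\t"):
    """Return (factor_names, groups), where groups maps sample -> {factor: group}."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    factors = rows[0][1:]
    groups = {row[0]: dict(zip(factors, row[1:])) for row in rows[1:]}
    return factors, groups


def compare_sample_names(table_samples, metadata_samples):
    """List the sample names that appear in only one of the two files."""
    table_set, meta_set = set(table_samples), set(metadata_samples)
    return {
        "only_in_table": sorted(table_set - meta_set),
        "only_in_metadata": sorted(meta_set - table_set),
    }


factors, groups = read_metadata(METADATA)
# Sample names as they would appear in the header of the abundance table (made up).
table_samples = ["Sample1", "Sample2", "Sample4"]
mismatches = compare_sample_names(table_samples, groups)
print(factors)
print(mismatches)
```

Both lists in `mismatches` being empty would indicate that the two files refer to the same set of samples.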
In ResistoXplorer, we have already collected this functional annotation information from all the widely used databases, so you don’t have to collect it manually for your count data. And this is an example of how the annotation file should be formatted. As you can see, in the rows we have the ARGs, and across the columns we have the functional levels. While formatting, we need to make sure the ARG names in the count data match the ARG names present in the annotation file. Also, the first line should start with #ANNOTATION. In this example, we can see all the ARGs have been classified at two levels, the mechanism and the class.
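A quick way to see whether every ARG in the count data has an annotation is sketched below. It assumes a tab-delimited annotation table whose first line starts with #ANNOTATION, followed by the two levels mentioned in the video; the gene names, level values and helper functions are placeholders invented for this example.

```python
import csv
import io

# Hypothetical annotation table; gene names and level values are placeholders.
ANNOTATION = (
    "#ANNOTATION\tMechanism\tClass\n"
    "geneA\tmechanism_1\tclass_1\n"
    "geneB\tmechanism_2\tclass_1\n"
)


def read_annotation(text, delimiter="\t"):
    """Return (levels, annotation), where annotation maps ARG -> {level: value}."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    header = rows[0]
    if not header[0].upper().startswith("#ANNOTATION"):
        raise ValueError("first line should start with #ANNOTATION")
    levels = header[1:]
    annotation = {row[0]: dict(zip(levels, row[1:])) for row in rows[1:]}
    return levels, annotation


def unannotated_args(count_table_args, annotation):
    """Return the ARG names from the count table that have no annotation row."""
    return sorted(set(count_table_args) - set(annotation))


levels, annotation = read_annotation(ANNOTATION)
# ARG names as they would appear in the first column of the count table (made up).
missing = unannotated_args(["geneA", "geneB", "geneC"], annotation)
print(levels)
print(missing)
```

An empty `missing` list would mean that every ARG name in the count data also appears in the annotation file.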
However, if you need a more detailed text explanation of how to format these resistome data, you can always go to the Data Format page of ResistoXplorer by clicking it directly here. This will give you a detailed explanation of the formatting of each file. Once you have formatted your data, you can come back here, upload the respective files and click Submit. However, if there is any issue with the formatting of these files, an error message will appear explaining what the error is, as you can see here, since we haven’t uploaded any file. And this is showing the same error here. Now let’s look at the bottom panel.
This panel contains already pre-processed and formatted resistome count abundance tables. As you can see here, we have two example data sets that users can easily explore. A brief description of each example data set, including the experimental factors, the database used for annotation and a link to the original publication, is provided here for better understanding. Additionally, you can click on the data sets listed under the Data Type header and directly download all the underlying data in zip format.
You can unzip this folder, upload the files in the first panel and click Submit to upload and perform the analysis, or you can explore these data sets directly by picking the data set of interest and clicking the Submit button to upload the data. OK. Now let’s say we have downloaded this example data set and unzipped the folder, as you can see here. Now let’s upload this data to ResistoXplorer in the first panel. So here I am uploading the ARG abundance table containing the read count information for all the ARGs identified, and then I’m also providing the metadata file containing the group information for all the samples.
And here in the ARG annotation section, I’m selecting the ResFinder database, as I already know this data has been annotated using ResFinder. To upload this data, I’m going to click Submit, and let’s see what happens. If everything is right, we see the second page, which is the Data Inspection or Data Integrity Check page. This page summarises the results for the uploaded data. As you can see here, there are two tabs. The first one is the text summary, which contains a summary of your uploaded data.
As you can see here, we have various statistics: how many features are present in the table, how many features are present in more than two samples, how many experimental factors there are, the database used for annotation, how many functional annotation levels are present in that database, the sparsity, the composition, the maximum, minimum and average read counts per sample, as well as the mapping information, that is, how many sample names match between the abundance table and the metadata file, and how many resistome genes match between the annotation table and the count table. Next is the library size overview. It provides a graphical representation of how many read counts are present in each sample.
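Some of these summary values can be approximated by hand to build intuition. The sketch below uses made-up counts and assumes that sparsity means the fraction of zero entries in the table and that library size means the total read count per sample; how ResistoXplorer computes its own figures is not stated in the video, so treat these definitions as assumptions.

```python
from statistics import mean

# Hypothetical count table: ARG name -> read counts, one value per sample.
samples = ["Sample1", "Sample2", "Sample3"]
counts = {
    "geneA": [10, 0, 3],
    "geneB": [0, 5, 7],
    "geneC": [0, 0, 2],
}


def summarise(samples, counts):
    """Compute a few simple summary statistics for a count table."""
    values = [value for row in counts.values() for value in row]
    # Assumed definition: sparsity = fraction of entries equal to zero.
    sparsity = sum(1 for value in values if value == 0) / len(values)
    # Assumed definition: library size = sum of all ARG counts in one sample.
    library_sizes = {
        sample: sum(row[index] for row in counts.values())
        for index, sample in enumerate(samples)
    }
    sizes = list(library_sizes.values())
    return {
        "n_features": len(counts),
        "sparsity": sparsity,
        "library_sizes": library_sizes,
        "min_library_size": min(sizes),
        "max_library_size": max(sizes),
        "mean_library_size": mean(sizes),
    }


summary = summarise(samples, counts)
for key, value in summary.items():
    print(key, value)
```

The `library_sizes` dictionary holds the kind of per-sample totals that a library size plot would display.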
All of this information is very useful when we pre-process the data, which we are going to see in the next video on data filtration and data normalisation. To go there, we basically just have to click the Proceed button. So here, as we can see, everything seems OK with our data, and we can click Proceed to move on to the pre-processing of this data, which we will cover in the upcoming videos. Till then, I recommend you upload your own resistome count data or try an example data set in order to repeat this step. Thanks for watching. We’ll see you very soon. Bye-bye, and take care.

In this video, we will demonstrate how to format and upload the resistome data in ResistoXplorer. While the video is playing, feel free to follow along on your computer!

For more detailed information on how to format your own resistome count data files so that they can be uploaded to ResistoXplorer, go to the ResistoXplorer home page and click on the “Data Format” tab in the top right menu bar.

Do it yourself:

The example datasets can be downloaded by clicking on the “Downloads” button under the “Resources” tab in the menu bar on the home page. On the “Downloads” page, click on any example data set of interest (for this 3rd exercise) and the zipped folder containing all the required files will be automatically downloaded to your computer. After it has downloaded, unzip the folder so that all files are accessible for upload to ResistoXplorer.
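If you prefer to unzip the downloaded folder from a script rather than through your file manager, a minimal Python sketch is shown below. The archive and file names are hypothetical stand-ins created by the script itself so that it runs anywhere; replace `demo_zip` with the path to the archive you actually downloaded.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_dataset(zip_path, target_dir):
    """Unzip a downloaded data set and return the names of the archived files."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


# Demonstration with a stand-in archive; file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "example_dataset.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("abundance_table.txt", "#NAME\tSample1\n")
        archive.writestr("metadata.txt", "#NAME\tTreatment\n")
    extracted = extract_dataset(demo_zip, Path(tmp) / "unzipped")
    print(extracted)
```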
This article is from the free online

Exploring the Landscape of Antibiotic Resistance in Microbiomes

Created by
FutureLearn - Learning For Life
