Upload Data into AMR++
In the Use Tutorial Data section, we upload a sample data set using the Get Data module. This section is for those who are new to Galaxy and who want to get more comfortable with the Galaxy interface.
Upload Your Data
In the Upload My Own Data section, we use FileZilla to upload data via FTP. This section is for those who have large data sets (>2 GB) that they want to upload and run.
New to Galaxy?
This section should get you up to speed with Galaxy and how to upload data.
Download Sample Data
Let's get some sample data. Download the example data. The example data in this tutorial was taken from the full dataset of the human genome (hg38), along with a paired metagenome sample (SRR532663) from the Human Microbiome Project (HMP) on NCBI.
Once downloaded, unzip the sample_data.zip archive. You should see four files.
There are two ways to upload these files to Galaxy:
- Via the Galaxy Get Data module
- Via a file transfer protocol (FTP)
Currently, the Get Data module only supports file uploads smaller than 2 GB. If you wish to upload files larger than 2 GB, you will need FTP software such as FileZilla. It is not required for the sample data provided in this tutorial.
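If you are unsure which route your own files need, the size rule above can be checked with a short script. This is only a sketch: the function names are made up for illustration, reading "2 GB" as 2 * 1024**3 bytes is an assumption, and so is sending a file of exactly that size to FTP.

```python
from pathlib import Path

# Upload limit quoted in this tutorial for the Get Data module.
# Treating "2 GB" as 2 * 1024**3 bytes is an assumption.
GET_DATA_LIMIT_BYTES = 2 * 1024**3


def choose_upload_method(size_bytes, limit=GET_DATA_LIMIT_BYTES):
    """Return 'get_data' for files under the limit, otherwise 'ftp'."""
    return "get_data" if size_bytes < limit else "ftp"


def plan_uploads(folder):
    """Map each regular file in `folder` to a suggested upload method."""
    return {
        path.name: choose_upload_method(path.stat().st_size)
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    }
```

Calling plan_uploads on the folder where you extracted the sample data should suggest the Get Data route for every file, since the sample data is under the limit.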
Let's upload our data to Galaxy. First, make sure that your Galaxy server is running. Right now, you should be staring at a page similar to the following:
Click on the Get Data link circled in red, and navigate to the underlined Upload File link.
From here, we are given the options of Choosing a local file (highlighted in blue) or Choosing an FTP file (highlighted in orange). Let's choose a local file since our data is less than 2GB. If you want to upload via FTP, see the Upload My Own Data section.
Navigate to the folder where you extracted the sample data, select the appropriate datasets, and upload them.
Once selected, navigate to the Type column (circled in red) and specify the file type (highlighted in blue) for each dataset from the drop-down menu. For each fastq dataset, select the fastqsanger option, and for each fasta dataset, select the fasta option. Next, click the Start button to begin uploading the data to your Galaxy server.
If everything goes well, you should see all green. That's good! You can find the uploaded datasets on the right-hand side of the Galaxy homepage. Now, let's head over to the Run Workflow section to run the pipeline.
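When there are many files, it can help to work out the Type value for each one before opening the upload dialog. The lookup below is a rough sketch: the type names (fastqsanger, fasta) come from this tutorial, but the file extensions and the galaxy_type helper are assumptions about how your files are named.

```python
from pathlib import Path

# Extension-to-type table. The extensions are assumptions about file
# naming; the type names are the ones selected in the Type column.
GALAXY_TYPES = {
    ".fastq": "fastqsanger",
    ".fq": "fastqsanger",
    ".fasta": "fasta",
    ".fa": "fasta",
}


def galaxy_type(filename):
    """Return the Type menu entry for a file, or None if unrecognised."""
    return GALAXY_TYPES.get(Path(filename).suffix.lower())
```

Only the final extension is inspected, so a compressed file such as reads.fastq.gz returns None and would need to be handled separately.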
- OR -
Using FTP to Manage Your Data
In this section, we'll cover what you need to know to upload your own data via FTP.
If you haven't already done so, head on over to FileZilla and download their FTP client. It's free and very easy to use.
Enter FTP Credentials
Connecting to your FTP server is really easy. Navigate to the site manager (circled in red):
Click on the New Site button (circled in red) and enter a name for your FTP connection. Next, click on the Transfer Settings button (circled in orange) and select the Active radio button under Transfer mode. Go back to the General tab and enter your credentials, replacing the hostname with your own. Under User, enter firstname.lastname@example.org, and under Password, enter admin.
Once connected, we can upload some files. On my machine, I have files on my Desktop under a directory called sample data (highlighted in blue). Navigate to the folder where your data is. You will see a list of files in that directory through the FTP client interface (highlighted in red). Simply drag and drop the desired files to the root directory as illustrated below.
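If you would rather script the transfer than drag and drop, the same steps can be approximated with Python's standard ftplib module. This is a sketch, not part of the tutorial's workflow: upload_files and connect_and_upload are hypothetical helper names, the host, user and password are whatever you entered in the site manager, and set_pasv(False) is used here to mirror the Active transfer mode chosen above.

```python
from ftplib import FTP
from pathlib import Path


def upload_files(ftp, paths):
    """Upload each local file to the current remote directory.

    `ftp` is a connected, logged-in ftplib.FTP instance (or any object
    that provides the same set_pasv and storbinary methods).
    """
    ftp.set_pasv(False)  # active mode, matching the Transfer Settings step
    uploaded = []
    for path in map(Path, paths):
        with path.open("rb") as handle:
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded


def connect_and_upload(host, user, password, paths):
    """Open a connection, upload `paths`, then close the connection."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return upload_files(ftp, paths)
```

The files land in the directory the server places you in after login, which corresponds to dragging them to the root directory in FileZilla.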
Upload FTP Files
Once the transfer has completed, navigate to the Get Data module on your Galaxy homepage and select the Upload File link.
Next, click on the Choose FTP file option.
Check all the boxes for each dataset.
Next, find the Type tab in the next window and specify the file type for each file from the drop-down menu. Choose fastqsanger for your fastq datasets and fasta for your fasta datasets. That's it! You should then see the files loading in your History pane on the Galaxy homepage. Now, let's head over to the Run Workflow section to run the pipeline.
You may need to refresh the history pane to see the uploaded datasets, which you can do by clicking on the Refresh history icon directly to the right of the history pane.