
From Array Suite Wiki


Download SRA Data

The "Download SRA Data" command lets the user specify an SRA ID to download public sequencing data for use in Array Studio. The Sequence Read Archive (SRA) stores raw sequencing data from next-generation sequencing platforms.

This download method supports only next-generation sequencing (NGS) projects.


The user must first specify an output folder for the retrieved data by clicking the "Browse" button and selecting the appropriate folder location.

SRA files are automatically converted to fastq.gz files, which can be imported into Array Studio for further analysis.

Options

Combine Runs Within An Experiment: If the input consists of SRX IDs, this option combines the SRR files within each SRX into a single fastq file (SRXID.fastq.gz). When individual SRR IDs, or a combination of SRR and SRX IDs, are used as input, users need to set combineRun=False to download the files individually.

Use Aspera: If checked, files are downloaded using Aspera; if unchecked, wget is used.

Use Cluster: If checked, a job will be submitted to your server's cluster (if available) to retrieve the SRA data. If left unchecked, the server "head node" (where Array Server is running) will process the download. Downloading large numbers of files can use significant resources, so "Use Cluster" is generally recommended if a cluster is available.

Download from EBI: If checked, the download is attempted from EBI first. If a file is not found on EBI, the download is then attempted from NCBI.

Download from NCBI: If checked, files are downloaded from NCBI.
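
The source order described for "Download from EBI" amounts to a try-in-order fallback. The Python sketch below only illustrates that idea and is not Array Studio's implementation: `download_with_fallback` and the two stub fetchers are hypothetical names, and a fetcher is assumed to raise `FileNotFoundError` when an archive does not hold the requested file.

```python
def download_with_fallback(accession, fetchers):
    """Try each (name, fetch) pair in order and return (name, result)
    for the first source that has the file."""
    missing = []
    for name, fetch in fetchers:
        try:
            return name, fetch(accession)
        except FileNotFoundError:
            # This source does not have the file; move on to the next one.
            missing.append(name)
    raise FileNotFoundError(f"{accession} not found on: {', '.join(missing)}")


# Hypothetical stand-ins for real download routines (assumptions for the demo).
def fetch_from_ebi(accession):
    raise FileNotFoundError(accession)  # pretend EBI does not have the file


def fetch_from_ncbi(accession):
    return f"{accession}.sra"  # pretend NCBI returned this file name


if __name__ == "__main__":
    sources = [("EBI", fetch_from_ebi), ("NCBI", fetch_from_ncbi)]
    print(download_with_fallback("SRR000001", sources))
```

Passing the sources as an ordered list keeps the "EBI first, then NCBI" rule in one place, so checking only "Download from NCBI" corresponds to a list with a single entry.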

There is also an option to preview the records that will be retrieved prior to running the module.
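
Before running the module, it can also help to check the ID list against the Combine Runs rule described under Options. The following is a minimal Python sketch, not part of Array Studio: the function names are hypothetical, and it assumes that experiment and run accessions can be told apart by their SRX and SRR prefixes alone.

```python
import re

# Assumed accession shapes: SRX followed by digits (experiment),
# SRR followed by digits (run).
EXPERIMENT_RE = re.compile(r"^SRX\d+$")
RUN_RE = re.compile(r"^SRR\d+$")


def classify(accession):
    """Return 'experiment', 'run', or 'unknown' for one accession ID."""
    acc = accession.strip().upper()
    if EXPERIMENT_RE.match(acc):
        return "experiment"
    if RUN_RE.match(acc):
        return "run"
    return "unknown"


def suggest_combine_run(accessions):
    """Return True only when every input ID is an SRX (experiment) ID.

    Any SRR ID, or a mix of SRR and SRX IDs, yields False, which matches
    the guidance to set combineRun=False for such input.
    """
    kinds = {classify(acc) for acc in accessions}
    return kinds == {"experiment"}


if __name__ == "__main__":
    print(suggest_combine_run(["SRX123456", "SRX123457"]))
    print(suggest_combine_run(["SRX123456", "SRR1234567"]))
```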


Results

Once completed, the requested files will be downloaded to the specified directory.
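
As a quick sanity check on the results, the fastq.gz files in the output folder can be listed and their reads counted. This is a sketch under stated assumptions rather than an Array Studio feature: the function names are hypothetical, the files are assumed to end in `.fastq.gz`, and each FASTQ record is assumed to span exactly four lines.

```python
import gzip
from pathlib import Path


def count_fastq_records(path):
    """Count reads in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A partial record usually points to a truncated download.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def summarize_output_folder(folder):
    """Map each *.fastq.gz file name in the folder to its read count."""
    return {
        path.name: count_fastq_records(path)
        for path in sorted(Path(folder).glob("*.fastq.gz"))
    }


if __name__ == "__main__":
    # "downloads" is a placeholder for the folder chosen with "Browse".
    for name, n_reads in summarize_output_folder("downloads").items():
        print(f"{name}\t{n_reads} reads")
```

A read count that raises the line-count error is a reasonable cue to repeat the download for that ID before importing the file.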