OShell Online Help

From Array Suite Wiki

WARNING: Oshell has been completely migrated to work in the environment setting described in the Oshell article. Development of the subcommands described here has been superseded, and they are supported only through the end of 2013.

System Requirements

oshell runs under both Windows and Linux, in both 32- and 64-bit modes. 64-bit mode with 12 GB of RAM or more is recommended.

oshell runs directly on Windows; on Linux, Mono must be installed first, and version 2.10.9 is required. Separate articles give the instructions for installing Mono and the steps for installing oshell.
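As a rough illustration of the platform difference, the sketch below picks a launcher prefix for the current operating system. It assumes that on Linux the executable is started through the standard `mono` launcher (`mono oshell.exe ...`), which is how Mono normally runs .NET executables; this page does not state the exact invocation. The helper names `launcher_prefix` and `oshell_argv` are hypothetical and are not part of oshell.

```python
import platform


def launcher_prefix(system=None):
    """Return the argv prefix needed to start oshell.exe on the given OS.

    Assumption: on non-Windows systems the executable is run through the
    `mono` launcher; on Windows it is started directly.
    """
    system = system or platform.system()
    return [] if system == "Windows" else ["mono"]


def oshell_argv(subcommand, *args, system=None):
    """Build a full argument list such as [mono,] oshell.exe --filter ..."""
    return launcher_prefix(system) + ["oshell.exe", subcommand, *args]
```

The resulting list can be handed to `subprocess.run` once the paths have been checked.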

NGS

Build Reference Library and Gene Model

oshell requires a reference genome and a reference gene model for most analyses. By default, Omicsoft automatically downloads a compiled genome and gene model from our server, if they are available. Users can also build their own reference library and gene model with the following commands:

Building a Reference Library


COMMAND
Specify the following command:

oshell.exe --buildref oshell_Omicsoft_Dir fasta_file_name ref_lib_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

fasta_file_name

ref_lib_name

Building a Gene Model


COMMAND
Specify the following command:

oshell.exe --buildgm oshell_Omicsoft_Dir gtf_file_name ref_lib_name gene_model_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

fasta_file_name

ref_lib_name

gtf_file_name: Specify the full path of the .gtf file for the gene model. The .gtf file must be compatible with the associated reference library.

GeneModel
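The two build commands above can be assembled programmatically. The sketch below only arranges the arguments in the documented order; it assumes that the reference library should be built before the gene model that names it, which this page implies but does not state. All function names are hypothetical helpers, not part of oshell.

```python
def build_reference_argv(base_dir, fasta_file, ref_lib_name):
    """argv for: oshell.exe --buildref oshell_Omicsoft_Dir fasta_file_name ref_lib_name"""
    return ["oshell.exe", "--buildref", base_dir, fasta_file, ref_lib_name]


def build_gene_model_argv(base_dir, gtf_file, ref_lib_name, gene_model_name):
    """argv for: oshell.exe --buildgm oshell_Omicsoft_Dir gtf_file_name ref_lib_name gene_model_name"""
    return ["oshell.exe", "--buildgm", base_dir, gtf_file, ref_lib_name, gene_model_name]


def build_plan(base_dir, fasta_file, gtf_file, ref_lib_name, gene_model_name):
    """Return both commands in the assumed run order:
    reference library first, then the gene model that refers to it."""
    return [
        build_reference_argv(base_dir, fasta_file, ref_lib_name),
        build_gene_model_argv(base_dir, gtf_file, ref_lib_name, gene_model_name),
    ]
```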

Preprocess

Filter

This module can be used to preprocess FASTQ, BAM, SAM, SFF, or QSEQ files before importing. Quality encoding can be used to trim the data before filtering.


COMMAND
Specify the following command:

oshell.exe --filter oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
QualityEncoding

TrimByQuality

ReadTrimQuality

EnableLengthCutoff

LengthCutoff

EnableMaxQualityCutoff

MaxQualityCutoff

EnableAverageQualityCutoff

AverageQualityCutoff

EnablePolyRateCutoff

PolyRateCutoff

Gzip

PairedEnd

FilterPairByBothEnds

ThreadNumber

MidExtraction

This module can be used to preprocess FASTQ, FASTA, QSEQ or SFF files before importing. A Multiplex Identifier (MID) sequence that may have been added between the primer and the template-specific sequence (also referred to as "barcoding") can be stripped out using this module.


COMMAND
oshell.exe --preprocess oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

Gzip

ExtractRightEndMid

ThreadNumber

LeftEndOnly

RightEndOnly

Quality Control

Basic Statistics

The Basic Statistics module generates some simple composition statistics for the files analyzed.


COMMAND
Specify the following command:

oshell.exe --basicstats oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

CompressionMethod

Gzip

PreviewMode

MappedOnly

ThreadNumber

PerSequenceGCDistribution

SequenceLengthDistribution

Base Distribution

The NGS Base Distribution module is a quality control module for Next Generation Sequencing data. Supported file formats are FASTQ, FASTA, QSEQ, SFF, AUTO, SAM and BAM. This module is used to look at the base distribution of the reads in the raw data files. It can be used to check for uniformity between the bases, since one would expect roughly equal proportions of the four bases across the length of the read. One potential reason for unequal proportions is the inclusion of adapters in the reads (which can be stripped using the Adapter Stripping section of the module). Optionally, the user can look at the GC distribution instead of the per-base distribution (A, C, T, G %).
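To make the idea of a per-position base distribution concrete, here is a minimal sketch of the calculation on plain sequence strings. It is an illustration of the technique only, not Omicsoft's implementation; the `max_position` argument is a hypothetical option of this sketch and makes no claim about the MaxPosition control file parameter.

```python
from collections import Counter


def base_distribution(reads, max_position=None):
    """Return, for each read position, the fraction of each base observed.

    reads: iterable of sequence strings (lengths may differ).
    max_position: optional limit on how many positions to report.
    """
    per_position = []  # one Counter per read position
    for read in reads:
        for i, base in enumerate(read.upper()):
            if max_position is not None and i >= max_position:
                break
            if i == len(per_position):
                per_position.append(Counter())
            per_position[i][base] += 1
    return [
        {base: n / sum(counts.values()) for base, n in counts.items()}
        for counts in per_position
    ]


def gc_fraction(read):
    """Fraction of G and C bases in a single read (0.0 for an empty read)."""
    read = read.upper()
    return (read.count("G") + read.count("C")) / len(read) if read else 0.0
```

A strong departure from roughly equal fractions at the start or end of the reads is the kind of pattern the adapter remark above refers to.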


COMMAND
Specify the following command:

oshell.exe --basedist oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
CalculateGCDistribution

FileFormat

MaxPosition

ViewType

Gzip

PreviewMode

Quality Boxplot

The Quality BoxPlot module is a quality control module for Next Generation Sequencing data. Supported file formats are FASTQ, QSEQ, SFF, AUTO, SAM and BAM. This module is used to look at the quality score at each base position in a file (aggregated over all reads from that file). It gives the user an idea of where the quality score starts to drop off for each file. Although quality trimming is an automatic (though optional) feature of Omicsoft's mapping algorithms, it is helpful to get a general sense of the quality across all reads before beginning an experiment.
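The aggregation described above can be sketched as follows. This simplified example reports only the minimum, median and maximum score per position (a real boxplot also shows quartiles), and it assumes Phred+33 quality encoding; the encoding actually used is selected with the QualityEncoding control file parameter. The function names are hypothetical.

```python
from statistics import median


def decode_phred33(quality_string):
    """Convert a FASTQ quality string to integer scores (Phred+33 assumed)."""
    return [ord(ch) - 33 for ch in quality_string]


def per_position_summary(quality_strings):
    """Return (min, median, max) of the quality scores at each read position."""
    columns = []  # columns[i] collects the scores seen at position i
    for q in quality_strings:
        for i, score in enumerate(decode_phred33(q)):
            if i == len(columns):
                columns.append([])
            columns[i].append(score)
    return [(min(col), median(col), max(col)) for col in columns]
```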


COMMAND
Specify the following command:

oshell.exe --qboxplot oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
AdapterStripping 3'End

AdapterSequence

ExcludeUnmatched

FileFormat

QualityEncoding

MaxPosition

Gzip

PreviewMode

Sequence Duplication

This module counts the degree of duplication for every sequence in the set and creates a plot showing the relative number of sequences with different degrees of duplication. In a diverse library most sequences will occur only once in the final set. A low level of duplication may indicate a very high level of coverage of the target sequence, but a high level of duplication is more likely to indicate some kind of enrichment bias (e.g., PCR over-amplification).
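The counting step described above can be illustrated with a short sketch that maps each duplication level to the fraction of distinct sequences occurring that many times. This is an assumption about what "relative number of sequences" means here and is not Omicsoft's implementation; `duplication_profile` is a hypothetical name.

```python
from collections import Counter


def duplication_profile(sequences):
    """Map each duplication level to the fraction of distinct sequences
    that occur exactly that many times."""
    occurrences = Counter(sequences)        # sequence -> times seen
    levels = Counter(occurrences.values())  # times seen -> number of sequences
    distinct = len(occurrences)
    return {level: n / distinct for level, n in sorted(levels.items())}
```

In a diverse library, nearly all of the weight would sit at level 1.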


COMMAND
Specify the following command:

oshell.exe --seqdup oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
AdapterStripping 3'End

AdapterSequence

ExcludeUnmatched

FileFormat

CalculateGCDistribution

MaxPosition

Gzip

PreviewMode

ThreadNumber

MappedOnly

CalculateKMerPatterns

Selecting this option will generate a table containing information for K-mer (5-mer) repetitive sequences. The default value is False.

ContaminationSource=DefaultList

Specifies the use of the default contamination list which is automatically accessed by the module.

DefaultContimationListVersion

If multiple versions of the contamination list are available, the user can select the version to use. The default is the most current list version.

ContaminationListFileName

Specifies the file path of a custom contamination list to use in place of the default list. Each row of the list contains two fields, separated by a space or a tab: the first field is the possible source and the second field is the sequence.
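A custom contamination list in the two-field layout described above could be produced with a helper like the one below. The helper name is hypothetical, and the rule that a field may not contain whitespace is an assumption inferred from the space-or-tab separator, not something this page states.

```python
def format_contamination_list(entries, separator="\t"):
    """Render (source, sequence) pairs as one row per entry, two fields per row."""
    if separator not in (" ", "\t"):
        raise ValueError("fields must be separated by a space or a tab")
    rows = []
    for source, sequence in entries:
        # Assumption: whitespace inside a field would be read as a separator.
        if any(ch.isspace() for ch in source + sequence):
            raise ValueError(f"whitespace not allowed inside a field: {source!r}")
        rows.append(f"{source}{separator}{sequence}")
    return "".join(row + "\n" for row in rows)
```

The returned text can be written to a file whose path is then given as ContaminationListFileName. The entry used in any test of this helper should be treated as a placeholder, not a real contaminant sequence.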

RNA-Seq Mapping Profile

This module has all the same options as the "Map RNA-Seq Reads To Transcriptome" module, but it is used to provide a profile of the mapped regions. It returns the RNA type of each mapping from ENSEMBL (e.g., mitochondrial rRNA, protein-coding region, pseudogene, miRNA) and how many reads map to each of these types. A read is considered ambiguous if it maps to multiple RNA types (reads usually map to more than one isoform, but these isoforms usually have the same type).


COMMAND
Specify the following command:

oshell.exe --rseqprofile oshell_Omicsoft_Dir ref_lib_name_gene_model_name control_file_name

Note: The transcriptome in the command above is denoted using the ref_lib_name and gene_model_name joined by an underscore.


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name_GeneModel: Specify the transcriptome, which is the combination of the reference library and gene model ID (for example, Human.B37_RefGene).

control_file_name
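The transcriptome argument described above is a single string built from two names. The following sketch shows that composition and places it in the documented argument order for --rseqprofile; the function names are hypothetical helpers and not part of oshell.

```python
def transcriptome_name(ref_lib_name, gene_model_name):
    """Join the reference library and gene model names with an underscore."""
    return f"{ref_lib_name}_{gene_model_name}"


def rseqprofile_argv(base_dir, ref_lib_name, gene_model_name, control_file):
    """argv for: oshell.exe --rseqprofile oshell_Omicsoft_Dir <transcriptome> control_file_name"""
    return [
        "oshell.exe",
        "--rseqprofile",
        base_dir,
        transcriptome_name(ref_lib_name, gene_model_name),
        control_file,
    ]
```

With the reference library Human.B37 and the gene model RefGene, the transcriptome argument becomes Human.B37_RefGene.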


CONTROL FILE PARAMETERS (Click to see example control file)
AdapterStripping 3'End

AdapterSequence

ExcludeUnmatched

PairedEnd

FileFormat

AutoPenalty

FixedPenalty

Greedy

Use32BitMode

ExcludeNonUniqueMapping

ReportCutoff

ThreadNumber

TrimByQuality

ReadTrimSize

ReadTrimQuality

InsertSizeStandardDeviation

ExpectedInsertSize

InsertOnSameStrand

InsertOnDifferentStrand

QualityEncoding

Gzip

PreviewMode

RNA-Seq 5'->3' Trend

The "RNA-Seq 5'->3' Trend" module is a QC module that generates trend plots for the top expressing genes in a file or set of files. It can be used to determine whether there is any degradation at the 5' end or the 3' end by assessing the overall level of counts reported at either end. By default, this module reports the top 50 expressing genes; this value can be changed, and users can supply their own control genes.


COMMAND
Specify the following command:

oshell.exe --trend53 oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
AdapterStripping 3'End

AdapterSequence

ExcludeUnmatched

FileFormat

AutoPenalty

FixedPenalty

Greedy

Use32BitMode

ExcludeNonUniqueMapping

ReportCutoff

ThreadNumber

TrimByQuality

ReadTrimSize

ReadTrimQuality

QualityEncoding

Gzip

PreviewMode

BinNumber

TopCount

Insert Size Profile

The NGS "Insert Size Profile" module is a quality control module for Next Generation Sequencing data. Supported file formats are FASTQ, FASTA, and QSEQ. This module is used to look at the insert size distribution of the reads in the raw data files. It has all the same options as the "Map RNA-Seq Reads To Transcriptome" module.


COMMAND
Specify the following command:

oshell.exe --isize oshell_Omicsoft_Dir ref_lib_name_gene_model_name control_file_name

Note: The transcriptome in the command above is denoted using the ref_lib_name and gene_model_name joined by an underscore.


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name_GeneModel: Specify the transcriptome, which is the combination of the reference library and gene model ID (for example, Human.B37_RefGene).

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
RnaMode

FileFormat

AutoPenalty

FixedPenalty

Greedy

Use32BitMode

ExcludeNonUniqueMapping

ReportCutoff

ThreadNumber

TrimByQuality

ReadTrimSize

ReadTrimQuality

InsertSizeStandardDeviation

ExpectedInsertSize

InsertOnSameStrand

InsertOnDifferentStrand

QualityEncoding

Gzip

PreviewMode

ReportSummary

Manage Data

Map Illumina Reads to Genome (DNA-Seq)

The AlignDNA module allows the user to map raw sequence reads (FASTQ, FASTA, or SFF) to the genome. It returns a number of summary statistics and an NGS dataset (used for further downstream analysis like mutation generation, paired fusion gene detection, and more).


COMMAND
Specify the following command:

oshell.exe --aligndna oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
PairedEnd

FileFormat

AutoPenalty

FixedPenalty

IndelPenalty

DetectIndels

Greedy

Use32BitMode

ExcludeNonUniqueMapping

ReportCutoff

WriteReadsInSeparateFiles

MaxMiddleInsertionSize

MaxMiddleDeletionSize

MaxEndInsertionSize

MaxEndDeletionSize

MinDistalEndSize

GenerateSamFiles

ThreadNumber

GenerateAlignmentSummary

TrimByQuality

ReadTrimSize

ReadTrimQuality

InsertSizeStandardDeviation

ExpectedInsertSize

QualityEncoding

Gzip

Map Long Reads to Genome

The AlignLDNA module allows the user to map raw sequence reads (FASTQ, FASTA, QSEQ or SFF) that are longer than the standard length to the genome; it is used for aligning 454 reads, as well as Pacific Biosciences reads and newer Illumina reads. It returns a number of summary statistics and an NGS dataset (used for further downstream analysis like mutation generation, paired fusion gene detection, and more). This module is extremely useful for detecting indels in these types of reads, as the regular alignment module can only detect single indels.


COMMAND
Specify the following command:

oshell.exe --alignldna oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

Use32BitMode

SeedLength

Bandwidth

MatchScore

MismatchPenalty

OpenGapPenalty

ExtendGapPenalty

GenerateSamFiles

ThreadNumber

TrimByQuality

ReadTrimSize

ReadTrimQuality

QualityEncoding

Gzip

Map Illumina Reads to Genome (RNA-Seq)

The AlignRNA module allows the user to map raw sequence reads (FASTQ, FASTA, or QSEQ) to the genome. It returns a number of summary statistics, an NGS dataset (used for further downstream analysis like exon junction generation, paired fusion gene detection, and more), and a microarray dataset containing expression values.


COMMAND
Specify the following command:

oshell.exe --alignrna oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
PairedEnd

FileFormat

AutoPenalty

FixedPenalty

Greedy

Use32BitMode

ExcludeNonUniqueMapping

ReportCutoff

WriteReadsInSeparateFiles

GenerateSamFiles

ThreadNumber

GenerateAlignmentSummary

TrimByQuality

ReadTrimSize

ReadTrimQuality

InsertSizeStandardDeviation

ExpectedInsertSize

InsertOnSameStrand

InsertOnDifferentStrand

QualityEncoding

Gzip

ExpressionMeasurement

Map 454 Reads to Genome

The Map Long RNA-Seq Reads module allows the user to map raw sequence reads that are longer than the standard length to the genome; it is used for aligning 454 reads, as well as Pacific Biosciences reads and newer Illumina reads. It returns a number of summary statistics and an NGS dataset (used for further downstream analysis like mutation generation, paired fusion gene detection, and more). This module is extremely useful for detecting indels in these types of reads, as the regular alignment module can only detect single indels.


COMMAND
oshell.exe --alignlrna oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

Use32BitMode

ExcludeNonUniqueMapping

ReportCutoff

SeedLength

Bandwidth

MatchScore

MismatchPenalty

OpenGapPenalty

ExtendGapPenalty

GenerateSamFiles

ThreadNumber

TrimByQuality

ReadTrimSize

ReadTrimQuality

QualityEncoding

Gzip

Map Illumina Reads to Transcriptome

The AlignRNAT module allows the user to map raw sequence reads (FASTQ, FASTA, or QSEQ) to the transcriptome. It returns a number of summary statistics, an NGS dataset (used for further downstream analysis like exon junction generation, paired fusion gene detection, and more), and a microarray dataset containing expression values. This maps the reads to KNOWN transcripts, as opposed to the genome mapping module, which maps reads to the entire genome. The default transcriptome is RefSeq, although users can also supply their own transcriptome for mapping.


COMMAND
Specify the following command:

oshell.exe --alignrnat oshell_Omicsoft_Dir ref_lib_name_gene_model_name control_file_name

Note: The transcriptome in the command above is denoted using the ref_lib_name and gene_model_name joined by an underscore.


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name_GeneModel: Specify the transcriptome, which is the combination of the reference library and gene model ID (for example, Human.B37_RefGene).

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
PairedEnd

FileFormat

AutoPenalty

FixedPenalty

Greedy

Use32BitMode

ExcludeNonUniqueMapping

ReportCutoff

WriteReadsInSeparateFiles

GenerateSamFiles

ThreadNumber

GenerateAlignmentSummary

TrimByQuality

ReadTrimSize

ReadTrimQuality

InsertSizeStandardDeviation

ExpectedInsertSize

InsertOnSameStrand

InsertOnDifferentStrand

QualityEncoding

Gzip

ExpressionMeasurement

Map miRNA-Seq Reads to Transcriptome

The AlignmiRNA module allows the user to map raw miRNA sequence reads (FASTQ, FASTA, or QSEQ) to the transcriptome. It returns a number of summary statistics, an NGS dataset (used for further downstream analysis like exon junction generation, paired fusion gene detection, and more), and a microarray dataset containing expression values. This maps the reads to known miRNA transcriptomes, and the data can be summarized at the miRNA level or the sequence level, depending on the user's choice.


COMMAND
Specify the following command:

oshell.exe --alignmirna oshell_Omicsoft_Dir ref_lib_name_gene_model_name control_file_name

Note: The transcriptome in the command above is denoted using the ref_lib_name and gene_model_name joined by an underscore followed by .miRNA to indicate an miRNA specific transcriptome.


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name_GeneModel: Specify the transcriptome, which is the combination of the reference library and gene model ID followed by .miRNA (for example, Human.B37_RefGene.miRNA).

control_file_name
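Because the miRNA transcriptome name differs from the regular one only by its suffix, a small guard can catch a missing suffix before the command is launched. This is a minimal sketch; the helper names are hypothetical and the check reflects only the naming rule given in the note above.

```python
MIRNA_SUFFIX = ".miRNA"


def mirna_transcriptome_name(ref_lib_name, gene_model_name):
    """Build <ref_lib_name>_<gene_model_name>.miRNA as described in the note."""
    return f"{ref_lib_name}_{gene_model_name}{MIRNA_SUFFIX}"


def require_mirna_transcriptome(name):
    """Return the name unchanged, or raise if the .miRNA suffix is missing."""
    if not name.endswith(MIRNA_SUFFIX):
        raise ValueError(f"{name!r} does not end with {MIRNA_SUFFIX}")
    return name
```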


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

AutoPenalty

FixedPenalty

Use32BitMode

SummaryLevel

WriteReadsInSeparateFiles

GenerateSamFiles

ThreadNumber

GenerateAlignmentSummary

TrimByQuality

ReadTrimSize

ReadTrimQuality

QualityEncoding

Gzip

ExpressionMeasurement

MinimalSequenceCount

Summarize

Flag Summary Statistics

The Summarize Flag Statistics module allows the user to generate some summary statistics for the NGS data types (mapped reads). This can be run on either NGS data imported by using Omicsoft’s alignment modules, or those imported from outside mappings. This can be useful for looking at some QC metrics for your mapped reads, and is especially useful for looking at summary information for paired reads.


COMMAND
Specify the following command:

oshell.exe --flag oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

ThreadNumber

Mapping Summary Statistics

The Mapping Summary Statistics module allows the user to generate some mapping statistics for the NGS data types (mapped reads). This can be run on either NGS data imported by using Omicsoft’s alignment modules, or those imported from outside mappings. This can be useful for looking at some QC metrics for your mapped reads, and is especially useful for looking at summary information for paired reads.


COMMAND
Specify the following command:

oshell.exe --mstats oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

ThreadNumber

Coverage Summary Statistics

The Summarize Coverage Statistics module can be used to calculate the coverage of the mapping, and can be run on either data imported using Omicsoft’s mapping algorithm, or outside imported data.


COMMAND
Specify the following command:

oshell.exe --coverage oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

BaseQualityCutoff

MapQualityCutoff

MinimalIndelSize

ExcludeSingleton

ThreadNumber

GenerateReport

GenerateBedGraphFile

BinSize

Amplicon Based Coverage Statistics

The "Amplicon Based Coverage Statistics" module calculates coverage statistics only for specified regions, or "amplicons". Use it if you ran an experiment on only a specific set of genes or an area of a chromosome and want to see the coverage statistics for only those regions (or if you are particularly interested in a set of genes or regions and want to see their coverage statistics).


COMMAND
Specify the following command:

oshell.exe --amplicon oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
MapQualityCutoff

ExcludeSingleton

ThreadNumber

AmpliconFileName

Target Sequencing Coverage Statistics


COMMAND
Specify the following command:

oshell.exe --target oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
MapQualityCutoff

ExcludeSingleton

ThreadNumber

CustomCollectionFile

Summarize - Mutation Data

The Summarize Mutation Data module can be used to generate a new report with frequencies for each mutation in the RNA-Seq or DNA-Seq data (SAM files or BAM files). This generates a mutation dataset that can be used for further downstream analysis, potentially along with a coverage dataset.


COMMAND
Specify the following command:

oshell.exe --mutation oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

BaseQualityCutoff

MapQualityCutoff

MinimalIndelSize

ThreadNumber

ExcludeSingleton

MinimalTotalHit

MinimalMutationHit

MinimalMutationFrequency

LeftExclusion

RightExclusion

Summarize - SNP Data

The Summarize SNP Data module can be used to generate a new SNP report based on the imported RNA-Seq or DNA-Seq data (SAM files or BAM files). This generates a SNP dataset that can be used for further downstream analysis, potentially along with a coverage dataset.


COMMAND
Specify the following command:

oshell.exe --snp oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
BaseQualityCutoff

MapQualityCutoff

ExcludeSingleton

ThreadNumber

MinimalTotalHit

ScoreCutoff

MaximalScoreRatio

Report Gene/Transcript Counts

The Report Gene/Transcript Counts module reports either the gene counts or transcript counts for an already imported NGS dataset. This might be for cases where the user did not have Array Studio count the RNA-Seq alignments on import, or a case where they might want to use a different counting method. Methods include TPM (Transcripts per million expression value) or RPKM (reads per kilobase of exon model per million mapped reads) at either the Genome level or Transcript level.
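For reference, the two measures named above are commonly defined as in the sketch below. These are the standard textbook definitions; whether oshell applies exactly these formulas (for example, how it determines feature length) is not stated on this page, so treat the code as illustrative.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def tpm(read_counts, lengths_bp):
    """Transcripts per million for a list of features.

    Each count is first divided by the feature length, then the rates are
    rescaled so that they sum to one million.
    """
    rates = [count / length for count, length in zip(read_counts, lengths_bp)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]
```

Unlike RPKM, TPM values always sum to the same total within a sample, which makes proportions easier to compare across samples.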


COMMAND
Specify the following command:

oshell.exe --count oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
FileFormat

ExpressionMeasurement

ThreadNumber

Report Exon/Exon Junction Counts


COMMAND
oshell.exe --exoncount oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
ThreadNumber

ReportExonCounts

ReportExonJunctionCounts

ExcludeSingletons

RpkmOption

Report Exon Junctions

The Report Exon Junctions module can be used to generate a new dataset, with counts for each exon junction from the aligned data.


COMMAND
Specify the following command:

oshell.exe --exonjunc oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
GenerateReport

GenerateBedFile

MinimalHit

ThreadNumber

Merge Summary Files

The Merge Summary Files module merges multiple fusion summary files together, so that batches of summary files can be combined into a single report.


COMMAND
Specify the following command:

oshell.exe --msum oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
SummaryType

ReportUnannotatedFusion

MinimalHit

FilterBy

DefaultFilterListVersion

FilterGeneListFileName

FilterGeneFamilyFileName

Fusion

Map Fusion Reads (Single End Illumina)

oshell is designed to detect and align fusion junction-spanning reads to the genome directly. It can be applied to unmapped reads after regular single end or paired end alignment.


COMMAND
Specify the following command:

oshell.exe --semap oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir: specify the base directory of the oshell program. This defines the base directory of index files (under subfolder ReferenceLibrary), temporary files (under subfolder Temp), and annotation files (under subfolder Annotation). It is recommended to fix this base directory in order to avoid redownloading of reference genome files.

ref_lib_name

GeneModel

control_file_name
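The base directory layout described for oshell_Omicsoft_Dir can be checked before a long run. The sketch below only computes the three documented subfolder paths and reports which ones are absent; the function names and the dictionary keys are hypothetical, and whether oshell creates missing subfolders on its own is not stated here.

```python
from pathlib import Path

# Subfolder names taken from the oshell_Omicsoft_Dir description.
SUBFOLDERS = {
    "index": "ReferenceLibrary",
    "temp": "Temp",
    "annotation": "Annotation",
}


def omicsoft_layout(base_dir):
    """Map each role to its expected path under the base directory."""
    base = Path(base_dir)
    return {role: base / name for role, name in SUBFOLDERS.items()}


def missing_subfolders(base_dir):
    """Return the names of expected subfolders that do not exist yet."""
    layout = omicsoft_layout(base_dir)
    return sorted(path.name for path in layout.values() if not path.is_dir())
```

Reusing the same base directory across runs keeps the ReferenceLibrary contents in place, which is the reason the description recommends fixing it.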


CONTROL FILE PARAMETERS (Click to see example control file)
RnaMode

SearchNovelExonJunction

FileFormat

AutoPenalty

FixedPenalty

IndelPenalty

DetectIndels

MaxMiddleInsertionSize

MaxMiddleDeletionSize

MaxEndInsertionSize

MaxEndDeletionSize

MinDistalEndSize

Use32BitMode

ThreadNumber

TrimByQuality

ReadTrimSize

ReadTrimQuality

QualityEncoding

Gzip

MinimalFusionAlignmentLength

FilterUnlikelyFusionReads

MinimalFusionSpan

MinimalHit

ReportUnannotatedFusion

OutputFusionReads

FusionReportCutoff

NonCanonicalSpliceJunctionPenalty

FilterBy

DefaultFilterListVersion

FilterGeneListFileName

FilterGeneFamilyFileName

Report Fusion Genes (Paired End)

oshell can also detect and align fusion junction-spanning reads to the genome directly. For paired-end reads, additional fusion information can be extracted from read pairs whose two ends align uniquely to different genes. oshell includes a simple module that takes a set of aligned BAM or SAM files and detects potential fusions based on discordant read pairs.


COMMAND
Specify the following command:

oshell.exe --pereport oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
RnaMode

MinimalHit

ReportUnannotatedFusion

OutputFusionReads

FusionReportCutoff

FileFormat

FilterBy

DefaultFilterListVersion

FilterGeneListFileName

FilterGeneFamilyFileName

Annotation

Annotate Mutation Report

Mutation annotation can be performed on a table report (e.g., the table output from the mutation module). The table file is expected to have at least four columns: "ID", "Chromosome", "Position" and "Mutation". The column names can be customized in the control file.


COMMAND
Specify the following command:

oshell.exe --annotation oshell_Omicsoft_Dir ref_lib_name gene_model_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

GeneModel

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
DbsnpVersion

CdsOnly
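Before running the annotation, the input table can be checked for the four expected columns. The sketch below assumes a tab-delimited table with a header row, which this page does not specify; the `required` argument allows for customized column names, and `missing_columns` is a hypothetical helper.

```python
import csv
import io

REQUIRED_COLUMNS = ("ID", "Chromosome", "Position", "Mutation")


def missing_columns(table_text, required=REQUIRED_COLUMNS, delimiter="\t"):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(table_text), delimiter=delimiter)
    header = next(reader, [])  # empty input yields an empty header
    return [name for name in required if name not in header]
```

An empty result means that all required columns are present.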

Manage Files

Subset Files

The Subset NGS data module can be used to subset any NGS read data. This can be accomplished by sub-setting references and/or observations. In most cases, references will be chromosomes. The resulting dataset will be a subset of the selected dataset.


COMMAND
Specify the following command:

oshell.exe --subset oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
Region

FileFormat

OutputFileFormat

Export Alignments

The Export module can be used to export NGS Read data to a variety of output formats, including BAM and SAM.


COMMAND
Specify the following command:

oshell.exe --export oshell_Omicsoft_Dir ref_lib_name control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ref_lib_name

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
Format

Merge Files

The Merge Data command is used to merge multiple NGS files, either into a single file or by group. This can be useful if the NGS files were previously split by chromosome or some other factor. File formats for input include "Binary sequence Alignment/Mapping" (BAM) and "Sequence Alignment/Mapping" (SAM) formats.


COMMAND
Specify the following command:

oshell.exe --merge oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
SourceFormat

SortByCoordinate

MergeByGroup

TargetFormat

Convert Files

The Convert command allows the user to specify a set of input files, and convert those files to a set of target file formats. Source file formats include BAM, SAM, FASTQ, FASTA, and SOAP.


COMMAND
Specify the following command:

oshell.exe --convert oshell_Omicsoft_Dir control_file_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

control_file_name


CONTROL FILE PARAMETERS (Click to see example control file)
SourceFormat

TargetFormat

Resolver Migration

Resolver Convert


COMMAND
Specify the following command:

oshell.exe --resolver_convert oshell_Omicsoft_Dir ceiba_input_folder omicsoft_object_folder


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

ceiba_input_folder: Specify the ceiba input folder.

omicsoft_object_folder

Resolver Merge


COMMAND
Specify the following command:

oshell.exe --resolver_merge oshell_Omicsoft_Dir omicsoft_object_folder omicsoft_project_folder


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

omicsoft_object_folder

omicsoft_project_folder: Specify the omicsoft project folder.

Omic Modules

Executing an Omic Script


COMMAND
Specify the following command:

oshell.exe --runscript oshell_Omicsoft_Dir script_name


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

script_name: Specify the name of the script to be executed.


Exporting

Exporting an Object

Syntax 1:

Using the syntax in this command will allow the user to export an Omicsoft Object file and save it as a text file in a user specified location.


COMMAND
Specify the following command:

oshell.exe --exportobj oshell_Omicsoft_Dir omicsoft_object_file omicsoft_output_file


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

omicsoft_object_file

omicsoft_output_file

Syntax 2:

Using the syntax in this command will allow the user to export an Omicsoft Object file along with the associated design file.


COMMAND
Specify the following command:

oshell.exe --exportobj oshell_Omicsoft_Dir omicsoft_object_file omicsoft_output_file omicsoft_output_design_file


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

omicsoft_object_file

omicsoft_output_file

omicsoft_output_design_file

Syntax 3:

Using the syntax in this command will allow the user to export an Omicsoft Object file along with the associated Design file and Annotation file.


COMMAND
Specify the following command:

oshell.exe --exportobj oshell_Omicsoft_Dir omicsoft_object_file omicsoft_output_file omicsoft_output_design_file omicsoft_output_annotation_file


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

omicsoft_object_file

omicsoft_output_file

omicsoft_output_design_file

omicsoft_output_annotation_file: Specifying this string in the command will cause the Annotation file to be saved as a text file in the base directory.

Special usage:

"SKIP" can be used in place of an output file name to exclude that output from the export: if an output file name is set to "SKIP", that file is not exported. For example, the following command exports only the design data of the microarray object file.


COMMAND
Specify the following command:

oshell.exe --exportobj oshell_Omicsoft_Dir omicsoft_object_file SKIP omicsoft_output_design_file


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

omicsoft_object_file

SKIP: Specifying "SKIP" will prevent the Object file from being saved as a text file in the base directory.

omicsoft_output_design_file
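The three syntaxes and the SKIP placeholder can be covered by one helper that fills in "SKIP" for any output that is not wanted. This sketch assumes that the object file always follows the base directory, as in Syntax 1 to 3, and that unused trailing outputs can simply be left off; the function name is hypothetical.

```python
SKIP = "SKIP"


def export_object_argv(base_dir, object_file, data_out=None, design_out=None,
                       annotation_out=None):
    """Build an --exportobj argument list, writing SKIP for omitted outputs.

    Trailing outputs that are not requested are dropped, so the result
    matches Syntax 1, 2 or 3 depending on the last output given.
    """
    outputs = [data_out, design_out, annotation_out]
    while outputs and outputs[-1] is None:
        outputs.pop()
    if not outputs:
        raise ValueError("at least one output file is required")
    return ["oshell.exe", "--exportobj", base_dir, object_file] + [
        SKIP if out is None else out for out in outputs
    ]
```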

Exporting a Project

Syntax 1:

This command will allow the user to export an Omicsoft Project file to a specified output folder in the form of multiple text files.


COMMAND
Specify the following command:

oshell.exe --exportprj oshell_Omicsoft_Dir omicsoft_project_file omicsoft_output_folder


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

omicsoft_project_file

omicsoft_output_folder

Syntax 2:

This command will allow the user to export an Omicsoft Project file to a specified output folder in the form of multiple text files. It also lets the user specify whether to export the chip annotation data associated with the project.


COMMAND
Specify the following command:

oshell.exe --exportprj oshell_Omicsoft_Dir omicsoft_project_file omicsoft_output_folder output_annotation_boolean_value


COMMAND LINE PARAMETERS
oshell_Omicsoft_Dir

omicsoft_project_file

omicsoft_output_folder

output_annotation_boolean_value: Specify whether or not you want to have the chip annotation data exported as part of the project by specifying "TRUE" or "FALSE".
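Both --exportprj syntaxes differ only in the optional final value, so they can be produced by one helper. In this sketch, a Python boolean is translated into the "TRUE" or "FALSE" string described above, and leaving it unset yields Syntax 1; the function name is hypothetical.

```python
def export_project_argv(base_dir, project_file, output_folder, export_annotation=None):
    """Build an --exportprj argument list.

    export_annotation: None omits the value (Syntax 1); True or False
    appends "TRUE" or "FALSE" (Syntax 2).
    """
    argv = ["oshell.exe", "--exportprj", base_dir, project_file, output_folder]
    if export_annotation is not None:
        argv.append("TRUE" if export_annotation else "FALSE")
    return argv
```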