OmicSoft Suite v12 - Oshell Installation
From Array Suite Wiki

Overview
Oshell.exe is a .NET application that can also be run on Linux using Mono. This article introduces Oshell, describes its installation, and links to wiki pages on typical usage.
Oshell/OmicSoft Project Environment
The Oshell environment is a project-oriented analysis environment that contains popular analysis modules for data generated on sequencing and microarray platforms. Each project in the environment is associated with its own data objects and analysis modules. Comprehensive data analysis pipelines can be constructed as projects in the environment in a user-friendly fashion. A pipeline is written and executed in OmicScript, a brief script format that specifies data objects and run parameters. Data objects can be passed directly to their corresponding downstream analysis modules.
An OmicSoft project is
- A collection of data objects (NGS object, Omics object, table, and list)
  - NGS data is a collection of BAM file links. BAM files are loaded into the software only when necessary. Multiple projects can share the same BAM file.
  - Omics data can be any result table combined with sample design and feature (e.g. gene) annotation, such as gene expression or CNV results.
  - A table is any Excel-like table, such as a sequence alignment report.
  - A list can be a list of IDs (e.g. genes). It can be used to filter results in Omics data and tables.
- An environment for analysis
  - Analysis runs on one, multiple, or a subset of objects
  - Analysis steps/scripts are tracked
- An entity sharable on the server
Installation
Because all of its analysis modules are implemented directly, the Oshell environment can be installed and run without depending on other bioinformatics software.
Please note that Oshell v12 is not compatible with OmicSoft Server 11.7 or earlier.
Install Oshell v12 on Windows
Oshell is written in C#, and .NET on Windows is its native platform. To install Oshell:
- Create a folder named "Oshell"
- Download https://resources.omicsoft.com/software_update/OmicSoftServiceUpdater.exe and save it to the "Oshell" folder
- In the "Oshell" folder, create an empty file named oshell.exe (note that the file extension must be .exe)
- Double-click OmicSoftServiceUpdater.exe; all software binaries will be downloaded automatically into the "Oshell" folder
- Supported operating system: Windows 10
Install Oshell v12 on Ubuntu 20
Install Mono
Install Mono 6.12 from the official repository for Ubuntu (https://www.mono-project.com/download/stable/#download-lin-ubuntu). Installing Mono by compiling it from sources is no longer necessary.
Add the Mono repository to your system:
$ sudo apt-get install gnupg ca-certificates
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF
$ echo "deb https://download.mono-project.com/repo/ubuntu stable-focal main" | sudo tee /etc/apt/sources.list.d/mono-official-stable.list
$ sudo apt-get update
Install Mono 6 from repository:
$ sudo apt-get install mono-complete
$ which mono
/usr/bin/mono
$ mono --version
Mono JIT compiler version 6.12.0.107 (tarball Wed Dec 9 21:44:58 UTC 2020)
...
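Because the remaining steps assume Mono 6.12, it can be useful to check the reported version from a script. The following is a minimal shell sketch: the function names mono_version_from_banner and is_mono_6_12 are hypothetical helpers (not part of Mono or Oshell), and the banner format is assumed to match the sample output shown above.

```shell
#!/bin/sh
# Extract the version number from the banner printed by `mono --version`.
# The banner is read from stdin, so the function works without Mono installed.
mono_version_from_banner() {
    sed -n 's/^Mono JIT compiler version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed only if the given version string belongs to the 6.12 series.
is_mono_6_12() {
    case "$1" in
        6.12|6.12.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (requires Mono on PATH):
#   version="$(mono --version | mono_version_from_banner)"
#   is_mono_6_12 "$version" || echo "unexpected Mono version: $version" >&2
```

Reading the banner from stdin keeps the parsing step separate from the call to mono, which makes the check easy to try against saved output.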
Create symlinks for the Mono 6 directories (this is needed for backward compatibility):
$ sudo mkdir /opt/mono-6.12.0
$ sudo mkdir /opt/mono-6.12.0/bin
$ sudo ln -s /usr/bin/mono /opt/mono-6.12.0/bin/mono
$ sudo ln -s /usr/bin/mono-sgen /opt/mono-6.12.0/bin/mono-sgen
$ sudo ln -s /usr/bin/cert-sync /opt/mono-6.12.0/bin/cert-sync
$ sudo ln -s /usr/bin/certmgr /opt/mono-6.12.0/bin/certmgr
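The mkdir and ln -s commands above report errors if the directories or links already exist, for example when the steps are repeated on the same machine. A re-runnable variant might look like the sketch below; link_mono_tools is a hypothetical helper name, and the tool list simply mirrors the four links created above.

```shell
#!/bin/sh
# Create (or refresh) the compatibility symlinks for the Mono tools.
#   $1 = directory holding the real binaries (e.g. /usr/bin)
#   $2 = compatibility prefix (e.g. /opt/mono-6.12.0)
link_mono_tools() {
    # -p: no error if the directory already exists
    mkdir -p "$2/bin" || return 1
    for tool in mono mono-sgen cert-sync certmgr; do
        # -f: replace an existing link instead of failing
        ln -sf "$1/$tool" "$2/bin/$tool" || return 1
    done
}

# Usage (in a root shell, since /opt is normally not writable by regular users):
#   link_mono_tools /usr/bin /opt/mono-6.12.0
```

Passing both directories as arguments keeps the sketch free of hard-coded paths, so it can be tried against a scratch directory first.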
Install zlib-dev
sudo apt-get install zlib1g-dev
Install Oshell
Create Oshell installation directory
$ sudo mkdir /opt/oshell
$ cd /opt/oshell
$ sudo wget -c https://resources.omicsoft.com/software_update/OmicSoftServiceUpdater.exe
$ sudo touch oshell.exe
To run OmicSoft Server as a non-privileged user (ubuntu, not root), that user must be made owner of all OmicSoft-related folders:
$ sudo chown -R ubuntu:ubuntu /opt/oshell/
Run Omicsoft Service Updater
$ mono ./OmicSoftServiceUpdater.exe
Check Oshell was installed successfully
$ cd /opt/oshell
$ mono ./oshell.exe --version
OShell version=12.1.X.X
Install Oshell on Amazon Linux 2022
Amazon Linux 2022 is the next generation of Amazon Linux from AWS. It is still in preview, see https://docs.aws.amazon.com/linux/al2022/ug/what-is-amazon-linux.html for more details.
Install Mono
Install Mono 6.12 from the official repository for CentOS/RHEL 7 (https://www.mono-project.com/download/stable/#download-lin-centos). Installing Mono by compiling it from sources is no longer necessary.
Add the Mono repository to your system:
$ sudo rpmkeys --import "http://keyserver.ubuntu.com/pks/lookup?op=get&search=0x3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF"
$ sudo su -c 'curl https://download.mono-project.com/repo/centos7-stable.repo | tee /etc/yum.repos.d/mono-centos7-stable.repo'
Install Mono 6 from repository:
$ sudo yum install mono-complete
$ which mono
/usr/bin/mono
$ mono --version
Mono JIT compiler version 6.12.0.107 (tarball Wed Dec 9 21:44:58 UTC 2020)
...
Create symlinks for the Mono 6 directories (this is needed for backward compatibility):
$ sudo mkdir /opt/mono-6.12.0
$ sudo mkdir /opt/mono-6.12.0/bin
$ sudo ln -s /usr/bin/mono /opt/mono-6.12.0/bin/mono
$ sudo ln -s /usr/bin/mono-sgen /opt/mono-6.12.0/bin/mono-sgen
$ sudo ln -s /usr/bin/cert-sync /opt/mono-6.12.0/bin/cert-sync
$ sudo ln -s /usr/bin/certmgr /opt/mono-6.12.0/bin/certmgr
Install Oshell
Create Oshell installation directory
$ sudo mkdir /opt/oshell
$ cd /opt/oshell
$ sudo wget -c https://resources.omicsoft.com/software_update/OmicSoftServiceUpdater.exe
$ sudo touch oshell.exe
To run OmicSoft Server as a non-privileged user (ec2-user, not root), that user must be made owner of all OmicSoft-related folders:
$ sudo chown -R ec2-user:ec2-user /opt/oshell/
Run Omicsoft Service Updater
$ mono ./OmicSoftServiceUpdater.exe
Check Oshell was installed successfully
$ cd /opt/oshell
$ mono ./oshell.exe --version
OShell version=12.1.X.X
To install Oshell on the older Amazon Linux 2, see the following topics in Amazon Linux 2 Kernel 5.10 Array Server AMI Setup Notes:
- Install Mono 6.12.0.122
- Add appropriate SSL certificates
- Install Oshell
Install Oshell on MacOS
Oshell is not officially supported on MacOS.
Getting Started
Check Oshell Version
Get Oshell version
$ mono oshell.exe
You will get something like:
--------------------------------------------------------------------------------
Version: 12.1.0.10
Analysis mode not specified
--------------------------------------------------------------------------------
Keep updated
Users can update Oshell to the latest version at any time using OmicSoftServiceUpdater.
$ mono OmicSoftServiceUpdater.exe
Run OmicScript in Oshell
If you have an OmicScript ready, it can be executed by
mono oshell.exe --runscript Base_Dir Script_path Temp_Dir Mono_Path > PathToRun.log
where
- Base_Dir is the path to Oshell base directory where the ReferenceLibrary folder should be located, e.g.
/opt/omicsoft
Note, this is equivalent to the OmicsoftDirectory in ArrayServer.cfg
- Script_path is the path to the oshell script, e.g.
/opt/omicsoft/test/run.oscript
- Temp_Dir is the path to a directory storing temporary files, e.g.
/scratch
- Mono_Path is the path to the mono executable, which Oshell remembers and reuses during the run, e.g.
/opt/omicsoft/mono/mono
- PathToRun.log is the path to the log file recording all logs, e.g.
/opt/omicsoft/test/run.oscript.log
Note: The mono command is not required in Windows OS.
If running on a machine with Array Studio or ArrayServer, Base_Dir and Temp_Dir can point to the existing directories (i.e. there is no need to set up a second base directory for Oshell to hold separate genome references, gene models, etc.).
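The --runscript call described above can be wrapped in a small shell function so that the four positional arguments are always passed in the right order. This is only a sketch: run_oscript, OSHELL_EXE, and DRY_RUN are hypothetical names introduced here, the default /opt/oshell/oshell.exe location is taken from the Linux installation steps on this page, and writing the log next to the script follows the run.oscript.log example above rather than any Oshell requirement.

```shell
#!/bin/sh
# Assumed install location; override by exporting OSHELL_EXE.
: "${OSHELL_EXE:=/opt/oshell/oshell.exe}"

# run_oscript Base_Dir Script_path Temp_Dir Mono_Path
# Runs the script and writes the log to "<Script_path>.log".
# With DRY_RUN=1 the command is only printed, not executed.
run_oscript() {
    if [ "$#" -ne 4 ]; then
        echo "usage: run_oscript Base_Dir Script_path Temp_Dir Mono_Path" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$4 $OSHELL_EXE --runscript $1 $2 $3 $4 > $2.log"
        return 0
    fi
    # Mono_Path is used both to launch Oshell and as the last argument.
    "$4" "$OSHELL_EXE" --runscript "$1" "$2" "$3" "$4" > "$2.log"
}

# Usage:
#   DRY_RUN=1 run_oscript /opt/omicsoft /opt/omicsoft/test/run.oscript /scratch /usr/bin/mono
```

The dry-run mode makes it possible to review the exact command line before starting a long job.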
In the section below, we will provide more details about How to write OmicScript.
Build genome reference index and gene model
Most NGS functions in Oshell require a reference genome and a gene model to be built before the function itself is run. The index needs to be generated only once for each reference. By default, the first time a job uses a given reference and gene model, the program automatically downloads a precompiled genome and gene model.
Users must specify the correct names for the reference genome and gene model. See A list of compiled genome and gene model from OmicSoft. For example, running an alignment with Human.B37.3 and the RefGene model using the OmicScript for Alignment downloads Human.B37.3 and the RefGene model to your local folder. You will then find the following folders under the Base_Dir:
Base_Dir
--ReferenceLibrary
---- Human.B37.3.dreflib1
---- Human.B37.3.gindex1
---- Human.B37.3_GeneModels
---- ---- RefGene.gmodel2
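To find out whether a reference and gene model are already present before starting a job, the layout above can be checked from the shell. The sketch below assumes the entry names and suffixes (.dreflib1, .gindex1, .gmodel2) exactly as they appear in the listing; other Oshell versions or references may use different suffixes, and reference_ready is a hypothetical helper name.

```shell
#!/bin/sh
# reference_ready Base_Dir Reference GeneModel
# Succeeds only if all three entries from the listing above exist.
reference_ready() {
    ref_lib="$1/ReferenceLibrary"
    [ -e "$ref_lib/${2}.dreflib1" ] &&
    [ -e "$ref_lib/${2}.gindex1" ] &&
    [ -e "$ref_lib/${2}_GeneModels/${3}.gmodel2" ]
}

# Usage:
#   if reference_ready /opt/omicsoft Human.B37.3 RefGene; then
#       echo "reference and gene model found"
#   else
#       echo "not found; Oshell will download them on first use"
#   fi
```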
Users can also build their own reference library. The recommended way is to use Oshell --runscript with the OmicScript functions BuildReferenceLibrary and BuildGeneModel; see the example below.
To use the command line directly instead, please read Build Reference Library and Gene Model through Oshell subcommand.
OmicScript
If you have ArrayStudio software, please read Generate and run OmicScript in ArrayStudio GUI. Other users can write OmicScript based on our OmicScript Collection. We will provide some examples below.
OmicScript to build reference index and gene model
Begin BuildReferenceLibrary /Namespace=NgsLib;
Reference Reference_library_id;
Files "/pathToFile/reference.fa";
Options /cDNA=False /ReverseComplement=False /Build64BitIndex=True /Build32BitIndex=False /Species=Unspecified /NcbiBuild=1.0;
End;

Begin BuildGeneModel /Namespace=NgsLib;
Reference Reference_library_id;
GeneModel Gene_model_id;
Files "/pathToFile/genemodel.gtf";
Options /AppendChr=False /BuildGeneLevelAnnotation=True /BuildTranscriptLevelAnnotation=True;
End;
Save the above script as buildIndex.oscript and run it using
mono oshell.exe --runscript Base_Dir Script_path/buildIndex.oscript Temp_Dir Mono_Path
OmicScript for OmicSoft Alignment
Details about OmicSoft Aligner (OSA) are in the following publication:
We have migrated OSA to the Oshell environment. Because Oshell is a project-based environment, as described at the top of this page, the RNA-Seq alignment function MapRnaSeqReadsToGenome has to be wrapped between NewProject (which creates the environment) and SaveProject and CloseProject (which save and close the environment). This creates a project in which the alignment is performed and where output files are managed:
Begin NewProject;
File "/test/omicsoft/AlignmentProject.osprj";
Options /Distributed=True;
End;

Begin MapRnaSeqReadsToGenome /Namespace=NgsLib;
Files "/pathToFile/SampleA_1.fastq.gz
/pathToFile/SampleA_2.fastq.gz
/pathToFile/SampleB_1.fastq.gz
/pathToFile/SampleB_2.fastq.gz";
Reference Human.B37.3;
GeneModel RefGene;
Trimming /Mode=TrimByQuality /ReadTrimQuality=2;
Options /ParallelJobNumber=2 /PairedEnd=True /FileFormat=AUTO /AutoPenalty=True /FixedPenalty=2 /Greedy=false /IndelPenalty=2 /DetectIndels=False /MaxMiddleInsertionSize=10 /MaxMiddleDeletionSize=10 /MaxEndInsertionSize=10 /MaxEndDeletionSize=10 /MinDistalEndSize=3 /ExcludeNonUniqueMapping=False /ReportCutoff=10 /WriteReadsInSeparateFiles=True /OutputFolder="/test/omicsoft/AlignmentProject/BAMOutput" /GenerateSamFiles=False /ThreadNumber=6 /InsertSizeStandardDeviation=40 /ExpectedInsertSize=300 /InsertOnSameStrand=False /InsertOnDifferentStrand=True /QualityEncoding=Automatic /CompressionMethod=Gzip /Gzip=True /SearchNovelExonJunction=True /ExcludeUnmappedInBam=False;
Output Alignment;
End;

Begin SaveProject;
Project AlignmentProject;
File "/test/omicsoft/AlignmentProject.osprj";
End;

Begin CloseProject;
Project AlignmentProject;
End;
Save the above script as Alignment.oscript and run it using
mono oshell.exe --runscript Base_Dir Script_path/Alignment.oscript Temp_Dir Mono_Path
When Oshell is run in standalone mode on a single workstation, multiple alignment or summary jobs are spawned automatically so that each job occupies one process using multiple threads. Here, with /ParallelJobNumber=2 /ThreadNumber=6, two samples run simultaneously and each uses 6 threads.
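With the settings above (two parallel jobs with six threads each), up to 2 × 6 = 12 threads can be busy at the same time. Before choosing these values, it may help to compare that product with the number of cores on the workstation. The sketch below does only this arithmetic; total_threads and fits_on_host are hypothetical helper names, and reading the core count with getconf _NPROCESSORS_ONLN is an assumption that holds on typical Linux systems.

```shell
#!/bin/sh
# total_threads ParallelJobNumber ThreadNumber
# Prints the peak number of threads used by all parallel jobs together.
total_threads() {
    echo $(( $1 * $2 ))
}

# fits_on_host ParallelJobNumber ThreadNumber [Cores]
# Succeeds if the peak thread count does not exceed the core count.
# Cores defaults to the number of online processors on this machine.
fits_on_host() {
    cores="${3:-$(getconf _NPROCESSORS_ONLN)}"
    [ "$(total_threads "$1" "$2")" -le "$cores" ]
}

# Usage:
#   total_threads 2 6
#   fits_on_host 2 6 || echo "more threads requested than cores available" >&2
```

Exceeding the core count is not necessarily an error; the check is only a quick way to spot settings that oversubscribe the machine.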
For details about each parameter, please read the articles MapRnaSeqReadsToGenome, NewProject, SaveProject and CloseProject.
OmicScript for FusionMap
Details about FusionMap are in the following publication:
We have migrated FusionMap to the Oshell environment as the MapFusionReads function. Because Oshell is a project-based environment, as described at the top of this page, the MapFusionReads function has to be wrapped between NewProject (which creates the environment) and SaveProject and CloseProject (which save and close the environment). This creates a project in which the alignment is performed and where output reports are managed:
Begin NewProject;
File "/test/omicsoft/FusionDetection.osprj";
Options /Distributed=True;
End;

Begin MapFusionReads /Namespace=NgsLib;
Files "/pathToData/Illumina.Paired.1.fastq.gz
/pathToData/Illumina.Paired.2.fastq.gz";
Reference Human.B37.3;
GeneModel RefGene;
Trimming /Mode=TrimByQuality /ReadTrimQuality=2;
Options /FusionVersion=2 /ParallelJobNumber=4 /PairedEnd=False /RnaMode=True /FileFormat=BAM /AutoPenalty=True /FixedPenalty=2 /OutputFolder="/ouput/xxxx" /MaxMiddleInsertionSize= /ThreadNumber=2 /QualityEncoding=Automatic /CompressionMethod=None /Gzip=False /FilterUnlikelyFusionReads=False /FullLengthPenaltyProportion=8 /OutputFusionReads=True /MinimalHit=4 /MinimalFusionAlignmentLength=0 /MinimalFusionSpan=0 /FusionReportCutoff=1 /ReportUnannotatedFusion=False /NonCanonicalSpliceJunctionPenalty=2 /RealignToGenome=True;
Output FusionDetection;
End;

Begin ExportView;
Project FusionDetection;
OutputFolder "/test/omicsoft/FusionDetection/Results";
End;

Begin SaveProject;
Project FusionDetection;
File "/test/omicsoft/FusionDetection.osprj";
End;

Begin CloseProject;
Project FusionDetection;
End;
Also Read:
OmicScript pipeline for RNA-Seq data analysis
Please read OmicScript pipeline for RNA-Seq data analysis, the pipeline includes the alignment, fusion detection, mutation detection and many other steps.
OmicScript pipeline for DNA-Seq data analysis
Please read OmicScript pipeline for DNA-Seq data analysis
Deploy Oshell in Cluster
Use built-in scheduler
When Oshell is run in cluster mode on a grid engine, each job occupies one spot (one or more slots, depending on the thread number setting and the cluster queue setting). The built-in scheduling system supports both SGE and PBS, which can accelerate the analysis of large amounts of RNA-Seq data.
Oshell uses the SetEnvironment function to set up the cluster for Oshell jobs. Here is an example OmicScript that schedules jobs to the cluster, monitors the progress of each job, handles the run logs from multiple jobs, and summarizes the job outputs into one Oshell project.
Example OmicScript running on SGE
#Enable cluster
Begin SetEnvironment;
Cluster /EnableCluster=True /ClusterAlignmentPath="/Oshell/ClusterAlignment.sh" /ClusterSummaryPath="/Oshell/ClusterSummary.sh" /ClusterParallelEnvironment=peomics /ClusterParallelRatioFactor=1 /ClusterQueueName=all.q /ClusterGridEngine=SGE /DefaultClusterJobNumber=12
End;

#Create the Oshell project environment
Begin NewProject;
File "/test/AlignmentTest/OshellClusterTest.osprj";
Options /Distributed=true;
End;

#Alignment
Begin MapRnaSeqReadsToGenome /Namespace=NgsLib;
Files "
/TestDataSets/HumanRNASeqPaired/SRR327893.subset.1.fastq.gz
/TestDataSets/HumanRNASeqPaired/SRR327893.subset.2.fastq.gz
/TestDataSets/HumanRNASeqPaired/SRR065521.subset.1.fastq.gz
/TestDataSets/HumanRNASeqPaired/SRR065521.subset.2.fastq.gz
/TestDataSets/HumanRNASeqPaired/simulationread200PE_1.fastq.gz
/TestDataSets/HumanRNASeqPaired/simulationread200PE_2.fastq.gz
/TestDataSets/HumanRNASeqPaired/simulationread400PE_1.fastq.gz
/TestDataSets/HumanRNASeqPaired/simulationread400PE_2.fastq.gz
";
Reference Human.B37.3;
GeneModel RefGene;
Trimming /Mode=TrimByQuality /ReadTrimQuality=2;
Options /ParallelJobNumber=4 /PairedEnd=True /FileFormat=AUTO /AutoPenalty=True /FixedPenalty=2 /Greedy=false /IndelPenalty=2 /DetectIndels=False /MaxMiddleInsertionSize=10 /MaxMiddleDeletionSize=10 /MaxEndInsertionSize=10 /MaxEndDeletionSize=10 /MinDistalEndSize=3 /ExcludeNonUniqueMapping=False /ReportCutoff=10 /WriteReadsInSeparateFiles=True /OutputFolder="/test/AlignmentTest/OshellClusterTest/BAMFiles" /GenerateSamFiles=False /ThreadNumberPerJob=4 /InsertSizeStandardDeviation=40 /ExpectedInsertSize=300 /InsertOnSameStrand=False /InsertOnDifferentStrand=True /QualityEncoding=Automatic /CompressionMethod=Gzip /Gzip=True /SearchNovelExonJunction=True /ExcludeUnmappedInBam=False;
Output primary_alignment;
End;

# save OmicSoft project environment
Begin SaveProject;
Project OshellClusterTest;
File "/test/AlignmentTest/OshellClusterTest.osprj";
End;

# close Oshell project environment
Begin CloseProject;
Project OshellClusterTest;
End;
Also read: SetEnvironment, ClusterAlignmentPath and ClusterSummaryPath.
Wrap Oshell to cluster jobs
Users can also wrap Oshell jobs in a qsub script, such as the one below for SGE. This gives users greater control over job submission, since the default job scheduler configured through SetEnvironment has limited options. With this method, users do not need a SetEnvironment block in the OmicScript.
#!/bin/bash
#
# SGE submission options
#$ -q all.q # Select the queue
#$ -o /home/ge/job.o
#$ -e /home/ge/job.e
#$ -N test # A name for the job
#$ -pe smp 1 # Select the parallel environment

# Run Oshell projects
MONO=/[path where mono was installed]/bin/mono
OSHELL=/App/omicsoft/Oshell/oshell.exe
BASEDIR=/App/omicsoft
TMP=/scratch
OSCRIPT=/App/Oscirpt/runpipeline.oscript
LOG=/App/Oscirpt/runpipeline.log

"$MONO" "$OSHELL" --runscript "$BASEDIR" "$OSCRIPT" "$TMP" "$MONO" > "$LOG"
Oshell subcommand
In previous versions, Oshell provided an individual subcommand to run each function, such as
- oshell.exe --buildref to build reference
- oshell.exe --buildgm to build gene model
- oshell.exe --alignrna to do RNA-Seq alignment
- oshell.exe --semap to do fusion alignment
- For more, please read Oshell subcommand
We have completely migrated Oshell to work in the project environment described in this article. Development of these subcommands has been discontinued, and they were supported only through the end of 2013.
Oshell Land R API
The Land R API functions are provided for users who want to query Land data using R. For more information, please see Land_R_API_with_Omicsoft_v12.
License
Commercial users: please contact bioinformaticssales@qiagen.com to get a license.
Publication
RNA-Seq Analysis Pipeline Based on Oshell Environment
Citation
@article{6808521,
  author={Li, J. and Hu, J. and Newman, M. and Liu, K. and Ge, H.},
  journal={Computational Biology and Bioinformatics, IEEE/ACM Transactions on},
  title={RNA-Seq Analysis Pipeline Based on Oshell Environment},
  year={2014},
  month={},
  volume={PP},
  number={99},
  pages={1-1},
  doi={10.1109/TCBB.2014.2321156},
  ISSN={1545-5963},
}