Getting Started

From Array Suite Wiki


Revision as of 15:24, 20 January 2016


Install Array Studio on Windows

Windows prerequisites: Microsoft .NET 3.5

From Internet Explorer (IE) browser, open the following link:

For first time users, see How to activate your Array Studio.


Install Array Server On Linux

Required Linux packages

Make sure you have some basic packages installed:

    yum install gcc gcc-c++ bison pkgconfig libtool libstdc++-devel \
       glib2-devel gettext make freetype-devel fontconfig-devel \
       libXft-devel libpng-devel libjpeg-devel libtiff-devel giflib-devel \
       ghostscript-devel libexif-devel libX11-devel
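
After installing the packages, a quick check that the build tools are actually on the PATH can save a failed compile later. The following is a minimal sketch, not part of the official setup; the function name missing_commands is hypothetical, and it only covers packages that provide a command (the -devel packages ship headers and libraries, which this check does not see).

```shell
# Print the names of the given commands that are not found on PATH,
# separated by spaces (prints an empty line when none are missing).
missing_commands() {
    missing=""
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            missing="$missing $cmd"
        fi
    done
    # Drop the leading space before printing
    printf '%s\n' "${missing# }"
}
```

For example, `missing_commands gcc g++ bison pkg-config libtool make` prints the tools that still need to be installed.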


Sqlite

Sqlite can be downloaded from: sqlite download. The tar.gz file is recommended for installation.

wget -c http://www.sqlite.org/sqlite-autoconf-3071401.tar.gz
tar zxvf sqlite-autoconf-3071401.tar.gz
sudo mv sqlite-autoconf-3071401 sqlite
cd sqlite
./configure --prefix=/opt/sqlite
make
make install

/opt/sqlite/bin/sqlite3 --version


Libgdiplus

The mono libgdiplus package must be installed (either using yum, apt-get, or installing from source at libgdiplus). In addition, the mono config file must be edited to explicitly specify the location of libgdiplus.so.

cd /opt
wget http://download.mono-project.com/sources/libgdiplus/libgdiplus-2.10.tar.bz2
tar jxvf libgdiplus-2.10.tar.bz2
cd /opt/libgdiplus-2.10
./configure --prefix=/opt/libgdiplus-2.10
make
make install

If installing from source, you may need to add "/libgdiplusPrefix/lib" to the shared library search paths. To check whether the libgdiplus library is already on the search paths, type

 $ ldconfig -p | grep libgdiplus
     libgdiplus.so (libc6,x86-64) => /opt/libgdiplus-2.10/lib/libgdiplus.so

Here libgdiplus is installed under "/opt/libgdiplus-2.10" and the search path has been set correctly. If the command prints nothing, one way to add the library to the shared library path is the following (with root privilege):

 echo "/opt/libgdiplus-2.10/lib" > /etc/ld.so.conf.d/libgdiplus.conf
 ldconfig
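
The echo command above overwrites the conf file each time it is run. As a hedged alternative, the sketch below appends the directory only when it is not already listed, so it can be re-run safely. The function name add_lib_dir is hypothetical, and the example path assumes the /opt/libgdiplus-2.10 prefix from the configure step.

```shell
# Append a library directory to an ld.so.conf.d-style file,
# unless that exact line is already present.
add_lib_dir() {
    lib_dir=$1
    conf_file=$2
    # -F: fixed string, -x: match the whole line, -q: no output
    if [ -f "$conf_file" ] && grep -Fxq "$lib_dir" "$conf_file"; then
        return 0
    fi
    printf '%s\n' "$lib_dir" >> "$conf_file"
}
```

As root, this could be used as `add_lib_dir /opt/libgdiplus-2.10/lib /etc/ld.so.conf.d/libgdiplus.conf && ldconfig`.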

If installing from yum, it may be necessary to additionally install the following:

yum install libgdiplus
yum install libungif libungif-devel


Edit mono config to point to libgdiplus

To connect mono and libgdiplus, edit the config file /MonoPrefix/etc/mono/config and add the following line at the end of the file, just before </configuration>:

<dllmap dll="gdiplus.dll" target="/opt/libgdiplus-2.10/lib/libgdiplus.so"/>
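
If the edit needs to be scripted (for example across several servers), the line can be inserted with a small helper. This is only a sketch under assumptions: the function name add_dllmap is hypothetical, it assumes the closing </configuration> tag sits on its own line, and it does nothing when an entry for gdiplus.dll already exists, so an existing mapping to a different path still has to be reviewed by hand. Back up the config file first.

```shell
# Insert a gdiplus.dll dllmap entry just before </configuration>.
# Leaves the file untouched if a gdiplus.dll entry is already present.
add_dllmap() {
    config_file=$1
    lib_path=$2
    entry="<dllmap dll=\"gdiplus.dll\" target=\"$lib_path\"/>"
    if grep -q 'dll="gdiplus.dll"' "$config_file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Print the new entry right before the closing tag, then every original line
    if awk -v entry="$entry" '
        /<\/configuration>/ { print "  " entry }
        { print }
    ' "$config_file" > "$tmp"; then
        # cat (rather than mv) keeps the original file's owner and permissions
        cat "$tmp" > "$config_file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

With the paths used in this guide, the call would look like `add_dllmap /opt/mono-2.10.9/etc/mono/config /opt/libgdiplus-2.10/lib/libgdiplus.so`.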

Further details can be found here, including a description of the error message if this is improperly performed.


Setting ulimit

Make sure the ulimit values for "max user processes" and "open files" are both set to the maximum value, 65536. You can check the current values by typing ulimit -a:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515184
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65536
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Modify ulimit in two config files following ulimit setup wiki.


Also read Setting up ulimit for ArrayServer.
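
To check only the two limits that matter here, rather than reading the full ulimit -a listing, something like the following sketch could be used. The function names limit_ok and report_limit are hypothetical; the 65536 threshold is the value stated above.

```shell
REQUIRED_LIMIT=65536

# Succeed when a ulimit value ("unlimited" or a number) meets the requirement.
limit_ok() {
    value=$1
    required=${2:-$REQUIRED_LIMIT}
    [ "$value" = "unlimited" ] || [ "$value" -ge "$required" ]
}

# Print a PASS/FAIL line for one named limit.
report_limit() {
    name=$1
    value=$2
    if limit_ok "$value"; then
        echo "PASS $name = $value"
    else
        echo "FAIL $name = $value (need $REQUIRED_LIMIT)"
    fi
}
```

In bash, run it as the user that starts the server: `report_limit "open files" "$(ulimit -n)"` and `report_limit "max user processes" "$(ulimit -u)"`.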


Mono

Tip: For Linux kernels prior to 3.10.0-327, we recommend mono 2.10.9. For newer kernels, mono 4.0.4 is required.
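
The kernel rule in the tip above can be checked from the command line. The sketch below is an illustration only: the function names are hypothetical, it assumes the kernel release starts with four numeric fields in the form major.minor.patch-build (as in 3.10.0-327.el7.x86_64), and it treats 3.10.0-327 itself as a "newer" kernel, which the tip does not state explicitly.

```shell
# Turn a kernel release such as 3.10.0-327.el7.x86_64 into one comparable integer.
kernel_key() {
    # Replace every non-digit with a space, then use the first four numbers
    set -- $(printf '%s' "$1" | tr -c '0-9' ' ')
    echo $(( ((${1:-0} * 1000 + ${2:-0}) * 1000 + ${3:-0}) * 100000 + ${4:-0} ))
}

# Print the mono version suggested for the given kernel release.
recommended_mono() {
    if [ "$(kernel_key "$1")" -lt "$(kernel_key 3.10.0-327)" ]; then
        echo "2.10.9"
    else
        echo "4.0.4"
    fi
}
```

On the server itself, `recommended_mono "$(uname -r)"` prints the suggested version.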
Install mono2
  • Download Mono 2.10.9
    cd /opt
    wget -c  http://origin-download.mono-project.com/sources/mono/mono-2.10.9.tar.bz2

The bz2 file can be saved to a temporary location, e.g. ~/temp/

  • Extract and modify certificate if necessary
    tar  jxvf   mono-2.10.9.tar.bz2
    cd  /opt/mono-2.10.9

For mono 2.10.9, it is recommended to modify X509Certificate to the latest standard.

  • Compile and install. On the command line, type
    cd  /opt/mono-2.10.9
    ./configure  --prefix=/opt/mono-2.10.9 --with-large-heap=yes
    make
    make  install

Note:

  • The mono install location is set by the "--prefix" option in the configure step and can be changed to another location.
  • The option --with-large-heap=yes enables support for GC heaps larger than 3 GB, which is required for NGS alignment as well as some Array Server functions.

Double check mono installation and version

ls -al /opt/mono-2.10.9/bin/mono*
/opt/mono-2.10.9/bin/mono --version
/opt/mono-2.10.9/bin/mono-sgen --version



Install Array Server

Download Array Server

Assume we would like to install the Array Server under the directory /opt

On command line, type
     mkdir /opt/array-server
     cd    /opt/array-server
     wget -c  http://omicsoft.com/software_update/OmicsoftUpdater.exe
Next, create an empty file named ArrayServerLinuxBeta.exe by typing
     touch ArrayServerLinuxBeta.exe
Then, type
     cd /opt/array-server
     /opt/mono-2.10.9/bin/mono  ./OmicsoftUpdater.exe


Configuration Files

Download configuration template

cd /opt/array-server
wget http://omicsoft.com/software/ArrayServer/ArrayServerConfigTemplate.zip
unzip ArrayServerConfigTemplate.zip


The administrator must modify three important configuration files based on the downloaded templates:

  • ArrayServer.cfg: this is the key configuration file. A few important options are:
    • Port, Port2 and Port3 define the port numbers for data communication between the ArrayStudio client and ArrayServer.
    • BaseDirectory: this is the working directory of the array server, storing all the raw and processed data. Depending on the projects, it can take a huge amount of disk space.
    • OmicsoftDirectory: This directory can sit locally or on a network drive. All the reference genomes, gene models, Affymetrix CDF files, log files, etc. are stored in this folder.
    • TempDirectory: This should be a local directory (i.e. NOT a network drive) for fast read and write access. Some NGS tasks can take twice the size of an unzipped fastq file (we suggest using a drive with at least 100GB of storage).
    • The Folder section defines additional local or network folders that are monitored and made available to array server users.
    • For a full list of options, see ArrayServer.cfg.
    • For master-analytic server setup, see Master Server and Analytic Server.
  • default.template: defines project-level metadata, such as requiring a project title and listing the organism. The administrator can customize the project metadata based on this template file and even enforce a controlled vocabulary.
  • sample.template: defines sample-level metadata, such as requiring users to fill in organism and tissue for each sample in sample registration. The administrator can customize the sample metadata based on this template file and even enforce a controlled vocabulary.
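
Before pointing TempDirectory at a path, it may help to confirm that the drive has the suggested 100GB free. The following is a rough sketch with hypothetical function names; it only measures free space and does not detect whether the path is on a network drive.

```shell
# Print the free space, in whole GB, on the filesystem holding the given path.
free_gb() {
    # df -P prints one POSIX-format data line; -k reports 1024-byte blocks
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Succeed when the directory exists and has at least min_gb free (default 100).
has_free_space() {
    dir=$1
    min_gb=${2:-100}
    [ -d "$dir" ] || return 1
    [ "$(free_gb "$dir")" -ge "$min_gb" ]
}
```

For example, `has_free_space /data/temp 100 || echo "TempDirectory drive is too small"`, where /data/temp stands in for the chosen temp path.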


Start Running Array Server

The server can be started with the run-omicsoft.sh script (recommended), or directly as shown here.
For single server setup, run
     export PATH=/opt/mono-2.10.9/bin:$PATH
     cd /opt/array-server
     mono-sgen  ArrayServerLinuxBeta.exe > log &
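
When starting the server directly, it can be convenient to keep it running after logout and to record its process ID for later checks. The sketch below is one possible way to do that and is not the content of run-omicsoft.sh; the function names and the server.pid file name are hypothetical.

```shell
# Start a command in the background from workdir, capturing output and PID.
# Relative logfile and pidfile paths are resolved inside workdir.
start_in_background() {
    workdir=$1
    logfile=$2
    pidfile=$3
    shift 3
    (
        cd "$workdir" || exit 1
        # nohup keeps the process alive after the login shell exits
        nohup "$@" > "$logfile" 2>&1 &
        echo $! > "$pidfile"
    )
}

# Succeed when the process recorded in pidfile is still running.
is_running() {
    [ -f "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}
```

With mono on the PATH as above, the single server command becomes `start_in_background /opt/array-server log server.pid mono-sgen ArrayServerLinuxBeta.exe`, and `is_running /opt/array-server/server.pid` reports whether it is still up.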

Also read Typical way to update/restart ArrayServer

For master-analytic server setup, start the master server first and then the analytic server, running the same command as above on each (see also Master and Analytic Servers).
Connect to the server (through Array Studio or Array Viewer) by typing in the server (or master server) address (e.g., tcp://192.168.1.103:8065). If you have problems connecting to the server, check the firewall settings on your computer.
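
If the connection fails, a quick reachability test from the client machine helps separate firewall problems from server problems. This is a sketch with hypothetical function names; can_connect assumes a netcat (nc) that supports the -z and -w options, which varies between netcat variants.

```shell
# Print the host part of an address such as tcp://192.168.1.103:8065.
server_host() {
    addr=${1#tcp://}
    printf '%s\n' "${addr%:*}"
}

# Print the port part of an address such as tcp://192.168.1.103:8065.
server_port() {
    printf '%s\n' "${1##*:}"
}

# Succeed when a TCP connection to the server address can be opened.
# Assumes nc supports -z (connect without sending data) and -w (timeout in seconds).
can_connect() {
    nc -z -w 5 "$(server_host "$1")" "$(server_port "$1")"
}
```

For example, `can_connect tcp://192.168.1.103:8065 || echo "port not reachable, check the firewall"`.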


NOTE: For additional information on server administration, such as stopping or updating array server, see Running Array Server in Linux

Install Array Server On Windows

Running Array Server in Windows


Test Your Installation

Add this small test BAM file, Illumina.Paired.bam, to the genome browser Human37.3 (for instructions on creating a new genome browser, see New Browser).


Install OShell

Install OShell on Linux

Assume mono (version 2.10.9 preferred) has been installed in /opt/mono-2.10.9/ (see mono installation help). Here we install oshell in the home directory.
      mkdir ~/oshell
      cd ~/oshell
      wget -c  http://omicsoft.com/software_update/OmicsoftUpdater.exe
      touch oshell.exe
      /opt/mono-2.10.9/bin/mono OmicsoftUpdater.exe
One way to run oshell is as follows (usually you would wrap these commands in a bash script).
      export PATH=/opt/mono-2.10.9/bin:$PATH
      export LD_LIBRARY_PATH=/opt/mono-2.10.9/lib:$LD_LIBRARY_PATH
      mono ~/oshell/oshell.exe --runscript   OmicSoftBaseDir  MyOShellScript  MyTempDir
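
A wrapper along these lines is one way to package the three commands. It is a sketch, not an official script: the function name run_oshell and the MONO_HOME and OSHELL_EXE variables are hypothetical, with defaults taken from the paths used in this section.

```shell
MONO_HOME=${MONO_HOME:-/opt/mono-2.10.9}
OSHELL_EXE=${OSHELL_EXE:-$HOME/oshell/oshell.exe}

# Run an OShell script: run_oshell OmicSoftBaseDir MyOShellScript MyTempDir
run_oshell() {
    base_dir=$1
    script=$2
    temp_dir=$3
    if [ ! -f "$script" ]; then
        echo "OShell script not found: $script" >&2
        return 2
    fi
    mkdir -p "$temp_dir" || return 1
    (
        # Set the environment in a subshell so the caller's shell is unchanged
        export PATH="$MONO_HOME/bin:$PATH"
        export LD_LIBRARY_PATH="$MONO_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
        mono "$OSHELL_EXE" --runscript "$base_dir" "$script" "$temp_dir"
    )
}
```

Checking for the script file first gives a clear error message before mono is started.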

Running OShell

OShell Script Example

example.rna.pe.oscript: Oshell script for Paired end RNASeq alignment.

Note that one has to modify this file by setting the correct input and output file locations.

The script can be run as follows:

     mono ~/oshell/oshell.exe  --runscript  ~/OmicsoftHome  example.rna.pe.oscript   ~/temp

OShell Documentation

For details about OShell, see OShell User Guide.

Tips: OShell scripts can be obtained by clicking "Show Script" button in Array Studio (right after the parameter specification).