A list of compiled genomes and gene models from OmicSoft

From Array Suite Wiki



By default, Oshell/FusionMap/OSA will automatically download a compiled genome and gene model from our server if they are available. These files are carefully assessed by the OmicSoft development team. Please check the following table before you build your own.

Omicsoft Genome and Gene Model

In ArrayStudio, genome reference libraries have names such as Human.B37, Human.B37.3, and Human.hg19. Both Human.B37 and Human.hg19 are based on NCBI build 37.1, and Human.hg19 is a pure superset of Human.B37. Human.B37.3 is based on NCBI build 37.3.

The Human.B37 and Human.B37.3 references and the Ensembl GTF gene models are downloaded from Ensembl.

Human.hg19 is downloaded from the UCSC FTP site, and the RefGene/UCSC gene models are downloaded from the UCSC tables.

The only difference between the Ensembl and UCSC reference libraries is a minor inconsistency on chromosome MT.

Since the end of 2012, we have downloaded the gene models every three months. If there are changes, we create a new gene model with the download date appended to its name, such as RefGene20121217.
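The date-suffix naming convention can be handled programmatically. The following is a minimal sketch, assuming that a dated gene model name always ends in an eight-digit YYYYMMDD stamp (as in RefGene20121217) and that every other name (RefGene, Ensembl.R73) is undated. The helper name `split_gene_model_name` is hypothetical and is not part of any OmicSoft tool.

```python
import re
from datetime import datetime

# Hypothetical helper, not an OmicSoft API.
# Assumption: dated gene model names end with an 8-digit YYYYMMDD stamp.
_DATED_NAME = re.compile(r"^(?P<base>.*?)(?P<stamp>\d{8})$")


def split_gene_model_name(name):
    """Return (base_name, date) for dated names, or (name, None) otherwise."""
    match = _DATED_NAME.match(name)
    if match is None:
        # Undated name, e.g. "RefGene" or "Ensembl.R73".
        return name, None
    try:
        stamp = datetime.strptime(match.group("stamp"), "%Y%m%d").date()
    except ValueError:
        # Eight trailing digits that do not form a valid calendar date.
        return name, None
    return match.group("base"), stamp


if __name__ == "__main__":
    for model in ("RefGene20121217", "RefGene", "Ensembl.R73"):
        print(model, split_gene_model_name(model))
```

Sorting the dated variants of one base name by the returned date gives the most recent download for that gene model.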

Genome Reference        Gene Model                  Note
Human.B37.3             OmicsoftGene20130723
                        Ensembl.R73
                        Ensembl.R72
                        Ensembl.R70
                        Ensembl.R68
                        Ensembl.R67
                        Ensembl.R66
                        Ensembl.R65
                        Ensembl.R63
                        Ensembl.R62
                        Ensembl.R61
                        Ensembl.R60
                        Ensembl.R59
                        Ensembl.R58
                        RefGene
                        RefGene20121217
                        UcscGene20120907
                        UcscGene20130723
                        UCSCCanonicalGene20131119
Human.B37               Ensembl.R70
                        Ensembl.R68
                        Ensembl.R67
                        Ensembl.R66
                        Ensembl.R65
                        Ensembl.R63
                        Ensembl.R62
                        Ensembl.R61
                        Ensembl.R60
                        Ensembl.R59
                        Ensembl.R58
                        RefGene
                        RefGene20121217
                        UcscGene20120907
                        UcscGene20130723
Human.B36               RefGene
                        RefGene20121217
                        UcscGene
                        UcscGene20120628
                        UcscGene20120907
                        Ensembl.R54
Human.hg19              Ensembl.R73
                        Ensembl.R72
                        Ensembl.R70
                        Ensembl.R68
                        Ensembl.R67
                        Ensembl.R66
                        RefGene
                        RefGene20121217
                        UcscGene20120907
                        UcscGene20130723
                        UCSCCanonicalGene20131119
Mouse.B37               Ensembl.R67
                        Ensembl.R66
                        Ensembl.R65
                        Ensembl.R63
                        Ensembl.R62
                        Ensembl.R61
                        Ensembl.R60
                        Ensembl.R59
                        Ensembl.R58
                        RefGene
                        RefGene20121217
                        UcscGene20120907
                        UcscGene20130723
Mouse.B38               Ensembl.R73
                        Ensembl.R72
                        Ensembl.R70
                        Ensembl.R68
                        RefGene
                        RefGene20121217
                        UcscGene20130503
                        UcscGene20130723
Mouse.mm10              Ensembl.R73
                        Ensembl.R72
                        Ensembl.R70
                        Ensembl.R68
                        RefGene
                        RefGene20121217
                        UcscGene20130503
                        UcscGene20130723
Rat.B3.4                Ensembl.R68
                        Ensembl.R67
                        Ensembl.R66
                        Ensembl.R65
                        Ensembl.R63
                        Ensembl.R62
                        Ensembl.R61
                        Ensembl.R60
                        RefGene20121217
Rat.B5.0                Ensembl.R73
                        Ensembl.R72
                        Ensembl.R70
                        RefGene20121217
Rhesus.B1.2             Ensembl.R70
                        Ensembl.R68
                        Ensembl.R67
                        Ensembl.R66
                        Ensembl.R65
                        Ensembl.R63
                        Ensembl.R62
                        RefGene20121217
Rhesus.Mac2             Ensembl.R73
                        RefGene20121217
Rhesus.Mac3             RefGene20130709
Cyno.B1.0               BgiGene20131018
                        BgiGene20130718
CriGri.B1.0             BgiGene20131018
                        BgiGene20130718
Dog.CanFam3             Ensembl.R73
                        Ensembl.R72
                        RefGene20130718
Pig.B10.2               Ensembl.R73
                        Ensembl.R70
Rabbit.B2               Ensembl.R73
                        Ensembl.R71
                        RefGene20121217
Human.B37.3+Mouse.B37   Ensembl.R67                 For mixed-genome alignment, such as xenograft samples
                        Ensembl.R62

Reference and gene model files on your machine

If you specify a genome and gene model name in your OmicScript, Oshell will download them into your OmicSoft Base_Dir folder. You will find the ReferenceLibrary folder under the OmicSoft Base_Dir. Take Human.B37.3 and UcscGene20120907 as an example:

Base_Dir
-Temp
----sdf231g3654a23sd1f6 (randomly named folder to store temp files)
-ReferenceLibrary
---- Human.B37.3.dreflib1.mmf2
---- Human.B37.3.dreflib1
---- Human.B37.3.gindex1
---- Human.B37.3_GeneModels
--------- Human.B37.3_UcscGene20120907.dreflib2
--------- Human.B37.3_UcscGene20120907.dreflib2.mmf2
--------- Human.B37.3_UcscGene20120907.tindex1
--------- UcscGene20120907.gmodel2
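
To check whether a genome and gene model are already in the local cache, the layout above can be expressed as a list of expected paths. This is a minimal sketch that generalizes from the single Human.B37.3 / UcscGene20120907 listing. The function names are hypothetical, and the assumption that other genomes and gene models use the same extensions (`dreflib1`, `gindex1`, `dreflib2`, `tindex1`, `gmodel2`) is not confirmed by this page.

```python
from pathlib import Path


def expected_reference_files(base_dir, genome, gene_model):
    """Return the cache paths for one genome/gene model pair.

    Assumption: every pair follows the same layout as the
    Human.B37.3 / UcscGene20120907 example listing.
    """
    ref_dir = Path(base_dir) / "ReferenceLibrary"
    model_dir = ref_dir / f"{genome}_GeneModels"
    return [
        ref_dir / f"{genome}.dreflib1.mmf2",
        ref_dir / f"{genome}.dreflib1",
        ref_dir / f"{genome}.gindex1",
        model_dir / f"{genome}_{gene_model}.dreflib2",
        model_dir / f"{genome}_{gene_model}.dreflib2.mmf2",
        model_dir / f"{genome}_{gene_model}.tindex1",
        # The gene model file itself carries no genome prefix.
        model_dir / f"{gene_model}.gmodel2",
    ]


def missing_reference_files(base_dir, genome, gene_model):
    """Return the expected paths that do not exist on disk yet."""
    return [
        path
        for path in expected_reference_files(base_dir, genome, gene_model)
        if not path.exists()
    ]
```

For example, `missing_reference_files(r"C:\omicsoft", "Human.B37.3", "UcscGene20120907")` returns an empty list once everything in the listing is present.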

It takes only a few minutes to download the files and build an index in your local cache.

Here is a sample log of a reference download:

[12:43 PM] Downloading from http://www.omicsoft.com/downloads/dreflib/Human.B37.3.dreflib1.gzip...
[12:53 PM] Unzipping file...
[12:54 PM] Building index for the reference library. This may take up to 30 minutes but only occur once for each reference library...
[12:54 PM] Creating offset array...
[12:54 PM] Loading all sequences into memory...
[12:54 PM] Calculating counts for each 14-mer...
[12:57 PM] Calculating cumulative positions...
[12:57 PM] Calculating virtual positions...
[1:03 PM] Saving index to file C:\omicsoft\ReferenceLibrary\Human.B37.3.gindex1...
[1:03 PM] Loading gene model...
[1:03 PM] Downloading from http://www.omicsoft.com/downloads/dreflib/Human.B37.3_UcscGene20120907.gmodel2.gzip...
[1:03 PM] Unzipping file...
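
The two download URLs in the log suggest a naming pattern. The sketch below reconstructs it. The pattern is only inferred from those two URLs: the function names are hypothetical, and it is an assumption that other genomes and gene models follow the same scheme or that the files are available on the server.

```python
# Download root copied from the log above.
DOWNLOAD_ROOT = "http://www.omicsoft.com/downloads/dreflib"


def reference_url(genome):
    """URL of a compiled reference library (pattern inferred from the log)."""
    return f"{DOWNLOAD_ROOT}/{genome}.dreflib1.gzip"


def gene_model_url(genome, gene_model):
    """URL of a compiled gene model (pattern inferred from the log)."""
    return f"{DOWNLOAD_ROOT}/{genome}_{gene_model}.gmodel2.gzip"


if __name__ == "__main__":
    print(reference_url("Human.B37.3"))
    print(gene_model_url("Human.B37.3", "UcscGene20120907"))
```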

Note: if you manually download a gene model file, such as http://www.omicsoft.com/downloads/dreflib/Human.B37.3_UcscGene20120907.gmodel2.gzip, you have to gunzip the Human.B37.3_UcscGene20120907.gmodel2.gzip file (change the extension from .gzip to .gz if necessary) and then rename Human.B37.3_UcscGene20120907.gmodel2 to UcscGene20120907.gmodel2.
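
The manual steps in the note can be scripted. The following is a minimal sketch using only the Python standard library. `install_gene_model` is a hypothetical helper, and placing the result in `ReferenceLibrary/<genome>_GeneModels` is an assumption based on the directory listing earlier on this page. Python's `gzip.open` does not depend on the file extension, so the `.gzip` file does not need to be renamed to `.gz` first.

```python
import gzip
import shutil
from pathlib import Path


def install_gene_model(downloaded, genome, gene_model, base_dir):
    """Decompress <genome>_<gene_model>.gmodel2.gzip into the local cache.

    The decompressed file is stored as <gene_model>.gmodel2, that is,
    without the genome prefix, as the note requires.
    Assumption: the target folder follows the listing shown earlier.
    """
    target_dir = Path(base_dir) / "ReferenceLibrary" / f"{genome}_GeneModels"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{gene_model}.gmodel2"
    # Stream the gzip content to the renamed target file.
    with gzip.open(downloaded, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

A call such as `install_gene_model("Human.B37.3_UcscGene20120907.gmodel2.gzip", "Human.B37.3", "UcscGene20120907", r"C:\omicsoft")` would write `UcscGene20120907.gmodel2` into the `Human.B37.3_GeneModels` folder.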