Run DESeq2 RScript

From Array Suite Wiki

Jump to: navigation, search


Contents

Introduction

ArrayStudio users are often interested in latest version of DESeq2 package, like v1.20.*. Currently, Omicsoft implemented DESeq2 v1.10.1 to ArrayStudio, to use the latest version of DESeq2 package, users could put the demo DESeq2R_SingleColumn.rscript to **Documents\Omicsoft\RScripts\MicroArray\ folder, and run RScript from the GUI. By using this RScript, users can get identical results as from RStudio.

This page presents an ambitious Escript that will automatically read count and design from -OmicData and -NGSData from ArrayStudio GUI, and perform DESeq2 analysis.

Input Dataset

For this demonstration, we will use a test dataset GSE6976, which was randomly downloaded from GEO.

Write the RScript, and put it to the target directory

For local project, put the *.RScript file to C:\Users\WINUSER\Documents\Omicsoft\RScripts\MicroArray\

Save rscript.png

After that, the Script would be available under -OmicData -> R -> Ru RScript (MicroArray Data)

Run RScript.png

In general, this R Script for Array Studio project needs to be saved in:

  • For local analysis, the .rscript files need to be put in OmicsoftHomeDirectory\RScripts\MicroArray by default.
  • For server analysis, the .rscript files need to be put in BaseDirectory\Pipeline\RScripts\MicroArray by default.

Run the RScript

Workflow of the R Script

The rscript can be found at the bottom of the page. This section will describe how to run the script.

Parameters setup

Within this PScript, there are three parameters to setup, GroupColumn, CompareTo, and CCO (CooksCutOff).

  1. GroupColumn, users could specify the column in the design table to perform the DESeq2 test.
  2. CompareTo, by default DESeq2 will pair each sub group with the first one in the selected column. As in this case, there are 4 levels of factors in the Treatment column, DESeq2 will pair each with the first one (50 mM KCL treatment for 1.5 hours), so users could set CompareTo as the target factor. in the screenshot, "untreated" was set as the target factor.
  3. CCO, cooksCutOff value, there might be some outliers (either for the whole sample or just for several genes of a particular sample), set the CCO=FALSE, DESeq2 will perform the test on outliers, raw p-values and adjust p-values will be available in the output table.

DESeq2 parameter setup.png

When the job is done, a table with sample baseMean, log2foldChange, raw p-value, and adjust p-value, will show up in the local project.

Deseq2 local job done.png

The RScript

<Info>
Description    Perform DESeq2 GLM analysis, with or without matched-pair or batch information
Author         xxx xx
Created        05/09/2018
Requirement    Bioconductor version 3.6, or above; DESeq2 version 1.18 or later
 
<Input>
GroupColumn =
CompareTo = 
## cookscutoff TRUE or False
CCO = FALSE
 
PairColumn = none
 
<Output>
DESeq2.R.output
 
 
 
<Script>
dat <- input.data
des <- input.design
 
GroupColumn = input.parameters[["GroupColumn"]]
CCO = input.parameters[["CCO"]]
CompareTo = input.parameters[["CompareTo"]]
PairColumn = input.parameters[["PairColumn"]]
 
print(GroupColumn)
print(CompareTo)
print(CCO)
 
## clean the rownames
rownames(des) = gsub("[^a-zA-Z0-9_.]","_",rownames(des))
rownames(des) = make.names(rownames(des),unique=TRUE)
 
## only keep rows passed from the data table to the rscript
## for example when applying a filter to the count table, there will be fewer rows;
des = des[colnames(dat),]
 
## check if DESeq2 has been installed
if(suppressWarnings(!(require(DESeq2)))){
 cat("DESeq2 pkg is not found; installing bioconductor package DESeq2...")
 source("http://bioconductor.org/biocLite.R")
 biocLite("DESeq2")
 require(DESeq2)
}
 
 
## parse cco from "FALSE" string into boolean
CCO <- eval(parse(text=CCO))
 
## check if column names and group column could be found
if(!(GroupColumn%in%colnames(des))) stop("The grouping column (case-sensitive) NOT found! ","Column names available: ", colnames(des))
## if(!(GroupColumnB%in%colnames(des))) stop("The grouping column (case-sensitive) NOT found! ","Column names available: ", colnames(des))
if(!(CompareTo%in%des[,GroupColumn])) stop("The CompareTo value (case-sensitive) NOT found! ","Possible values are: ",  unique(des[,GroupColumn]))
 
 
designColA <- des[ ,GroupColumn]
#designColB <- des[ ,GroupColumnB]
#designColA <- factor( designColA )
#designColB <- factor( designColB )
design <- data.frame(designColA)
 
#des[ , GroupColumn] <- factor( des[ ,GroupColumn] ) 
 
## levels() works on factor( col )
 
levels <- levels( factor( designColA ) )
 
design$designColA <- factor(design$designColA) 
 
dds <- DESeqDataSetFromMatrix(round(dat), design, ~designColA)
dds <- DESeq(dds, minReplicatesForReplace=7, modelMatrixType="standard")
 
resultsNames(dds)
 
 
## initial res as the result from DESeq2 dds
res <- NULL
 
for( level in levels){
 
	if( level != CompareTo) {
 
		contrast <- c("designColA", level, CompareTo)
 
		    if( is.null(res)){
 
				res <- results(dds, contrast, cooksCutoff = CCO)
 
				## rename the colnames of current dataframe
				colnames(res) <- paste(GroupColumn, level,"vs", CompareTo, colnames(res), sep = "_")
 
			} else {
 
				res2 <- results(dds, contrast, cooksCutoff = CCO)
				colnames(res2) <- paste(GroupColumn, level, "vs", CompareTo, colnames(res2), sep = "_")
				res <- cbind(res, res2)
			}
 
	print( head( res ) )
	}
}
 
 
#res <- results(dds, cooksCutoff=CCO)
 
DESeq2.R.output <- res
[back to top]