Comparison.GeneSetAnalysis

From Array Suite Wiki

Overview

Gene Set Analysis helps users who have their own gene set identify comparisons with similar gene set enrichment among the tens of thousands of comparisons in Land. There are two ways to perform Gene Set Analysis in Land.

Search multiple genes or a pathway in Land

  • Perform a search of multiple genes or a pathway:

GeneSetAnalysis06.png

  • Gene Set Analysis can be found under Enrichment Analysis. The analysis is based on Fisher's exact test.

GeneSetAnalysis07.png

Using Custom Gene Set

Users can create their own gene sets. Gene sets can be imported and managed via "Manage => Genes => Manage Gene Sets".

GeneSetAnalysis01.png

A gene set must have at least one column containing gene symbols. Other related columns can be included as well. In particular, including fold-change and P-Value measurements makes the gene set more useful, because those values determine which statistical test can be applied.

The following terms are recognized as column headers in GeneSet files:

  • Fold change (recognized column titles include "Log2FoldChange", "FoldChange", and "Estimate")
  • Raw p-value (recognized column titles include "RawPValue", "p-value", "pvalue", and "PValue")
  • Adjusted p-value (recognized column titles include "AdjustedPValue")
  • General p-value (recognized column titles include "GeneralPValue")

If any of these terms is found in the column headers, the data in those columns will be used in the Gene Set Enrichment Analysis tests.

An example gene set is shown below:

GeneSetAnalysis02.png

(Note: The table may display spaces between words in a column header to aid readability, but terms such as "FoldChange" and "RawPValue" are recognized only as single terms, and the single-term form is preserved in the actual table object. Hover your mouse over a column header to confirm proper formatting.)
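The header rules above can be sketched as a small lookup. This is only an illustration, not Land's implementation: the role names, the `classify_headers` helper, and the "GeneSymbol" column title are hypothetical, and exact, case-sensitive matching is an assumption.

```python
# Recognized column titles, copied from the list above. The role names
# ("fold_change", "raw_p", ...) are hypothetical labels for this sketch.
RECOGNIZED_HEADERS = {
    "fold_change": {"Log2FoldChange", "FoldChange", "Estimate"},
    "raw_p": {"RawPValue", "p-value", "pvalue", "PValue"},
    "adjusted_p": {"AdjustedPValue"},
    "general_p": {"GeneralPValue"},
}


def classify_headers(headers):
    """Map each recognized role to the first column title that matches it."""
    roles = {}
    for title in headers:
        for role, accepted in RECOGNIZED_HEADERS.items():
            # Assumption: matching is exact, so "Fold Change" (with a space)
            # is not treated as a fold-change column.
            if title in accepted and role not in roles:
                roles[role] = title
    return roles


# "GeneSymbol" is a hypothetical title for the gene symbol column.
print(classify_headers(["GeneSymbol", "FoldChange", "RawPValue"]))
print(classify_headers(["GeneSymbol", "Fold Change"]))
```

A file whose headers yield an empty mapping would carry gene symbols only.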

Select Gene Set Analysis under Enrichment Analysis.

GeneSetAnalysis09.png

Choose the gene set to analyze:

GeneSetAnalysis10.png

Statistical Tests

A different statistical test is chosen for Gene Set Analysis depending on whether P-Values and/or fold change values are available in the user's gene set.
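As a rough summary of the selection rules detailed in the two subsections below, the choice can be written as a short function. The function name, its parameters, and the returned strings are hypothetical and exist only for this sketch.

```python
def choose_test(has_p_values, has_fold_changes, from_multi_gene_search=False):
    """Return the name of the test implied by the rules in this section."""
    # "Search Multiple Genes" always leads to Fisher's exact test.
    if from_multi_gene_search:
        return "fisher"
    # Any P-Value or fold change column in the gene set leads to Wilcoxon.
    if has_p_values or has_fold_changes:
        return "wilcoxon"
    # Gene symbols only: fall back to Fisher's exact test.
    return "fisher"


print(choose_test(has_p_values=False, has_fold_changes=False))
print(choose_test(has_p_values=True, has_fold_changes=False))
```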

Fisher's exact test

  • If neither P-Values nor fold changes are provided in the user's gene set, Fisher's exact test will be used for gene set analysis in Land.
  • When the "Search Multiple Genes" function is used, Fisher's exact test will also be used for gene set analysis in Land.
    • GeneSetAnalysis06.png
  • All genes in the user's gene set will be considered significant genes.
  • For each comparison table already included in Land, significant genes are chosen by the following criteria:
    • P-Value < 0.05
    • Fold change >= 1.25
    • If more than 500 genes pass the above filters, the 500 genes with the smallest P-Values will be chosen as significant genes.
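The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not Land's actual code: the function names are hypothetical, the fold-change filter is assumed to apply to the absolute value of a signed fold change, the test is assumed to be one-sided (enrichment), and the background `universe` of genes is an assumption because the page does not say how it is defined.

```python
from math import comb


def select_significant(rows, top_n=500, p_cutoff=0.05, fc_cutoff=1.25):
    """Pick significant genes from one comparison.

    rows: iterable of (gene, p_value, fold_change) tuples.
    """
    # Assumption: the filter uses the absolute value of a signed fold change.
    passing = [r for r in rows if r[1] < p_cutoff and abs(r[2]) >= fc_cutoff]
    passing.sort(key=lambda r: r[1])  # smallest P-Values first
    return {gene for gene, _, _ in passing[:top_n]}


def fisher_enrichment_p(user_genes, significant_genes, universe):
    """One-sided Fisher's exact P-Value for the overlap of two gene sets."""
    universe = set(universe)
    user = set(user_genes) & universe
    sig = set(significant_genes) & universe
    n_total, n_user, n_sig = len(universe), len(user), len(sig)
    overlap = len(user & sig)
    # Hypergeometric upper tail: probability of an overlap at least this large.
    tail = sum(
        comb(n_sig, k) * comb(n_total - n_sig, n_user - k)
        for k in range(overlap, min(n_user, n_sig) + 1)
    )
    return tail / comb(n_total, n_user)


# Toy data: ten hypothetical genes, three in the user's set, four significant.
universe = set("abcdefghij")
print(fisher_enrichment_p({"a", "b", "c"}, {"a", "b", "d", "e"}, universe))
```

Every gene in the user's set counts as significant on the user's side, so only the comparison needs the P-Value and fold change filters.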

Wilcoxon test

  • If P-Values and/or fold changes are provided in the user's gene set, Wilcoxon tests (nonparametric) will be used for gene set analysis in Land.
  • If Estimates (log2 fold changes) are provided, the values must be signed.
  • For each comparison table already included in Land, significant genes are chosen by the following criteria:
    • P-Value < 0.05
    • Fold change >= 1.25
    • If more than 100 genes pass the above filters, the 100 genes with the smallest P-Values will be chosen as significant genes.
  • These significant genes are used to divide the user's gene set into a significant group and an insignificant group, and the Wilcoxon test is performed on these two groups.
  • If only P-Values are available in the user's gene set, the P-Values will be used to perform the Wilcoxon test.
  • If only fold changes are available in the user's gene set, the fold changes will be used to perform the Wilcoxon test.
  • If both P-Values and fold changes are available in the user's gene set, the Wilcoxon test will be performed independently on the P-Values and on the fold changes, and the smaller of the two resulting P-Values will be reported as the final P-Value.
  • Estimates (log2 fold changes) and fold changes will return the same results, because the Wilcoxon test is nonparametric and depends only on the rank order of the values.
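The grouping and "smaller P-Value wins" rules above can be sketched as follows. This is a hedged illustration rather than Land's implementation: it assumes the two-group test is a two-sided Wilcoxon rank-sum test, uses a normal approximation without tie correction, and all function names and the `{"p": ..., "fc": ...}` record layout are hypothetical.

```python
from math import erfc, log2, sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_p(group_a, group_b):
    """Two-sided rank-sum P-Value (normal approximation, no tie correction)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean = n_a * n_b / 2
    sd = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return erfc(abs((u - mean) / sd) / sqrt(2))


def gene_set_p(user_set, significant_genes):
    """user_set: {gene: {"p": ..., "fc": ...}}; either key may be absent."""
    p_values = []
    for key in ("p", "fc"):
        sig = [v[key] for g, v in user_set.items()
               if key in v and g in significant_genes]
        rest = [v[key] for g, v in user_set.items()
                if key in v and g not in significant_genes]
        if sig and rest:
            p_values.append(rank_sum_p(sig, rest))
    # When both columns are present, the smaller P-Value is reported.
    return min(p_values) if p_values else None


# Ranks are unchanged by log2, so fold changes and Estimates agree.
fc_sig, fc_rest = [1.5, 2.0, 4.0], [1.1, 1.2, 0.8]
same = rank_sum_p(fc_sig, fc_rest) == rank_sum_p(
    [log2(x) for x in fc_sig], [log2(x) for x in fc_rest])
print(same)
```

The final check illustrates the last bullet: a monotonic transformation such as log2 leaves the ranks, and therefore the test result, unchanged.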

Gene Set Analysis Result

Gene Set Analysis (Plot)

Users can select any comparisons of interest in the plot, and a table will be displayed under the plot showing the details of the selected comparisons.

GeneSetAnalysis04.png

Gene Set Analysis (Table)

GeneSetAnalysis05.png