Mask heatmap rows or columns

From Array Suite Wiki


Masking rows and columns of a heatmap


In Array Studio, the -Omic data Heatmap View dynamically hides rows (genes) and columns (samples) according to the filtering status of the underlying data. The heatmap produced by the Hierarchical Clustering function, however, will "mask" filtered rows/columns with grey instead of hiding them. This can be advantageous if you want to:

  1. Emphasize the scale of the data matrix
  2. Draw attention to a subset of the data

HC menu.png

Input Data Requirements

Masking of rows and columns can be performed on any -Omic data object, as long as its design table has a column (other than the ID Column) that uniquely labels each sample.

Microarray HierarchicalClustering NoClustering DesignTable.png

In addition, it is helpful to have either lists or metadata columns attached to the -Omic Data to identify the genes/samples to mask.
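
The uniqueness requirement on the design table column can be checked outside Array Studio before running the analysis. The following is a minimal Python sketch, assuming the design table has been exported as tab-delimited text; the column names (ObservationID, Group) and sample values are hypothetical, and nothing here calls an Array Studio API.

```python
import csv
import io
from collections import Counter


def duplicated_labels(design_rows, column):
    """Return the labels in `column` that occur more than once.

    An empty result means the column uniquely labels each sample.
    """
    counts = Counter(row[column] for row in design_rows)
    return sorted(label for label, n in counts.items() if n > 1)


# Hypothetical design table, as it might look when exported as tab-delimited text.
DESIGN_TEXT = (
    "ID\tObservationID\tGroup\n"
    "S1\tObs_1\tControl\n"
    "S2\tObs_2\tControl\n"
    "S3\tObs_3\tTreated\n"
)

design_rows = list(csv.DictReader(io.StringIO(DESIGN_TEXT), delimiter="\t"))

# Report, for each candidate column, whether it could serve as a unique sample label.
for column in ("ObservationID", "Group"):
    repeats = duplicated_labels(design_rows, column)
    status = "unique" if not repeats else f"repeated labels: {repeats}"
    print(f"{column}: {status}")
```

In this made-up table, ObservationID would qualify, while Group would not because one of its labels is shared by two samples.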

Step 1 (Optional): Create lists to pre-configure the order of rows and/or columns

Because the Hierarchical Clustering function will be run here without performing any actual clustering, it can be useful to pre-order the rows and columns.

In this example, a List was created for all 63,677 variable IDs (Ensembl IDs), using the alphabetical order of the associated Gene Names.
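
One way to prepare such an ordered list outside Array Studio is sketched below in Python. It assumes you already have a mapping from variable ID to gene name (for example, from an exported annotation table); the IDs shown are placeholders rather than real Ensembl IDs, and the function name is hypothetical.

```python
def order_ids_by_gene_name(id_to_name):
    """Return variable IDs sorted alphabetically by gene name.

    Sorting is case-insensitive; ties are broken by the ID itself so the
    result is deterministic.
    """
    ordered = sorted(id_to_name.items(), key=lambda item: (item[1].upper(), item[0]))
    return [variable_id for variable_id, _ in ordered]


# Placeholder IDs mapped to gene names (illustrative only).
annotation = {
    "ID_3": "TP53",
    "ID_1": "brca1",
    "ID_2": "EGFR",
    "ID_4": "MYC",
}

ordered_ids = order_ids_by_gene_name(annotation)

# One ID per line, a plain-text layout that is easy to paste into a new list.
print("\n".join(ordered_ids))
```

The resulting IDs can then be used to create the List that controls row order in Step 2.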

Step 2: Run Hierarchical Clustering (without actually clustering)

Now click MicroArray | Pattern | Hierarchical Clustering to open the Hierarchical Clustering window:

Microarray HierarchicalClustering NoClustering Window.png

  1. Select the input -Omic data
  2. (Optional) Select a custom list of variables and/or observations to control the ordering in the heatmap
  3. Leave Compute observation tree selected, and leave Compute variable tree de-selected
  4. In Observation grouping, select a column from the Design table that is unique for each sample (e.g. Observation ID). This prevents any clustering of the observations, even though "Compute observation tree" is selected.

Specify an output name, then click Submit.

Step 3: Mask Rows/Columns of interest

The default output heatmap will show all rows and columns. If you did not specify a customized order in the hierarchical clustering, you can sort heatmap rows by metadata columns.

Microarray HierarchicalClustering NoClustering DefaultHeatmap WithColorBar.png

(As a visual aid, a column containing the first letter of each gene name was added to the Annotation Table and is shown as the color bar along the Y-axis.)
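
A first-letter column like this can be derived before the annotation is attached. The Python sketch below assumes the annotation rows are available as dictionaries; the column names GeneName and FirstLetter are hypothetical and should be replaced with whatever your annotation table uses.

```python
def add_first_letter(annotation_rows, name_column="GeneName", new_column="FirstLetter"):
    """Return copies of the rows with an extra column holding the
    upper-cased first letter of the gene name (empty if the name is blank)."""
    result = []
    for row in annotation_rows:
        name = (row.get(name_column) or "").strip()
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row[new_column] = name[0].upper() if name else ""
        result.append(new_row)
    return result


# Placeholder annotation rows (illustrative only).
rows = [
    {"ID": "ID_1", "GeneName": "brca1"},
    {"ID": "ID_2", "GeneName": "EGFR"},
    {"ID": "ID_3", "GeneName": ""},
]

labelled = add_first_letter(rows)
for row in labelled:
    print(row["ID"], row["GeneName"], row["FirstLetter"], sep="\t")
```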

Use the View Controller Variable and Observation tabs to filter out genes and samples of interest. For example, right-click on GeneID to filter out all genes from a list (e.g. all genes starting with M-Q) with a MINUS List Filter:

Filter MinusList Menu.png

This will mask all genes in the list with grey:

Microarray HierarchicalClustering NoClustering FilteredHeatmap WithColorBar.png
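
A list such as "all genes starting with M-Q" can be assembled outside Array Studio and then used with the MINUS List Filter. The following Python sketch assumes a mapping from variable ID to gene name; the IDs are placeholders and the function name is hypothetical.

```python
def ids_in_letter_ranges(id_to_name, ranges):
    """Return the IDs whose gene name starts with a letter inside any of
    the inclusive (first, last) letter ranges, e.g. [("M", "Q")]."""
    selected = []
    for variable_id, name in id_to_name.items():
        first = name[:1].upper()  # empty string for blank names, which never matches
        if any(low <= first <= high for low, high in ranges):
            selected.append(variable_id)
    return selected


# Placeholder IDs mapped to gene names (illustrative only).
annotation = {
    "ID_1": "MYC",
    "ID_2": "TP53",
    "ID_3": "pten",
    "ID_4": "EGFR",
    "ID_5": "NRAS",
}

# Genes starting with M through Q, as in the example above.
genes_m_to_q = ids_in_letter_ranges(annotation, [("M", "Q")])
print("\n".join(genes_m_to_q))
```

Because the function accepts several ranges at once, the same helper can build a combined list such as C-G plus S-T by passing `[("C", "G"), ("S", "T")]`.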

Step 4 (Optional): Highlight rows or columns

In addition to filtering rows/columns, you can highlight specific rows/columns by selecting them in the Table View.

Step 4a: Select at least one column

Switch to the Table View, then select the columns that you want highlighted (at least one column must be selected; selected columns are indicated by green text):

Table SelectColumns.png

Step 4b: Select rows of interest

Then, select any rows you want highlighted. For large numbers of rows, it will be easiest to use List's Select Rows by List:

List SelectListAsRows Menu.png

In this example, a list containing Ensembl IDs for all genes starting with C-G or S-T was used to select all matching rows in the Table View.

Table SelectedRows.png

Note: the List selection can also be applied in the Heatmap view, but it is easier to demonstrate the underlying selection in the Table view.

Output Results

After selection, only rows/columns that were selected (manually or by list) will have a blue highlight.

Microarray HierarchicalClustering NoClustering FilteredSelectedHeatmap WithColorBar.png

Related applications

Advanced applications of this technique include appending inference data to the annotation table, then filtering or selecting data to highlight only the most significant results.
