Mask heatmap rows or columns
From Array Suite Wiki
Masking rows and columns of a heatmap
In Array Studio, the -Omic data Heatmap View dynamically hides rows (genes) and columns (samples), depending on the filtering status of the underlying data. By contrast, the heatmap produced by the Hierarchical Clustering function "masks" filtered rows/columns in grey instead of hiding them. This can be advantageous if you want to:
- Emphasize the scale of the data matrix
- Draw attention to a subset of the data
Input Data Requirements
Masking of rows and columns can be performed on any -Omic data object, as long as its design table has a column (other than the ID Column) that uniquely labels each sample.
In addition, it is helpful to have either lists or metadata columns attached to the -Omic data that identify the genes/samples to mask.
Step 1 (Optional): Create lists to pre-configure the order of rows and/or columns
Because the Hierarchical Clustering function is used here without actually clustering the data, it can be useful to pre-order the rows and columns.
In this example, a List was created for all 63,677 variable IDs (Ensembl IDs), using the alphabetical order of the associated Gene Names.
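If the ordered list is prepared outside Array Studio, the sorting step can be sketched as follows. This is a minimal illustration, not an Array Studio API: the `annotation` records are a small hand-typed sample, and the function name is hypothetical. In practice the ID/gene-name pairs would come from the annotation table.

```python
# Illustrative sample of (Ensembl ID, gene name) pairs; real input would be
# taken from the annotation table of the -Omic data.
annotation = [
    ("ENSG00000141510", "TP53"),
    ("ENSG00000012048", "BRCA1"),
    ("ENSG00000146648", "EGFR"),
    ("ENSG00000136997", "MYC"),
]


def ids_ordered_by_gene_name(records):
    """Return Ensembl IDs sorted alphabetically (case-insensitively) by gene name.

    Ties on gene name are broken by the Ensembl ID so the order is stable.
    """
    ordered = sorted(records, key=lambda rec: (rec[1].upper(), rec[0]))
    return [ensembl_id for ensembl_id, _gene_name in ordered]


ordered_ids = ids_ordered_by_gene_name(annotation)

# One ID per line, a plain layout that is easy to paste into a new List.
print("\n".join(ordered_ids))
```

The resulting IDs, one per line, can then be used as the custom variable list in Step 2.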
Step 2: Run Hierarchical Clustering (without actually clustering)
Now click MicroArray | Pattern | Hierarchical Clustering to open the Hierarchical Clustering window:
- Select the input -Omic data
- (Optional) Select a custom list of variables and/or observations to control the ordering in the heatmap
- Leave Compute observation tree selected, and leave Compute variable tree de-selected
- In Observation grouping, select a column from the Design table that is unique for each sample (e.g. Observation ID). This prevents any clustering of the observations (even though "Compute observation tree" is selected).
Specify an output name, then click Submit.
Step 3: Mask Rows/Columns of interest
The default output heatmap will show all rows and columns. If you did not specify a customized order in the hierarchical clustering, you can sort heatmap rows by metadata columns.
(As a visual aid, a column containing the first letter of each gene name was added to the Annotation Table and is shown as the color bar along the Y-axis.)
Use the View Controller Variable and Observation tabs to filter out genes and samples of interest. For example, right-click on GeneID to filter out all genes from a list (e.g. all genes starting with M-Q) with a MINUS List Filter:
This will mask all genes in the list, so that they appear in grey in the heatmap.
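Conceptually, a MINUS list filter marks every row whose ID appears in the list as filtered, while the row keeps its position in the heatmap. The sketch below only illustrates that idea and makes no claim about how Array Studio implements it. The function name is hypothetical, and gene names stand in for the IDs to keep the example readable.

```python
def apply_minus_list_filter(row_ids, list_ids):
    """Pair each row ID with a flag that is True when the row is masked.

    Rows keep their original order; only the flag changes. IDs in the list
    that do not occur among the rows are ignored.
    """
    excluded = set(list_ids)
    return [(row_id, row_id in excluded) for row_id in row_ids]


# Hypothetical heatmap rows (already ordered) and a list of genes to mask.
rows = ["ABL1", "MYC", "NRAS", "TP53"]
genes_to_mask = ["MYC", "NRAS", "PTEN"]

for row_id, masked in apply_minus_list_filter(rows, genes_to_mask):
    print(row_id, "masked" if masked else "shown")
```

Because the rows are flagged rather than removed, the heatmap keeps its full extent, which is what lets the masked region convey the scale of the data matrix.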
Step 4 (Optional): Highlight rows or columns
In addition to filtering rows/columns, certain rows/columns can be highlighted, by selecting those rows/columns in the Table View.
Step 4a: Select at least one column
Switch to the Table View, then select the columns that you want highlighted. At least one column must be selected; selected columns are indicated by green text.
Step 4b: Select rows of interest
Then, select any rows you want highlighted. For large numbers of rows, it will be easiest to use List's Select Rows by List:
In this example, a list containing Ensembl IDs for all genes starting with C-G or S-T was used to select all matching rows in the Table View.
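A list like this can be assembled outside Array Studio by testing the first letter of each gene name against the wanted letter ranges. The following is a minimal sketch under stated assumptions: the `annotation` sample and the helper name are hypothetical, and the real ID/gene-name pairs would come from the annotation table.

```python
def starts_in_ranges(gene_name, ranges):
    """Return True if the first letter of gene_name lies in any inclusive range."""
    first = gene_name[:1].upper()
    return any(low <= first <= high for low, high in ranges)


# Illustrative sample of (Ensembl ID, gene name) pairs.
annotation = [
    ("ENSG00000141510", "TP53"),
    ("ENSG00000012048", "BRCA1"),
    ("ENSG00000146648", "EGFR"),
    ("ENSG00000136997", "MYC"),
]

# Gene names starting with C-G or S-T, as in the example above.
letter_ranges = [("C", "G"), ("S", "T")]

selected_ids = [
    ensembl_id
    for ensembl_id, gene_name in annotation
    if starts_in_ranges(gene_name, letter_ranges)
]

print("\n".join(selected_ids))
```

The printed IDs can be saved as a List and then used to select the matching rows in the Table View.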
After selection, only rows/columns that were selected (manually or by list) will have a blue highlight.
Advanced applications of this technique include appending inference data to the annotation table, then filtering or selecting on those values to highlight only the most significant results.