From Array Suite Wiki



t-SNE V2 clustering


The Rtsne module in Array Studio lets the user cluster cells from UMI count data using the Rtsne package in R: T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation. t-SNE is a method for constructing a low-dimensional embedding of high-dimensional data, distances, or similarities. It has become a standard method for clustering subgroups of cells when analyzing single-cell sequencing data. This function is intended for single-cell UMI count data and runs Rtsne directly in the R engine integrated with ArrayStudio.

If the user has not run Rtsne in ArrayStudio before and needs to set it up, please follow this wiki page: Setup tSNE in R engine.

To open this module, please go to Analysis | NGS | Single Cell RNA-Seq | t-SNE Clustering | t-SNE Clustering (V2).

Tsnev2 01.png


Input Data Requirements

This module works on -Omic data objects and Zero inflated binary matrix (ZIM) data.


General Options

The user can choose to perform this analysis locally:

Tsne v2 local.png

Or perform this analysis on the server:

Tsne v2 server.png

Note: the Perplexity value should be less than (observations - 1) / 3.
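
The perplexity bound can be checked before launching a run. The following is a minimal Python sketch that only restates the rule above; it is not part of Array Studio or Rtsne, and the function names `max_perplexity` and `is_valid_perplexity` are hypothetical.

```python
def max_perplexity(n_observations: int) -> float:
    """Exclusive upper bound on perplexity for a given number of observations (cells)."""
    if n_observations < 2:
        raise ValueError("need at least 2 observations")
    return (n_observations - 1) / 3


def is_valid_perplexity(perplexity: float, n_observations: int) -> bool:
    """True when perplexity is positive and strictly below (observations - 1) / 3."""
    return 0 < perplexity < max_perplexity(n_observations)


if __name__ == "__main__":
    # With 100 cells the bound is (100 - 1) / 3 = 33.0,
    # so a perplexity of 30 is accepted and 33 is rejected.
    print(max_perplexity(100))
    print(is_valid_perplexity(30, 100))
    print(is_valid_perplexity(33, 100))
```

For small datasets this means the perplexity may need to be lowered before the analysis will run.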



  • Project & Data: The window includes a dropdown box to select the Project and Data object to be analyzed.
  • Variables: Selections can be made on which variables should be included in the analysis (options include All variables, Selected variables, Visible variables, and Customized variables (select any pre-generated Lists)).
  • Observations: Selections can be made on which observations should be included in the analysis (options include All observations, Selected observations, Visible observations, and Customized observations (select any pre-generated Lists)).
  • Output name: The user can choose to name the output data object.



  • Preprocessing:
    • Log2 Transformation: logical; checked by default
    • Filter raw data: logical; checked by default
    • Save filtered data: logical; checked by default
    • UMI: logical; check this option if the input data are UMI counts
    • Log2 TPM cut off: numeric; a threshold corresponding to the Log2 Transformation (default: 0.01)
    • Min observations per gene: numeric; threshold used to filter out genes detected in too few observations (default: 3)
    • Min genes per cell: integer; threshold used to filter out cells with too few detected genes (default: 200)
    • Check duplicates: logical; checks whether duplicates are present. It is generally assumed that there are no duplicates. The user can verify beforehand that no duplicates are present and set this option to FALSE, especially for large datasets (default: FALSE)
  • PCA settings :
    • Initial PCA dimensions: integer; the number of dimensions that should be retained in the initial PCA step (default: 50)
    • Center data before PCA: logical; whether the data should be centered before PCA is applied (default: TRUE)
    • Scale data before PCA: logical; whether the data should be scaled before PCA is applied (default: FALSE)
    • Partial PCA:
    • Run initial PCA: logical; Whether an initial PCA step should be performed (default: TRUE)
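
To illustrate how the Preprocessing thresholds above might interact, here is a small Python sketch. It is an illustration under stated assumptions, not the module's actual implementation: the function name `filter_and_log2` is hypothetical, a gene counts as "detected" in a cell when its count is greater than zero, genes are filtered before cells, and the transform is assumed to be log2(count + 1). The Log2 TPM cut off is not modeled because its exact use is not described here.

```python
import math


def filter_and_log2(counts, min_obs_per_gene=3, min_genes_per_cell=200):
    """Filter a genes-by-cells count table, then log2-transform it.

    counts: dict mapping gene name -> list of counts, one entry per cell.
    Returns (kept_genes, kept_cell_indices, log2_matrix).
    """
    n_cells = len(next(iter(counts.values()))) if counts else 0

    # Keep genes detected (count > 0) in at least min_obs_per_gene cells.
    kept_genes = [gene for gene, row in counts.items()
                  if sum(1 for c in row if c > 0) >= min_obs_per_gene]

    # Keep cells with at least min_genes_per_cell detected genes (among kept genes).
    kept_cells = [j for j in range(n_cells)
                  if sum(1 for g in kept_genes if counts[g][j] > 0) >= min_genes_per_cell]

    # Assumed transform: log2(count + 1), so zero counts map to zero.
    log2_matrix = {g: [math.log2(counts[g][j] + 1) for j in kept_cells]
                   for g in kept_genes}
    return kept_genes, kept_cells, log2_matrix


if __name__ == "__main__":
    # Toy data with thresholds lowered so the effect is visible on 3 genes x 4 cells.
    toy = {"GeneA": [1, 0, 3, 7], "GeneB": [0, 0, 0, 5], "GeneC": [2, 1, 0, 1]}
    genes, cells, matrix = filter_and_log2(toy, min_obs_per_gene=2, min_genes_per_cell=2)
    print(genes, cells, matrix["GeneA"])
```

In the toy data, GeneB is dropped because it is detected in only one cell, and the two middle cells are dropped because each has only one detected gene left.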

WARNING: If the package compatibility is not OK, the R installation integrated with ArrayStudio is not ready to run Rtsne. Please refer to R implementation of t-SNE to configure Rtsne in ArrayStudio.


Advanced Options

Tsne v2 advanced.png

  • tSNE
    • Dimension: integer; Output dimensionality (default: 2)
    • Perplexity: numeric; Perplexity parameter
    • Theta: numeric; Speed/accuracy trade-off (increase for less accuracy), set to 0.0 for exact t-SNE (default: 0.5)
    • Max iteration: integer; Number of iterations (default: 1000)
    • Kmean cluster number lower/upper bound: The minimum and maximum number of cell clusters. Once clustering is performed, cells are automatically assigned to clusters according to their k-means identity
    • Stop lying iteration number: integer; Iteration after which the perplexities are no longer exaggerated (default: 250, except when Y_init is used, then 0)
    • Moment switch iteration number: integer; Iteration after which the final momentum is used (default: 250, except when Y_init is used, then 0)
    • Momentum: numeric; Momentum used in the first part of the optimization (default: 0.5)
    • Final Momentum: numeric; Momentum used in the final part of the optimization (default: 0.8)
    • Eta: numeric; Learning rate (default: 200.0)
    • Exaggeration factor: numeric; Exaggeration factor used to multiply the P matrix in the first part of the optimization (default: 12.0)
    • Set R Random Seed: logical, then integer; sets the R random seed so that results can be reproduced
    • Color by: drop-down list of factor columns from the design table
  • Subset dataset by sample mapping file
    • Sample mapping file :
    • Sample metadata column :
    • Sample metadata column value:
    • Append sample metadata :
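
The Stop lying iteration number, Moment switch iteration number, Momentum, Final Momentum, and Exaggeration factor options together define a two-phase optimization schedule. The Python sketch below illustrates that schedule using the defaults listed above. It is not Rtsne code: the function name `optimization_phase` is hypothetical, and whether the switch takes effect exactly at the stated iteration (0-based) or one step later is an assumption.

```python
def optimization_phase(iteration, stop_lying_iter=250, mom_switch_iter=250,
                       momentum=0.5, final_momentum=0.8, exaggeration_factor=12.0):
    """Return (momentum, exaggeration) in effect at a 0-based iteration."""
    # Early phase: the P matrix is multiplied by the exaggeration factor.
    # Assumption: exaggeration is removed from iteration == stop_lying_iter onward.
    exaggeration = exaggeration_factor if iteration < stop_lying_iter else 1.0

    # Assumption: the final momentum applies from iteration == mom_switch_iter onward.
    current_momentum = momentum if iteration < mom_switch_iter else final_momentum
    return current_momentum, exaggeration


if __name__ == "__main__":
    for it in (0, 249, 250, 999):
        print(it, optimization_phase(it))
```

With the defaults, the first 250 iterations use momentum 0.5 with the P matrix multiplied by 12, and the remaining iterations use momentum 0.8 with no exaggeration.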

Output Results

The Rtsne module will generate a table and a scatter plot view for this table in ArrayStudio:


An example of TsneScoreTable is shown below:


An example of a scatter plot of the two t-SNE dimensions computed by Rtsne is shown below. Each data point represents a cell:


Additional Options

Once the scatter plot is generated, the user can manually select cells that belong to the same cluster and assign a list name to each cluster:


Once all of the cells have been assigned a list name based on their distribution in the scatter plot, the user can select all the lists defined from this scatter plot, right click, and choose to add the list membership to the original TsneScoreTable:



The user can then go to the scatter plot, choose Change Symbol Properties, color the plot by Categorical value, and select the newly added ListMembership:


With this operation, a different color is assigned to each cluster:
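
Conceptually, adding list membership attaches one label per cell to the score table, and that label then drives the coloring. The Python sketch below illustrates the idea only; it is not how Array Studio implements the feature. The function name `add_list_membership`, the empty label for unassigned cells, and the ';' separator for cells that fall into several lists are all assumptions.

```python
def add_list_membership(cell_ids, lists, unassigned=""):
    """Return one membership label per cell, in table order.

    cell_ids: row IDs of the score table.
    lists: dict mapping list name -> iterable of cell IDs in that list.
    """
    membership = {cell: [] for cell in cell_ids}
    for name, members in lists.items():
        for cell in members:
            if cell in membership:  # ignore IDs that are not in the table
                membership[cell].append(name)
    # Assumption: multiple memberships are joined with ';'.
    return [";".join(membership[cell]) or unassigned for cell in cell_ids]


if __name__ == "__main__":
    cells = ["c1", "c2", "c3", "c4"]
    manual_lists = {"ClusterA": ["c1", "c3"], "ClusterB": ["c2"]}
    print(add_list_membership(cells, manual_lists))
```

Cells that were not placed in any list keep an empty label here, which makes it easy to spot cells that still need to be assigned.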

