From Array Suite Wiki
SingleCell RNA-seq QC Metrics
Like the RNA-Seq QC Metrics module, which reports aligned QC statistics at the sample level, the SC RNA-Seq QC Metrics module runs multiple QC commands simultaneously and reports aligned QC statistics at the cell level.
To access this module, please go to Analysis | NGS | Single Cell RNA-Seq | SC RNA-Seq QC Metrics
Input Data Requirements
This module requires BAM files produced by Barcoded alignment of Single Cell FASTQ files as input.
- Genome - Select the same genome that was used to align these BAM files.
- Gene model - Select the same Gene model used for mapping the sequences.
- Job number - The total number of jobs to run in parallel.
- Exclude duplicate alignments - Only one read in each set of duplicate aligned reads will be counted. Duplicates are identified by the duplication flag.
- Exclude failed alignments - Reads with the flag "ReadIsFailed" will be excluded. See also FLAG
- Exclude secondary alignments - Only the primary read alignment will be used in the QC metrics (OSA randomly flags one of the tied alignments as primary and the others as secondary).
- Exclude multi-reads (ZC tag required) - Multi-reads are non-unique reads (i.e. reads that align to multiple genomic locations with equal or similar numbers of mismatches). Selecting this option will include only unique reads when performing the SNP summarization.
- Exclude singletons (paired end required) - Paired reads whose mates did not both map to the same region will not be counted.
- QC Metrics - Select the desired metrics to appear in the resulting report.
- Alignment: Overall quality of mapping, such as unique mapping rate, paired rate, etc.
- Flag: Summarizes SAM flags on reads.
- Profile: Mapping rate to different genomic features, such as exons, introns, junctions, and insertions/deletions.
- Source: What category of transcripts were mapped, such as protein-coding genes, ncRNAs, pseudogenes, etc.
- InsertSize: Distribution of fragment insert sizes for paired-end experiments. The inferred distribution should reflect library-preparation size selection.
- Duplication: Duplication level, based on genomic/transcriptomic coordinates.
- Coverage: How many genes were detected as being transcribed in the sequencing reaction, at different expression cutoffs.
- Strand: Orientation of mapped reads, relative to gene transcription orientation.
- Feature: Count of CDS, Exon, Gene, and Transcript coverage.
- Output name: The user can choose to name the output data object.
- Output folder: An output folder can be specified for the report.
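Three of the exclusion options above (duplicate, failed, and secondary alignments) correspond to standard SAM flag bits. The following is a minimal sketch of how flag-based exclusion could work, not the module's own implementation; the helper name `keep_read` and its parameters are hypothetical, and the multi-read (ZC tag) and singleton options are not covered.

```python
# Standard SAM flag bits, as defined in the SAM specification.
FLAG_SECONDARY = 0x100   # not the primary alignment
FLAG_QC_FAILED = 0x200   # read fails platform/vendor quality checks
FLAG_DUPLICATE = 0x400   # PCR or optical duplicate


def keep_read(flag, exclude_duplicates=True, exclude_failed=True,
              exclude_secondary=True):
    """Return True if an alignment with this SAM flag would be counted.

    Hypothetical helper for illustration only.
    """
    if exclude_duplicates and flag & FLAG_DUPLICATE:
        return False
    if exclude_failed and flag & FLAG_QC_FAILED:
        return False
    if exclude_secondary and flag & FLAG_SECONDARY:
        return False
    return True


# Made-up flag values: 16 is a reverse-strand read, 1040 is 1024 + 16.
flags = [0, 16, 256, 512, 1024, 1040]
kept = [f for f in flags if keep_read(f)]
print(kept)  # [0, 16]
```

With every option enabled, only alignments that carry none of the three bits are counted; passing `False` for an option lets reads with that bit through.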
By default, the report table contains four features, filtered by the same criteria used for Land data filtering:
- Alignment_Mapped >=1000
- Alignment_MappedRate >=0.3
- Source_MitochondrialRate <=0.2
- Coverage_GeneWithCoverage >=250
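As a rough illustration of the four default criteria, the sketch below applies them to a few made-up cells. It assumes a cell is kept only when it satisfies all four thresholds; the function name `passes_qc`, the cell names, and the metric values are hypothetical and are not module output.

```python
# Default thresholds as listed above: (column, comparison, cutoff).
DEFAULT_FILTERS = [
    ("Alignment_Mapped", ">=", 1000),
    ("Alignment_MappedRate", ">=", 0.3),
    ("Source_MitochondrialRate", "<=", 0.2),
    ("Coverage_GeneWithCoverage", ">=", 250),
]


def passes_qc(cell, filters=DEFAULT_FILTERS):
    """Return True if the cell meets every threshold (assumed AND logic)."""
    for column, op, cutoff in filters:
        value = float(cell[column])
        if op == ">=" and value < cutoff:
            return False
        if op == "<=" and value > cutoff:
            return False
    return True


# Made-up per-cell metrics, keyed by cell name.
cells = {
    "CELL_A": {"Alignment_Mapped": 5000, "Alignment_MappedRate": 0.85,
               "Source_MitochondrialRate": 0.05, "Coverage_GeneWithCoverage": 1200},
    "CELL_B": {"Alignment_Mapped": 800, "Alignment_MappedRate": 0.90,
               "Source_MitochondrialRate": 0.05, "Coverage_GeneWithCoverage": 300},
    "CELL_C": {"Alignment_Mapped": 2000, "Alignment_MappedRate": 0.60,
               "Source_MitochondrialRate": 0.35, "Coverage_GeneWithCoverage": 900},
    "CELL_D": {"Alignment_Mapped": 1000, "Alignment_MappedRate": 0.3,
               "Source_MitochondrialRate": 0.2, "Coverage_GeneWithCoverage": 250},
}

passing = [name for name, metrics in cells.items() if passes_qc(metrics)]
print(passing)  # ['CELL_A', 'CELL_D']
```

CELL_D sits exactly on every cutoff and is kept because the comparisons are inclusive; CELL_B fails on mapped reads and CELL_C on mitochondrial rate.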
To add a feature to the output table, move it from the left box to the right box.
If the option Export full single cell QC matrix is checked, a very large matrix table containing all available features will be generated. See the Output section for more details.
An SC RNA-Seq QC Metrics Table will be generated. Unlike the RNA-Seq QC Metrics report for bulk RNA-Seq, this table uses cell names as row IDs and the alignment statistics as column names. Otherwise, its content is the same as the RNA-Seq QC Metrics Table. Additional information on SC RNA-seq QC metrics can be found here.
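To make the table layout concrete (one row per cell, one column per statistic), here is a minimal sketch that reads such a table from tab-delimited text. The export format, the `Cell` header, the function name `load_qc_table`, and all values are assumptions for illustration, not something the module is documented to produce.

```python
import csv
import io
import statistics

# Made-up tab-delimited export: the first column holds the cell name (row ID).
exported = (
    "Cell\tAlignment_Mapped\tAlignment_MappedRate\n"
    "CELL_A\t5000\t0.85\n"
    "CELL_B\t800\t0.90\n"
    "CELL_C\t2000\t0.60\n"
)


def load_qc_table(handle):
    """Read a cell-by-metric table into {cell_name: {metric: value}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    id_column = reader.fieldnames[0]  # first column is treated as the row ID
    table = {}
    for row in reader:
        cell = row.pop(id_column)
        table[cell] = {name: float(value) for name, value in row.items()}
    return table


table = load_qc_table(io.StringIO(exported))
mapped = [metrics["Alignment_Mapped"] for metrics in table.values()]
print(sorted(table))               # ['CELL_A', 'CELL_B', 'CELL_C']
print(statistics.median(mapped))   # 2000.0
```

Keying the table by cell name mirrors the row-ID layout described above and makes per-column summaries across cells straightforward.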
By default, the report table contains four features, filtered by the criteria mentioned above:
If the option Export full single cell QC matrix was checked, a very large matrix table containing all available features will be generated:
In addition to the QCMatrix table, an associated QCMatrixPlot is generated, showing a scatter plot for each column in the table: