Single Cell Toolkit (SCTK) allows users to generate quality control (QC) metrics for the single-cell data that they have imported. Users should enter the page through the way as shown in the screenshot above.

Here SCTK provides generic QC metrics, as well as contamination estimation by decontX, and doublet detection algorithms including scDblFinder, cxds, bcds, cxds_bcds_hybrid, doubletFinder, and scrublet. Note that “scrublet” is Python based. Users should have it configured before trying to apply it.

Select which algorithms to run

Users must check all algorithms which they would like to run at once. Rerunning the QC procedure will overwrite previous QC results as well as all downstream analysis. The parameters for the corresponding algorithms will show up as users check it. Users may adjust the parameters as needed. Detailed explanation for each parameter can be accessed via the “Help” button in the parameter region of each algorithm, as shown below.

check algorithms

Other settings before running


After checking off all the desired QC algorithms, users should also go through other settings including:
- Select assay: Choose the expression matrix that should be used for calculation.
- Select variable containing sample labels: The cell level annotation that labels the sample source of each cell.


A UMAP embedding will be calculated in the end when running SCTK QC workflow. It will be used across all QC algorithms for visualization of any scores or clusters. The parameters are already described in the UI. For more information, please refer to getUMAP()

Having everything checked and ready, users can then press “Run” to start the procedure.

Plot the results

Once the QC algorithms have finished running, plots for the metrics from each algorithm will automatically appear to the right of the QC algorithms’ panel in a tab set manner.


The QC metrics are generated sample wise. For example, in the “counts per cell” violin plot above, the results for sample “pbmc3k” and “pbmc4k” are plotted separately.


While some types of plots are not able to be grouped, for certain algorithms, there will be sub-tabs for each sample if multiple samples are presented in users dataset.