Introduction

Feature Selection sub-tab in Feature Selection & Dimensionality Reduction tab offers a convenient way to compute and select the most variable features that show the highest biological variability to use them in the downstream analysis. The tab offers multiple methods (specified below) to compute the variable genes, which can be visualized in the comparison to the remaining features through a plot on right side.

General Workflow

A general workflow for the Feature Selection sub-tab is summarized in the figure below:

Workflow Guide

  • Select Feature Selection & Dimensionality Reduction tab from the top menu. This workflow guide assumes that the data as been previously uploaded, filtered and normalized before proceeding with this tab.

  • Select Feature Selection sub-tab (selected by default) to open up the feature selection user-interface.

  • The Feature Selection sub-tab is divided into three panels namely, a) Compute HVG, b) Select and Subset, and c) Plot.

The working of sections a, b and c are described below:

a) Compute HVG

The Compute HVG window allows the processing of highly variable genes by selecting an appropriate method either from Seurat (vst, mean.var.plot, dispersion) or Scran (modelGeneVar) packages.

A numeric value indicating the number of features to identify must be set (default is 2000) and an assay must be selected from the list of available assays.

b) Select and Subset

Once the highly variable genes have been computed in (a), subset of these features can be selected for downstream analysis. A numeric value (default is 100) can be input to set the number of genes that should be displayed in (b), labeled (highlighted in red) in the plot (c) and selected for further analysis in the succeeding tabs as a subset.

c) Plot

A plot is generated against the HVG computation in (a) and the number of genes to display set in (b) are labeled over corresponding data points in the plot.