The Unsupervised Clustering extension creates new cell annotation sets using unsupervised clustering techniques, specifically Louvain or Leiden clustering. It serves as an alternative to the Gating extension for cell phenotyping.

The extension also lets you analyze and visualize the output of a clustering run with UMAP and heatmap plots, and label the resulting clusters.

To navigate to the Unsupervised Clustering Extension, select the extension in the Analysis Toolkit tab.

Unsupervised Clustering Overview

The first page is the Unsupervised Clustering Overview page. Select a study using the dropdown at the top of the page and view an overview of any previous or current clustering runs that have been launched using the extension.

From this page, you can either launch a new Unsupervised Clustering Run or view Unsupervised Clustering Run Details.

Launching an Unsupervised Clustering Run

To launch a new unsupervised clustering run, click the “New Run” button on the overview page. This will open a modal with several fields that must be specified:

Field: Description
Clustering Run Name: The name of your clustering run.
Description: A text description of your clustering run. Clustering parameters are recorded automatically, so this field is intended for a high-level summary.
Clustering Approach: Leiden or Louvain.
# Neighbors: See the scanpy.pp.neighbors documentation for details.
UMAP Min Distance: See the scanpy.tl.umap documentation for details.
Clustering Resolution: See the scanpy.tl.louvain or scanpy.tl.leiden documentation for details.
Biomarker Expression Version: The expression version that your clustering run will be based on.
Quality Control Filter: A quality control (QC) filter applied to the cells before clustering. This option is visible only if there is an active QC filter for the biomarker expression version you have selected.
Regions: The regions you would like to include in this clustering run. Only regions that are included in the selected biomarker expression version and QC filter are shown.
Clustering Biomarkers: The biomarkers you would like to use in clustering. These are limited to biomarkers that are present in all clustering Regions, and exclude biomarkers that appear multiple times in a region’s panel.
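
The three numeric fields correspond to keyword arguments in scanpy's public API: n_neighbors for scanpy.pp.neighbors, min_dist for scanpy.tl.umap, and resolution for scanpy.tl.leiden or scanpy.tl.louvain. The following is a minimal sketch of that correspondence only. How the extension calls scanpy internally is not documented here, and the function name scanpy_kwargs, the form-field keys, and the example values are all hypothetical.

```python
def scanpy_kwargs(form):
    """Translate form values (hypothetical keys) into scanpy keyword arguments."""
    approach = form["clustering_approach"].lower()
    if approach not in ("leiden", "louvain"):
        raise ValueError(f"unsupported clustering approach: {approach!r}")
    return {
        # "# Neighbors" field -> scanpy.pp.neighbors(n_neighbors=...)
        "neighbors": {"n_neighbors": int(form["n_neighbors"])},
        # "UMAP Min Distance" field -> scanpy.tl.umap(min_dist=...)
        "umap": {"min_dist": float(form["umap_min_distance"])},
        # "Clustering Resolution" field -> scanpy.tl.leiden / scanpy.tl.louvain(resolution=...)
        approach: {"resolution": float(form["clustering_resolution"])},
    }


# Illustrative values only; these are not the extension's defaults.
example = scanpy_kwargs({
    "clustering_approach": "Leiden",
    "n_neighbors": 15,
    "umap_min_distance": 0.5,
    "clustering_resolution": 1.0,
})

# With scanpy installed and an AnnData object `adata`, these could be applied as:
#   sc.pp.neighbors(adata, **example["neighbors"])
#   sc.tl.umap(adata, **example["umap"])
#   sc.tl.leiden(adata, **example["leiden"])
```

Keeping the mapping separate from the scanpy calls makes it easy to try the same parameter values in your own notebook when you want to understand how a setting affects the result.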

When selecting Regions, you can choose individual regions from the study or select “All Regions”. You must select at least 1 region. To narrow the list, click the filter icon on the left side of the dropdown and filter the regions by the metadata traits specified in the designer.

You can then stage regions, each with its own configuration. In the Unsupervised Clustering extension, the Biomarker Expression Version and Quality Control Filter are set and staged at the region level, so you can specify the appropriate expression version and QC filter for each region. Once you have staged all of the configurations you are interested in, click Select Biomarkers to continue.
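
One way to picture staging is as a list of per-region records, each carrying its own expression version and optional QC filter. The sketch below is an illustration of that idea, not the extension's data model: the StagedRegion class, its field names, both helper functions, and the region, version, and filter names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StagedRegion:
    region: str
    expression_version: str
    qc_filter: Optional[str] = None  # None means no QC filter is applied


def validate_staged(staged):
    """Enforce the rule that at least 1 region must be selected."""
    if not staged:
        raise ValueError("at least 1 region must be staged")
    return staged


def group_by_configuration(staged):
    """Group region names by their (expression version, QC filter) pair."""
    groups = defaultdict(list)
    for item in staged:
        groups[(item.expression_version, item.qc_filter)].append(item.region)
    return dict(groups)


staged = [
    StagedRegion("Region A", "v1", "QC strict"),
    StagedRegion("Region B", "v1", "QC strict"),
    StagedRegion("Region C", "v2"),
]
groups = group_by_configuration(validate_staged(staged))
```

Grouping by configuration gives a quick summary of which regions share an expression version and QC filter before you move on to biomarker selection.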

The Clustering Biomarkers dropdown will automatically update to include all biomarkers that are shared across the staged regions. You must select at least 1 biomarker. You can click the filter icon on the left side of this dropdown to filter the list of biomarkers shown based on the biomarker grades given in the visualizer.
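
The rule for which biomarkers appear in the dropdown can be sketched as a set intersection. The example assumes one reading of the rule: a biomarker is eligible only if it appears exactly once in the panel of every staged region. The function name eligible_biomarkers and the panel contents are hypothetical.

```python
from collections import Counter


def eligible_biomarkers(panels):
    """Return biomarkers that appear exactly once in every region's panel, sorted by name."""
    eligible = None
    for panel in panels.values():
        counts = Counter(panel)
        # Drop biomarkers that appear more than once in this region's panel.
        unique_in_region = {name for name, n in counts.items() if n == 1}
        eligible = unique_in_region if eligible is None else eligible & unique_in_region
    return sorted(eligible or [])


panels = {
    "Region A": ["CD3", "CD8", "DAPI", "DAPI"],  # DAPI is duplicated here
    "Region B": ["CD3", "CD8", "CD20", "DAPI"],  # CD20 is missing from Region A
}
shared = eligible_biomarkers(panels)
print(shared)  # ['CD3', 'CD8']
```

Here DAPI is dropped because it appears twice in Region A's panel, and CD20 is dropped because Region A does not include it.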

After filling out the form, click Submit to start the run. When the run completes, a notification is sent and the results appear on the Unsupervised Clustering Run Details page. A run generally takes between 30 minutes and 1 hour.

Note: A normalization step runs automatically before any analysis.