For our insights reports, we use the framework laid out by Keren et al. in their paper “A Structured Tumor-Immune Microenvironment in Triple Negative Breast Cancer Revealed by Multiplexed Ion Beam Imaging”. In this paper, tumor samples were classified into three groups:

  1. “Excluded” tumors, with very little or no immune infiltrate
  2. “Infiltrated” tumors with more immune infiltrate, which are further subdivided into two classes:
    1. “Mixed” tumors, with high levels of interaction between immune and tumor cells
    2. “Compartmentalized” tumors, with immune cells present but in less extensive contact with tumor cells

To identify these groupings, a threshold is first drawn on the percentage of immune cells across all samples, identifying which tumors are “Excluded” or “Infiltrated”.

Then, the tumors classified as “Infiltrated” are further subdivided as either “Mixed” or “Compartmentalized” by examining how often immune cells are in contact with tumor cells. For each sample, a mixing score is calculated as the ratio of heterotypic interactions between immune and tumor cells to homotypic interactions between immune cells: $\text{mixing score} = \frac{\#\ \text{immune::tumor interactions}}{\#\ \text{immune::immune interactions}}$. A higher mixing score indicates more interactions between immune and tumor cells, and therefore a more “Mixed” sample. A threshold is then drawn on this mixing score depending on the dataset at hand, classifying all “Infiltrated” tumors as either “Mixed” or “Compartmentalized”.
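The two-step classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the exact implementation behind the reports: the function names, the argument names, and the threshold values used in the usage note are all hypothetical, since the real thresholds are chosen per dataset. How a sample with zero immune::immune interactions is handled is also an assumption made here.

```python
def mixing_score(immune_tumor: int, immune_immune: int) -> float:
    """Ratio of immune::tumor interactions to immune::immune interactions."""
    if immune_immune == 0:
        # Assumption: the ratio is undefined without any immune::immune
        # interactions, so this sketch refuses to score such a sample.
        raise ValueError("mixing score is undefined with no immune::immune interactions")
    return immune_tumor / immune_immune


def classify_tumor(
    immune_fraction: float,
    immune_tumor: int,
    immune_immune: int,
    immune_threshold: float,
    mixing_threshold: float,
) -> str:
    """Classify one sample as Excluded, Mixed, or Compartmentalized.

    Both thresholds are dataset-dependent and must be supplied by the caller.
    """
    # Step 1: threshold on the fraction of immune cells in the sample.
    if immune_fraction < immune_threshold:
        return "Excluded"
    # Step 2: threshold on the mixing score for infiltrated samples.
    score = mixing_score(immune_tumor, immune_immune)
    return "Mixed" if score > mixing_threshold else "Compartmentalized"
```

For example, with placeholder thresholds of 0.2 for the immune fraction and 0.5 for the mixing score, a sample with 40% immune cells, 80 immune::tumor interactions, and 100 immune::immune interactions has a mixing score of 0.8 and would be labeled “Mixed”.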

If a metadata trait specifying cohort definitions is provided, we will report the result of a $\chi^2$ test to evaluate whether your cohorts have different proportions of each type of tumor structure.
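To make the cohort comparison concrete, the sketch below computes the Pearson $\chi^2$ statistic and its degrees of freedom for a cohort-by-structure contingency table in plain Python. The function name and the table layout (one row per cohort, one column per tumor structure) are assumptions for illustration, and the counts in the usage note are made up. Turning the statistic into a p-value is left out here and would typically be done with a statistics library.

```python
from typing import List, Tuple


def chi_square_statistic(table: List[List[int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom.

    `table` holds observed counts: rows are cohorts, columns are
    tumor structure classes (hypothetical layout for this sketch).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        # An empty row or column gives an expected count of zero,
        # which would make the statistic undefined.
        raise ValueError("every cohort and every class needs at least one sample")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of cohort and structure.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof
```

Calling `chi_square_statistic([[2, 5, 3], [6, 2, 2]])` would compare two hypothetical cohorts across the Excluded, Compartmentalized, and Mixed columns. With counts this small the $\chi^2$ approximation can be unreliable, which is one reason larger datasets are preferable.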

If multiple immune phenotypes or biomarker positivity annotations are provided, we will report the results of statistical tests comparing the frequencies of these cell phenotypes across the tumor structure classifications. The exact tests run will depend on the number of samples classified as “Excluded”, “Compartmentalized”, and “Mixed”.

A dataset should consist of at least 10 independent biological samples to have a reasonable chance of reaching statistical significance once the samples are split across the classifications.