Nat Commun

Visual analysis of mass cytometry data by hierarchical stochastic neighbour embedding reveals rare cell types

Vincent van Unen, Thomas Höllt, Nicola Pezzotti, Na Li, Marcel Reinders, Elmar Eisemann, Anna Vilanova, Frits Koning, and Boudewijn P. F. Lelieveldt

Analysis of the CD4+ T-cell compartment in inflammatory intestinal diseases. a Third HSNE level embedding of the CD4+ T cells (1.4 × 10⁶ cells, selected in Fig. 3). Color and size of landmarks as described in Fig. 3. Right panel shows density features for the level 3 embedding. Blue encirclement indicates selection of landmarks representing CD28−CD4+ T cells. b Embedding of the CD28−CD4+ T cells (2.6 × 10⁴ cells) at single-cell resolution. Bottom-left panel shows yellow and black dashed encirclements based on CD56− and CD56+ expression, respectively. Three bottom-right panels show cells colored according to: (left) disease status of the subject (CeD, Crohn, EATLII, RCDII, and controls), (middle) sampling status (annotated subset, discarded by ACCENSE, and downsampled), and (right) tissue of origin (blood and intestine).

Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for data analysis. In particular, dimensionality reduction-based techniques like t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a stepwise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.


More Information

Citation

Vincent van Unen, Thomas Höllt, Nicola Pezzotti, Na Li, Marcel Reinders, Elmar Eisemann, Anna Vilanova, et al., Visual analysis of mass cytometry data by hierarchical stochastic neighbour embedding reveals rare cell types, Nat Commun, 8, p. 1740, 2017.

BibTeX

@article{bib:van_unen:2017,
    author       = { van Unen, Vincent and Höllt, Thomas and Pezzotti, Nicola and Li, Na and Reinders, Marcel and Eisemann, Elmar and Vilanova, Anna and Koning, Frits and Lelieveldt, Boudewijn P. F. },
    title        = { Visual analysis of mass cytometry data by hierarchical stochastic neighbour embedding reveals rare cell types },
    journal      = { Nat Commun },
    volume       = { 8 },
    year         = { 2017 },
    pages        = { 1740 },
    doi          = { 10.1038/s41467-017-01689-9 },
    pubmedid     = { 29170529 },
    url          = { https://publications.graphics.tudelft.nl/papers/251 },
}