(662c) Analysis and Visualization of Multiparameter, Single-Cell Data Using Self-Organizing Maps | AIChE

Authors 

Kamei, K. I., Kyoto University
Sun, J., University of California, Los Angeles
Masterman-Smith, M., University of California, Los Angeles
Jiao, J., University of California, Los Angeles
Ohashi, M., University of California, Los Angeles
Tseng, H. R., University of California, Los Angeles
Graeber, T. G., University of California, Los Angeles

Introduction:  Technologies capable of multiple, quantitative, inexpensive, single-cell biomarker measurements (eg, microfluidics) are transforming biology by enabling systems analysis of microscopic samples at the single-cell level.  Because the data sets generated by these technologies are large and highly complex, their meaningful interpretation requires novel bioinformatic tools.  Here, we adapt self-organizing maps (SOMs), an unsupervised learning method that has found wide application in the analysis of complex biological data sets (Tamayo et al, 1999), to analyze multiparameter, single-cell data sets including 1) proteomic measurements of signal transduction in clinical brain tumor specimens and 2) cytological features of human pluripotent stem cells.

Materials and Methods:  To generate multiparameter, single-cell data sets, we utilized Microfluidic Image Cytometry (MIC), which combines the advantages of microfluidics and microscope-based cytometry to quantify multiple biomarkers at the single-cell level using immunocytochemistry (Sun et al, 2010).  For clinical brain tumor specimens, we simultaneously quantified four signaling proteins (EGFR, PTEN, phospho-Akt and phospho-S6).  For human embryonic stem (ES) and induced pluripotent stem (iPS) cells, we quantified either protein biomarkers of pluripotency (OCT4, SSEA1) or 39 cytological features including nuclear morphology (DAPI) and cell cycle progression (incorporation of 5-ethynyl-2'-deoxyuridine, a marker of S-phase).  All data sets included ~1,000 cells per sample.

SOMs were created using the kohonen R package (Wehrens & Buydens, 2007).  A SOM grid consists of a number of units, each characterized by a multiparameter vector (eg, EGFR, PTEN, pAkt and pS6 levels).  The vectors characterizing the SOM grid are trained to represent the global measurement space.  Single-cell data are then mapped to the SOM grid based on similarity to the SOM units, and each sample is plotted as the frequency of cells in that sample that map to each SOM unit.  Finally, the per-sample SOM grid frequencies are subjected to unsupervised hierarchical clustering.
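The training and mapping steps described above can be illustrated with a minimal sketch. This is not the kohonen package's implementation: the grid size, the Gaussian neighbourhood, the linear decay schedules, and the synthetic four-marker data are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def best_unit(units, x):
    """Index of the SOM unit closest to x (squared Euclidean distance)."""
    return min(range(len(units)),
               key=lambda u: sum((a - b) ** 2 for a, b in zip(units[u], x)))


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a small rectangular SOM and return its unit vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every unit from a randomly chosen observation.
    units = [list(rng.choice(data)) for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        x = rng.choice(data)
        br, bc = coords[best_unit(units, x)]
        for u, (r, c) in enumerate(coords):
            d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood weight
            for k in range(dim):
                units[u][k] += lr * h * (x[k] - units[u][k])
    return units


def frequency_profile(units, cells):
    """Fraction of one sample's cells that map to each SOM unit."""
    counts = [0] * len(units)
    for x in cells:
        counts[best_unit(units, x)] += 1
    return [c / len(cells) for c in counts]


# Synthetic stand-in for four-marker single-cell data (assumed, not real measurements).
_rng = random.Random(1)


def synthetic_sample(n_low, n_high):
    low = [[_rng.gauss(0.0, 0.1) for _ in range(4)] for _ in range(n_low)]
    high = [[_rng.gauss(1.0, 0.1) for _ in range(4)] for _ in range(n_high)]
    return low + high


sample_a = synthetic_sample(80, 20)
sample_b = synthetic_sample(20, 80)

# Train on the pooled cells so the grid represents the global measurement space,
# then summarise each sample as a frequency profile over the grid.
units = train_som(sample_a + sample_b, rows=3, cols=3)
profile_a = frequency_profile(units, sample_a)
profile_b = frequency_profile(units, sample_b)
```

Training on the pooled cells and then profiling each sample separately mirrors the order of steps in the text: one shared grid, one frequency vector per sample.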

Results and Discussion:  To demonstrate the clinical application of SOMs in oncology, we used MIC to analyze the oncogenic PI3K/Akt/mTOR signaling pathway in a panel of 19 human brain tumor biopsies.  SOM analysis of the MIC data revealed a diversity of signaling patterns that would have been masked by population-average measurements, including evidence of inter- and intra-tumoral heterogeneity.  SOMs also stratified patients into molecularly defined subsets that were predictive of tumor progression and patient survival.
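The stratification of samples rests on hierarchical clustering of per-sample SOM frequency profiles. The sketch below shows one way such clustering could work. The abstract does not state the distance metric or linkage method, so Euclidean distance and average linkage are assumptions here, and the four profiles are invented for illustration rather than taken from the study.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    (mean pairwise distance) until n_clusters remain.
    Returns clusters as sorted lists of sample indices."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(c1, c2):
        total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical frequency profiles over a four-unit SOM grid, one row per sample.
profiles = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.1, 0.1, 0.3, 0.5],
    [0.0, 0.1, 0.2, 0.7],
]

groups = average_linkage(profiles, n_clusters=2)
print(groups)  # [[0, 1], [2, 3]]
```

Samples whose cells occupy similar regions of the SOM grid end up in the same group, which is the sense in which the frequency profiles define sample subsets.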

To test the application of SOMs in stem cell biology, we measured either protein biomarkers of pluripotency (eg, OCT4 expression) or 39 cytological features (eg, nuclear morphology, cell cycle progression) in human ES and iPS cells.  SOMs identified dynamic phenotypic changes occurring during differentiation.  Additionally, in a panel of ES and iPS cells, SOMs revealed biomarker patterns that were predictive of the differentiated and pluripotent phenotypes.

Conclusions:  The successful utilization of single-cell, quantitative diagnostic technologies including microfluidics will require bioinformatic tools to facilitate meaningful analysis of multiparameter, single-cell data.  Here, we have demonstrated that SOMs are a robust, enabling tool for systems analysis of multiparameter, single-cell data.

References:

Sun et al, Cancer Res (2010); 70: 6128-38.

Tamayo et al, Proc Natl Acad Sci U S A (1999); 96: 2907-12.

Wehrens and Buydens, J Stat Softw (2007); 21: 1-19.
