Introduction
Spatial imaging of single cells and their protein markers in tumor tissues provides crucial insights into tumor-microenvironment interactions. Cellular neighborhood analysis is key to understanding these interactions, but current approaches rely on predefined cell types or neighborhood-wide marker aggregations, which sacrifice single-cell resolution.
Method
We developed Cellohood, a permutation-invariant, transformer-based autoencoder for modeling cellular neighborhoods. The model compresses information about the individual cells in a local environment and their marker expression, yielding both neighborhood-level representations and single-cell profiles. From these representations, we derived novel cellular neighborhood prototypes characterized by cell types, protein markers, and spatial arrangements.
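To illustrate what permutation invariance means for a neighborhood encoder, the sketch below pools a set of per-cell marker vectors into a single neighborhood vector using softmax attention with one query vector. It is not Cellohood's architecture: the function name `attention_pool`, the single fixed query, and the toy dimensions are assumptions made for illustration only.

```python
import math
import random


def attention_pool(cells, query):
    """Pool a set of cell marker vectors into one neighborhood vector.

    Hypothetical helper: softmax attention with a single query vector.
    The output is a weighted sum over the set, so the order in which
    the cells are listed does not affect the result.
    """
    dim = len(query)
    # Scaled dot-product score between the query and every cell.
    scores = [
        sum(q * x for q, x in zip(query, cell)) / math.sqrt(dim)
        for cell in cells
    ]
    # Numerically stable softmax weights.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    # Weighted average of the cell vectors, one marker at a time.
    return [
        sum(w * cell[j] for w, cell in zip(weights, cells)) / total
        for j in range(dim)
    ]


# Toy neighborhood: 6 cells with 4 markers each (made-up values).
rng = random.Random(0)
cells = [[rng.random() for _ in range(4)] for _ in range(6)]
query = [0.5, -0.2, 0.1, 0.3]

pooled = attention_pool(cells, query)

# Shuffling the cells changes their order but not the pooled vector
# (up to floating-point rounding).
shuffled = cells[:]
rng.shuffle(shuffled)
pooled_shuffled = attention_pool(shuffled, query)
```

In a trained model the query and the projections applied to each cell would be learned; here they are fixed so that only the order-independence of the pooling step is visible.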
The model's architecture supports multi-resolution analysis, allowing researchers to investigate both broad tissue organization and detailed cellular interactions. As a result, each patient can be described by the abundance and spatial arrangement of cellular neighborhood prototypes.
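One way to read "abundance of cellular neighborhood prototypes" is as a normalized histogram over prototype assignments. The sketch below assumes each neighborhood embedding is assigned to its nearest prototype by Euclidean distance; the abstract does not state how prototypes are derived or assigned, so `nearest_prototype`, `prototype_abundance`, and the toy vectors are hypothetical, and the spatial-arrangement part of the patient description is not covered.

```python
import math
from collections import Counter


def nearest_prototype(embedding, prototypes):
    """Return the index of the closest prototype (Euclidean distance).

    Nearest-centroid assignment is an assumption made for this sketch.
    """
    return min(
        range(len(prototypes)),
        key=lambda k: math.dist(embedding, prototypes[k]),
    )


def prototype_abundance(embeddings, prototypes):
    """Fraction of a patient's neighborhoods assigned to each prototype."""
    if not embeddings:
        raise ValueError("at least one neighborhood embedding is required")
    counts = Counter(nearest_prototype(e, prototypes) for e in embeddings)
    return [counts[k] / len(embeddings) for k in range(len(prototypes))]


# Two toy prototypes and four toy neighborhood embeddings for one patient.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
patient_neighborhoods = [[0.1, 0.0], [0.9, 1.2], [0.2, -0.1], [1.1, 0.8]]

abundance = prototype_abundance(patient_neighborhoods, prototypes)
# abundance == [0.5, 0.5]: half of the neighborhoods fall under each prototype.
```

The resulting vector has one entry per prototype and sums to 1, which makes patients with different numbers of imaged neighborhoods directly comparable.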
Results
We demonstrated Cellohood's capabilities on datasets from multiple spatial imaging technologies:
- Mouse spleen lupus CODEX data,
- Dorsolateral prefrontal cortex (DLPFC) 10x Visium data,
- Breast cancer IMC data from Jackson et al.,
- NSCLC and TNBC IMC data from the Immucan consortium.
Our analysis revealed that patient representations generated by Cellohood correlate meaningfully with disease stages, types/histologies, and patient prognosis. In cancer datasets, coarse-resolution analysis uncovered distinct whole-slide tumor infiltration patterns, while high-resolution examination revealed specific neighborhood interactions significantly linked to clinical outcomes. Spatial analysis further demonstrated that the neighborhood representation successfully encodes diverse tissue architectures.
Conclusions
Using transformer-based generative modeling, Cellohood is the first model to use complete cell marker information during training without resorting to coarse, neighborhood-wide approximation. Results on multiple datasets obtained with different technologies demonstrated that Cellohood enables marker-driven discovery of cellular-microenvironment interactions and their clinical implications.