AI Model Predicts Cell Impact on Disease Outcomes

Institute of Science Tokyo

A computational method called scSurv, developed by researchers at Institute of Science Tokyo, links individual cells to patient outcomes using widely available bulk RNA sequencing data. The approach uses single-cell reference datasets together with patient survival data to infer the contributions of individual cells within complex tissues. The model identified cell populations associated with survival across several cancers, offering a way to uncover disease-driving cells and support the development of more targeted treatment strategies.

What if scientists could identify the exact cells responsible for driving a disease? In a tumor, for instance, there are thousands of individual cells, each playing a unique role in driving disease progression or resisting therapy. Identifying which cells promote disease and which help counter it could help guide future treatment strategies.

Such analysis at the single-cell level is now becoming possible with advances in single-cell sequencing technologies, which allow researchers to measure the gene expression of individual cells and gain insights into their behavior and function. By linking this cellular information to patient outcomes, researchers can begin to understand how individual cells influence disease progression.

However, datasets that combine single-cell information with clinical outcomes are still relatively limited. In contrast, large amounts of bulk ribonucleic acid (RNA) sequencing data from tissues containing a diverse range of cells are widely available.

Researchers from Institute of Science Tokyo (Science Tokyo), Japan, have developed a computational method that uses single-cell RNA sequencing data as a reference to estimate the proportions and contributions of individual cells from bulk RNA sequencing data and determine how they may influence patient outcomes. Their model, called scSurv, could help guide more personalized treatment strategies.

The study was made available online on December 22, 2025, and was published in Volume 42, Issue 1 of the journal Bioinformatics on January 13, 2026. The implementation is also available as an open-source Python package on GitHub and Zenodo.

The research group was led by Professor Teppei Shimamura and graduate student Chikara Mizukoshi from the Department of Computational and Systems Biology, Division of Biological Data Science, Medical Research Laboratory, Institute for Integrated Research, Science Tokyo, together with Dr. Yasuhiro Kojima, Head of the Laboratory of Computational Life Science, National Cancer Center Research Institute, Japan (and an affiliated researcher at the same department in Science Tokyo).

"We present the first methodology to quantify individual cells' contributions to clinical outcomes. The method identifies prognostically relevant cell populations and associated genes, with potential applications in therapeutic target discovery and biomarker identification, thereby providing a foundation for precision medicine leveraging existing bulk RNA sequencing and clinical datasets," says Prof. Shimamura.

scSurv uses single-cell RNA sequencing data as a reference to deconvolute RNA sequencing data from bulk samples and estimate the proportions of latent cell states, which are groups of cells with similar gene expression patterns, present in each sample. The contributions of these cell states are then linked to patient outcomes using an extended Cox proportional hazards model that considers patient survival data. This survival analysis method estimates how strongly each cell state contributes to clinical risk. The model then maps these risk contributions back to individual cells belonging to those states to infer their influence on patient outcomes.
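The three steps described above (deconvolution, survival modeling, and mapping risk back to cells) can be illustrated with a small toy sketch. This is not the scSurv implementation or its API: the function names, the least-squares deconvolution, the hard assignment of each cell to a single state, and all numbers below are assumptions made purely for illustration, and the standard Cox partial likelihood stands in for the extended model used by the authors.

```python
import numpy as np

def deconvolve(bulk, signatures):
    """Toy deconvolution (assumption: plain least squares, not scSurv's method).

    bulk: genes x samples, signatures: genes x states.
    Returns samples x states proportions that sum to 1 per sample.
    """
    coef, *_ = np.linalg.lstsq(signatures, bulk, rcond=None)  # states x samples
    coef = np.clip(coef, 0.0, None)  # proportions cannot be negative
    return (coef / coef.sum(axis=0, keepdims=True)).T

def cox_neg_partial_loglik(beta, proportions, time, event):
    """Negative Cox partial log-likelihood, assuming no tied event times."""
    risk = proportions @ beta              # log-hazard score per patient
    order = np.argsort(-time)              # longest follow-up first
    risk_sorted = risk[order]
    event_sorted = event[order]
    # Running log-sum-exp gives the log of each patient's risk-set total.
    log_risk_set = np.logaddexp.accumulate(risk_sorted)
    return -np.sum((risk_sorted - log_risk_set) * event_sorted)

# Hypothetical reference: 3 genes x 2 cell states.
signatures = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
# Hypothetical ground-truth proportions for 4 patients.
true_props = np.array([[0.9, 0.1], [0.7, 0.3], [0.3, 0.7], [0.1, 0.9]])
bulk = signatures @ true_props.T           # simulated bulk expression

props = deconvolve(bulk, signatures)

time = np.array([1.0, 2.0, 3.0, 4.0])      # survival times
event = np.array([1.0, 1.0, 1.0, 1.0])     # 1 = death observed, 0 = censored
beta = np.array([1.0, -1.0])               # assumed per-state risk coefficients
nll = cox_neg_partial_loglik(beta, props, time, event)

# Map state-level risk back to cells (assumption: one state label per cell).
cell_state = np.array([0, 0, 1])
cell_risk = beta[cell_state]
```

In this toy setup, patients richer in state 0 die earlier, so coefficients that give state 0 a positive weight fit the survival data better (a lower negative log-likelihood) than the sign-flipped coefficients. In practice the coefficients would be estimated by minimizing this loss rather than fixed by hand.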

Once trained, the model was able to estimate the contributions of more than 10,000 individual cells to disease risk and prognosis. It can also identify genes associated with disease progression and map different regions within tissues according to their potential clinical risk.

Using data from The Cancer Genome Atlas, the model successfully predicted patient survival across multiple cancers, including for patients whose data were not used during training. The method also identified individual cells linked to patient outcomes in melanoma and detected macrophages, a type of immune cell, that are known to be associated with different survival outcomes. In renal cell carcinoma, a type of kidney cancer, the researchers mapped risk across tumor tissue, revealing regions associated with higher or lower risk. They further tested the approach on infectious disease datasets, highlighting its versatility for studying diseases beyond cancer.
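Survival predictions on held-out patients are commonly scored with Harrell's concordance index, which measures how often the patient with the higher predicted risk is the one who dies first. The release does not say which metric the authors used, so the sketch below is only an illustration of that general idea, with a hypothetical function name and made-up data.

```python
def concordance_index(time, event, risk):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event
    and did so strictly before patient j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0   # earlier death had higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5   # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Made-up held-out patients: the third one is censored.
time = [1.0, 2.0, 3.0, 4.0]
event = [1, 1, 0, 1]
good_risk = [0.9, 0.7, 0.2, 0.1]        # higher risk for earlier deaths
bad_risk = [0.1, 0.2, 0.7, 0.9]         # exactly the wrong ordering

c_good = concordance_index(time, event, good_risk)
c_bad = concordance_index(time, event, bad_risk)
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means the ordering is fully reversed.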

"These findings suggest that scSurv may contribute to more advanced clinical outcome analysis and to the discovery of therapeutic targets," says Prof. Shimamura.

By examining the contributions of individual cells to disease, researchers can gain a better understanding of disease mechanisms at the cellular level, ultimately supporting the development of more precise diagnostic tools and personalized treatments.
