Modern single-cell models secretly know a lot about how cells respond to genes, drugs, and disease — even when they were never trained to predict these things.
This post (and our paper) explains how simple decoder gradients reveal this hidden knowledge.
Introduction
Single-cell RNA-seq gives us snapshots of thousands or millions of individual cells. Many labs now use generative models (often variational autoencoders like scVI) to compress these snapshots into a low-dimensional latent space.
Such models are normally used for denoising or batch correction, but they also learn something deeper: how gene expression changes when a cell is perturbed. Surprisingly, they learn this without ever seeing perturbation labels.
Our study asks a simple question:
Can we extract perturbation effects directly from a pretrained decoder?
The answer is yes — by reading off gradients.
The core idea: follow the gradient
A decoder maps a latent point z to gene expression predictions. If we take the gradient of a gene’s expression with respect to z, we obtain a direction of change:
- positive direction → increases expression
- negative direction → decreases expression
This gradient acts like a tiny perturbation simulator. Concretely, we can simulate a perturbation step by following the gradient:
$$z' = z + \eta \, \nabla_z f_g(z),$$

where $\eta$ is the step size and $\nabla_z f_g(z)$ is the gradient of the decoded expression $f_g(z)$ of gene $g$.
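As a minimal PyTorch sketch of this update (assuming `decoder` is any differentiable decoder mapping latents to per-gene expression and `gene_idx` is a gene of interest; both names are illustrative):

```python
import torch

def perturb_step(decoder, z, gene_idx, eta=0.1):
    """One gradient step in latent space that increases the decoded
    expression of a single gene: z' = z + eta * grad."""
    z = z.detach().requires_grad_(True)
    expr = decoder(z)[:, gene_idx]                 # f_g(z) for every cell
    grad = torch.autograd.grad(expr.sum(), z)[0]   # d f_g / d z, per cell
    return z + eta * grad
```

Iterating this step traces a path through latent space, and decoding points along that path simulates the perturbation's downstream effects.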
Think of the decoder outputs as “gene expression landscapes.” Gradients tell us which way the landscapes slope. Following a slope step-by-step traces how a cell would move if we:
- turn a gene up (overexpression),
- turn it down (knockdown),
- increase probability of injury,
- or move forward in developmental time.
This process requires no modification to the model architecture and works with any differentiable decoder. When the perturbed cell is decoded, other genes change along with the targeted one, which is more realistic than simply zeroing out a gene.
Gradient ascent on the expression landscape
Probing a large pretrained CELL × GENE scVI model
We probed a frozen scVI decoder trained on the CELL × GENE Discover Census (millions of cells). Focusing on pancreatic islet cells, we analyzed type 2 diabetes mellitus (T2D) without any fine-tuning.
The model correctly identifies that increasing Ins1 (insulin) expression drives the latent representation from the diabetic region toward the normal region.
Ins1 gradients overlaid on PCA of islet latents; gradients show the direction of increasing expression
Beta cells: Increasing Ins1 aligns with T2D → healthy
Alpha cells: Increasing Gcg aligns with healthy → T2D
In β-cells, increasing Ins1 thus moves latent representations from the diabetic region toward the normal region, while increasing Gcg in α-cells has the opposite effect, consistent with glucagon's role in raising blood glucose.
Other relevant endocrine and metabolic genes (Pcsk1, Pcsk2, Acot7, Fabp5, Mdh1, Aldoa) show flows matching known T2D biology. But how do we score which genes are most relevant for a disease?
Ranking genes by their alignment with a disease axis
To quantify gene relevance for a disease, we sampled gradients for all genes in the model. We define a latent healthy → disease axis from group means,

$$v = \bar{z}_{\text{disease}} - \bar{z}_{\text{healthy}},$$

which represents the average displacement between the two conditions. Each gene receives a score based on the average cosine similarity between its gradient field and this axis:

$$S_g = \mathbb{E}_z\!\left[\cos\big(\nabla_z f_g(z),\, v\big)\right].$$

Here, $S_g$ quantifies a directional agreement: larger values indicate that gradients align with the healthy → disease transition. As we have a large degree of freedom in sampling $z$, one may compute the score for, e.g., purely healthy or purely disease samples. The range $S_g \in [-1, 1]$ covers gradients pointing in the reverse direction ($S_g = -1$), orthogonally ($S_g = 0$), or the same direction ($S_g = 1$).
As a baseline, we decode the negative binomial means at the median latent of each condition to obtain per-gene $\mu_g^{\text{healthy}}$ and $\mu_g^{\text{disease}}$, and score genes by the symmetric change $\Delta_g = |\mu_g^{\text{disease}} - \mu_g^{\text{healthy}}| \,/\, (\mu_g^{\text{disease}} + \mu_g^{\text{healthy}})$.
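A sketch of both scores, assuming `decoder` maps latent vectors to per-gene expression means and the `z_*` tensors hold encoded cells (all names illustrative; the per-gene loop is written for clarity, not speed):

```python
import torch
import torch.nn.functional as F

def gradient_scores(decoder, z_healthy, z_disease, z_samples):
    """Score each gene by the mean cosine similarity between its gradient
    field and the latent healthy -> disease axis."""
    axis = z_disease.mean(dim=0) - z_healthy.mean(dim=0)   # v from group means
    z = z_samples.detach().requires_grad_(True)
    mu = decoder(z)                                        # (n_cells, n_genes)
    scores = []
    for g in range(mu.shape[1]):
        grad = torch.autograd.grad(mu[:, g].sum(), z, retain_graph=True)[0]
        cos = F.cosine_similarity(grad, axis.expand_as(grad), dim=1)
        scores.append(cos.mean())
    return torch.stack(scores)                             # S_g for every gene

def baseline_scores(decoder, z_healthy, z_disease):
    """Baseline: decode means at each condition's median latent and take a
    symmetric change per gene (one common form, as an assumption)."""
    with torch.no_grad():
        mu_h = decoder(z_healthy.median(dim=0).values.unsqueeze(0))[0]
        mu_d = decoder(z_disease.median(dim=0).values.unsqueeze(0))[0]
    return (mu_d - mu_h).abs() / (mu_d + mu_h)
```

Genes are then ranked by $|S_g|$ (or by $\Delta_g$ for the baseline).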
Pathway analysis with an LLM-in-the-loop
For gene-set enrichment, we select the top 200 genes with the largest absolute scores. These gene sets are analyzed with WebGestalt overrepresentation analysis using pathways from WikiPathways, yielding a list of biological systems that are overrepresented in the gene set. We mechanistically interpret these pathways with an LLM agent. Specifically, we run the following prompt three times for each pathway, using GPT-5 with reasoning and web-search enabled:
Prompt 1. You have an expert perspective in bioinformatics. Is [PATHWAY] highly relevant for type 2 diabetes mellitus in Mus musculus? Answer with Yes or No. Afterwards, describe shortly your explanation for whether the pathway involves type 2 diabetes, providing references for your claims.
We find that pathway interpretations from this stage are already highly accurate. To combat non-determinism, we run the same prompt three times and feed the answers into Prompt 2:
Prompt 2. You have an expert perspective in bioinformatics. Your task is to very concisely judge whether a pathway is relevant for type 2 diabetes mellitus (T2D) in Mus musculus. When asked whether [PATHWAY] is highly relevant for T2D in Mus musculus, these were your answers from three distinct runs:
Answer 1: [ANSWER 1]
Answer 2: [ANSWER 2]
Answer 3: [ANSWER 3]

Now give your final critical verdict with a Yes or No, and describe very concisely your explanation (with a few sentences at most), using correct scientific references.
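A condensed sketch of this three-run consensus loop using the OpenAI Python client (the prompt templates are abbreviated, the example pathway list and output filename are assumptions, and the reasoning/web-search settings are omitted for brevity):

```python
import json
from openai import OpenAI  # official OpenAI Python client

client = OpenAI()

# Abbreviated templates; the full wording is given in Prompts 1 and 2 above.
PROMPT_1 = ("You have an expert perspective in bioinformatics. "
            "Is {p} highly relevant for type 2 diabetes mellitus in Mus musculus? ...")
PROMPT_2 = ("You have an expert perspective in bioinformatics. ... When asked whether "
            "{p} is highly relevant for T2D in Mus musculus, these were your answers "
            "from three distinct runs:\nAnswer 1: {a1}\nAnswer 2: {a2}\nAnswer 3: {a3}\n"
            "Now give your final critical verdict with a Yes or No ...")

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-5",  # model identifier as used in the post
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def judge_pathway(pathway: str) -> dict:
    answers = [ask(PROMPT_1.format(p=pathway)) for _ in range(3)]
    verdict = ask(PROMPT_2.format(p=pathway, a1=answers[0], a2=answers[1], a3=answers[2]))
    return {"pathway": pathway, "answers": answers, "verdict": verdict}

pathways = ["Glycolysis and gluconeogenesis"]  # e.g., from the WebGestalt step
results = [judge_pathway(p) for p in pathways]
with open("llm_pathway_verdicts.json", "w") as fh:
    json.dump(results, fh, indent=2)
```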
For transparency, all our LLM responses are collected in a JSON file, which is used to label pathways by their suggested relevance to Mus musculus T2D:
Overrepresentation analysis based on WikiPathways. Pathways are labeled by false discovery rate (FDR) and LLM-inferred relevance. When scoring genes, gradients are sampled either 'at healthy' or 'at disease' data.
While the baseline did not identify any pathways at FDR ≤ 0.05, the gradient-based method located multiple significant pathways. Through mechanistic analysis by LLM agents, we found that the enriched pathways are more relevant to T2D than those from the baseline.
Predicting toxin response and temporal dynamics
This framework extends beyond individual genes. By attaching lightweight auxiliary heads for specific tasks — such as classification or regression — we can compute gradients for arbitrary concepts like “injury” or “developmental time.”
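As a sketch of what such a head might look like (the head architecture, `latent_dim`, and training setup are all illustrative, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

latent_dim = 10  # illustrative; a typical scVI latent size

# Lightweight auxiliary head: maps a frozen latent to an injury probability.
injury_head = nn.Sequential(
    nn.Linear(latent_dim, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
    nn.Sigmoid(),
)
# ... train injury_head on (latent, injury-label) pairs; the decoder stays frozen ...

def concept_gradient(head, z):
    """Latent direction that increases the concept (injury, time, ...)."""
    z = z.detach().requires_grad_(True)
    return torch.autograd.grad(head(z).sum(), z)[0]
```

The same gradient-following machinery then applies unchanged: the head's output simply takes the place of a single gene's expression.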
We validated this on cardiotoxin-induced muscle injury and C. elegans embryogenesis (worm embryos during early development).
Gradients of cardiotoxin injury probability
Gradients of developmental time
In both cases, the gradient flows align with ground-truth biological transitions, moving from control to injured states, or correctly tracing the lineage time course.
Takeaway
Our results confirm that pretrained generative models implicitly encode disease axes, regulatory logic, and perturbation knowledge. Without fine-tuning on specific disease data, the model can provide a ranked list of relevant genes for a disease and perform infinitesimal perturbations.
Why does this work?
Generative models must learn how genes co-vary across millions of cells. Those co-variations encode:
- regulatory structure
- developmental structure
- stress responses
- disease axes
The decoder gradient turns this knowledge into explicit latent-space directions with a single line of automatic differentiation.
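In PyTorch, for instance, that line would look something like this (with `decoder`, `z`, and gene index `g` as in the earlier sketch):

```python
# z must require gradients; the result is the per-cell direction d f_g / d z.
direction = torch.autograd.grad(decoder(z)[:, g].sum(), z)[0]
```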
In short:
Trained single-cell models can be treated as virtual labs, simulating how cells respond to drugs, gene knockouts, or disease, without retraining or perturbation labels. Such models already encode rich, unsupervised knowledge about gene regulation and disease. We just need to read the gradients.
BibTeX
This page may be cited as:
@article{bjerregaard2025single,
title={What do single-cell models already know about perturbations?},
author={Bjerregaard, Andreas and Prada-Luengo, I{\~n}igo and Das, Vivek and Krogh, Anders},
journal={Genes},
volume={16},
number={12},
pages={1439},
year={2025},
publisher={Multidisciplinary Digital Publishing Institute}
}