Exploring the uncertainty in genomic sequence-to-activity models and how AI sometimes gets DNA predictions wrong
Imagine being able to read DNA like a blueprint, predicting exactly how each gene will behave—and what happens when that blueprint is misinterpreted. This is the challenge scientists face with genomic sequence-to-activity models, sophisticated artificial intelligence systems designed to predict how DNA sequences control gene activity. These models represent a revolutionary tool for understanding the genetic syntax that dictates everything from our eye color to our susceptibility to diseases.
Researchers have discovered that AI models display remarkable overconfidence when analyzing standard reference genomes, yet become surprisingly uncertain when predicting the effects of genetic variations between individuals [1, 3].
The two headline findings:
- Models show high confidence on reference sequences, even when their predictions are incorrect.
- Models become uncertain when predicting the effects of genetic variation between individuals.
Genomic sequence-to-activity models are deep learning systems—complex algorithms inspired by the human brain—that take DNA sequences as input and predict various molecular outputs (a toy sketch follows the list below):
- Gene expression: how actively a gene is being used
- Transcription factor binding: where the proteins that control genes attach to DNA
- Chromatin accessibility: how open or closed different DNA regions are
- Histone modifications: chemical tags that influence gene activity
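To make the input-to-output mapping concrete, here is a toy Python sketch; the `one_hot_encode` and `toy_model` functions are illustrative stand-ins, not the actual Basenji2 code. DNA is encoded as a four-channel matrix (one channel per base), and the model maps it to one predicted activity value per output track.

```python
import numpy as np

# Map each base to a one-hot channel: A, C, G, T
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

def one_hot_encode(sequence: str) -> np.ndarray:
    """Convert a DNA string into a (length, 4) one-hot matrix."""
    encoded = np.zeros((len(sequence), 4), dtype=np.float32)
    for i, base in enumerate(sequence.upper()):
        if base in BASE_INDEX:  # ambiguous bases (e.g. N) stay all-zero
            encoded[i, BASE_INDEX[base]] = 1.0
    return encoded

def toy_model(onehot: np.ndarray, n_tracks: int = 4) -> np.ndarray:
    """Stand-in for a sequence-to-activity model: returns one predicted
    value per output track (expression, TF binding, accessibility,
    histone marks). A real model such as Basenji2 predicts thousands of
    tracks across binned genomic positions."""
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(onehot.shape[1], n_tracks))
    return onehot.mean(axis=0) @ weights  # purely illustrative arithmetic

sequence = "ACGTAGCTAGGCTA"
predictions = toy_model(one_hot_encode(sequence))
print(predictions.shape)  # (4,) -> one value per molecular output
```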
Despite their sophistication, these models show a troubling pattern: they make excellent predictions for standard reference genomes but struggle with the genetic variations that make each of us unique [1, 3].
As Ayesha Bajwa and colleagues noted in their 2024 study, "Models tend to make high-confidence predictions on reference sequences, even when incorrect, and low-confidence predictions on sequences with variants" [3].
To investigate this uncertainty problem, the team conducted a clever experiment using an ensemble of Basenji2 models—a representative state-of-the-art architecture for genomic prediction [1].
Rather than relying on a single model, they trained five replicates of the Basenji2 model, each with the same architecture and training data but differing only in their random initial parameters and the random sampling of training examples [1].
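A minimal sketch of this replicate-training setup, assuming hypothetical `build_model`, `train`, and `train_replicates` helpers rather than the actual Basenji2 training code: the architecture and data are shared, and only the random seed, which controls weight initialization and example shuffling, differs between replicates.

```python
import numpy as np

def build_model(seed: int, n_params: int = 1000) -> dict:
    """Hypothetical stand-in for the architecture: identical structure
    across replicates; only the random weight initialization changes."""
    rng = np.random.default_rng(seed)
    return {"weights": rng.normal(size=n_params), "seed": seed}

def train(model: dict, dataset: list, seed: int) -> dict:
    """Hypothetical training loop: the seed also controls the random
    order in which training examples are sampled."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(dataset))
    _ = shuffled  # gradient updates over `shuffled` would happen here
    return model

def train_replicates(dataset: list, n_replicates: int = 5) -> list:
    """Train replicates that share architecture and data but differ only in seed."""
    return [train(build_model(seed=s), dataset, seed=s) for s in range(n_replicates)]

ensemble = train_replicates(dataset=list(range(100)), n_replicates=5)
print(len(ensemble), "replicates trained")
```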
The results revealed striking patterns in when these models agree—and when they don't.
- On reference sequences, the replicates were highly consistent, with median correlation scores exceeding 0.9, and their predictions matched the experimental data.
- For eQTLs (expression quantitative trait loci, genetic variants associated with differences in gene expression) and for personal genomes, the replicates frequently made inconsistent predictions.

A sketch of how this replicate consistency can be quantified follows the two tables below.
| Sequence Type | Model Consistency |
|---|---|
| Reference Genome | High (median correlation >0.9) |
| eQTLs | Low (>50% inconsistent predictions) |
| Personal Genomes | Low (>50% inconsistent predictions) |
| Assay Type | Median Replicate Consistency |
|---|---|
| CAGE | Slightly lower than other assays |
| DNase/ATAC-seq | High |
| TF ChIP-seq | High |
| Histone ChIP-seq | High |
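Numbers like those in the tables can be obtained by comparing replicate predictions directly. The sketch below is a simplified illustration, not the study's analysis code: it computes pairwise Pearson correlations between replicate predictions on reference sequences, and checks whether all replicates agree on the direction of each variant's effect (prediction for the alternate allele minus prediction for the reference allele).

```python
from itertools import combinations
import numpy as np

def pairwise_correlations(replicate_preds: np.ndarray) -> list:
    """replicate_preds: (n_replicates, n_bins) predictions on reference
    sequences. Returns the Pearson r for every pair of replicates."""
    return [np.corrcoef(replicate_preds[i], replicate_preds[j])[0, 1]
            for i, j in combinations(range(len(replicate_preds)), 2)]

def variant_effect_consistency(ref_preds: np.ndarray, alt_preds: np.ndarray) -> np.ndarray:
    """ref_preds / alt_preds: (n_replicates, n_variants) predictions for the
    reference and alternate allele of each variant. A variant counts as
    'consistent' when every replicate predicts its effect in the same direction."""
    effects = alt_preds - ref_preds  # per-replicate effect size
    return np.all(np.sign(effects) == np.sign(effects[0]), axis=0)

# Toy usage with random numbers standing in for model outputs
rng = np.random.default_rng(0)
ref_track = rng.normal(size=(5, 1000))  # 5 replicates x 1000 genomic bins
print(np.median(pairwise_correlations(ref_track)))

ref_allele = rng.normal(size=(5, 200))  # 5 replicates x 200 variants
alt_allele = ref_allele + rng.normal(scale=0.5, size=ref_allele.shape)
print(variant_effect_consistency(ref_allele, alt_allele).mean())  # fraction consistent
```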
Essential Resources for Genomic Prediction Research
- Basenji2 architecture: deep learning architecture for genomic prediction
- Functional genomics datasets: large collections of experimental profiles (such as CAGE, DNase/ATAC-seq, and ChIP-seq) used to train and evaluate the models
- Model replicates: multiple copies of the model trained with different random seeds
- Ensemble analysis: statistical approach for combining replicate predictions and gauging their agreement
The discovery of systematic uncertainty patterns in genomic prediction models has far-reaching implications for both basic research and medical applications.
The inconsistency in predicting variant effects suggests we should be cautious when interpreting individual results from these models [1, 3].
Rather than taking any single prediction at face value, researchers can use ensemble approaches to gauge confidence levels, similar to how the scientific team did in this study [1, 3].
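As a rough illustration of that idea (not the published pipeline), one could score each variant by how much the ensemble's effect predictions spread around their mean and treat large relative spread as a warning flag; the `flag_low_confidence` helper and its 0.5 cutoff below are arbitrary assumptions.

```python
import numpy as np

def flag_low_confidence(effects: np.ndarray, rel_spread_cutoff: float = 0.5) -> np.ndarray:
    """effects: (n_replicates, n_variants) predicted variant effects.
    Flags variants where the ensemble disagrees: the standard deviation
    across replicates is large relative to the mean effect size."""
    mean_effect = effects.mean(axis=0)
    spread = effects.std(axis=0)
    relative_spread = spread / (np.abs(mean_effect) + 1e-8)  # avoid divide-by-zero
    return relative_spread > rel_spread_cutoff  # True = low confidence

# Example: downstream analysis keeps only high-confidence calls
rng = np.random.default_rng(1)
effects = rng.normal(size=(5, 10))
low_conf = flag_low_confidence(effects)
print(f"{low_conf.sum()} of {low_conf.size} variants flagged as low confidence")
```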
This uncertainty characterization also points toward concrete strategies for improvement. Training models on more diverse genetic sequences—including data from multiple species—has already shown promise in boosting performance [6].
The journey to perfectly decode genomic regulation continues, but acknowledging the confidence gap represents crucial progress.
By understanding where these powerful models fail—and where they succeed—we can use them more wisely.
The path forward isn't about eliminating uncertainty but about mapping its contours—knowing when to trust our genomic AI guides, and when to proceed with caution.