GGE: A Standardized Framework for Evaluating Gene Expression Generative Models


Paper: Accepted at the Gen2 Workshop at ICLR 2026

Comprehensive, standardized evaluation of generated gene expression data.

Overview

GGE (Generated Genetic Expression Evaluator) addresses the pressing need for standardized evaluation of single-cell gene expression generative models. Current practices suffer from:

  • Inconsistent metric implementations
  • Incomparable hyperparameter choices
  • Lack of biologically-grounded metrics

GGE provides:

  • Comprehensive suite of distributional metrics with explicit computation space options (raw, PCA, DEG)
  • Biologically-motivated evaluation through DEG-focused analysis with perturbation-effect correlation
  • Standardized reporting for reproducible benchmarking
  • GPU (CUDA) and Apple MPS acceleration for efficient computation

Key Features

  • Per-Metric Space Configuration: Compute each metric in raw gene space, PCA space, or DEG space
  • Mixed-Space Evaluation: Use evaluate_lazy() with different spaces per metric
  • Perturbation-Effect Correlation: implements the paper's Equation 1, ρ_effect = corr(μ_real - μ_ctrl, μ_gen - μ_ctrl)
  • Multiple Metrics: Pearson, Spearman, R², MSE, Wasserstein, MMD, Energy distance
  • Per-gene Analysis: All metrics computed per-gene with aggregation options
  • Condition Matching: Match samples by perturbation, cell type, or other metadata
  • Train/Test Splits: Evaluate on held-out data
  • Visualizations: Boxplots, violin plots, radar charts, scatter plots, embeddings, interactive Plotly
  • CLI & API: Use from command line or Python
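
The perturbation-effect correlation (Equation 1 above) can be sketched in plain NumPy. This is a minimal illustration of the formula, not the GGE implementation: per-gene mean effects relative to control are computed for real and generated data, then correlated.

```python
import numpy as np

def perturbation_effect_correlation(real, gen, ctrl):
    """Pearson correlation between real and generated per-gene perturbation
    effects: corr(mu_real - mu_ctrl, mu_gen - mu_ctrl)."""
    effect_real = real.mean(axis=0) - ctrl.mean(axis=0)
    effect_gen = gen.mean(axis=0) - ctrl.mean(axis=0)
    return float(np.corrcoef(effect_real, effect_gen)[0, 1])

# Synthetic check: real and generated cells share the same true per-gene effect,
# so the correlation should be close to 1.
rng = np.random.default_rng(0)
gene_effect = rng.normal(0.0, 1.0, size=50)          # true per-gene effect
ctrl = rng.normal(0.0, 1.0, size=(200, 50))          # control cells
real = rng.normal(gene_effect, 1.0, size=(200, 50))  # perturbed cells
gen = rng.normal(gene_effect, 1.0, size=(200, 50))   # generated cells
rho = perturbation_effect_correlation(real, gen, ctrl)
```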

Quick Installation

pip install gge-eval

Quick Example

from gge import evaluate

results = evaluate(
    real_path="real_data.h5ad",            # reference single-cell data (AnnData)
    generated_path="generated_data.h5ad",  # model-generated data (AnnData)
    condition_columns=["perturbation"],    # metadata columns used to match conditions
    output_dir="output/"                   # where reports and plots are written
)

print(results.summary())
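
To make the per-gene analysis concrete, here is a standalone NumPy sketch (not the GGE API) of one distributional metric computed per gene and then aggregated. For equal-sized samples, the 1-D Wasserstein distance per gene reduces to the mean absolute difference of sorted values.

```python
import numpy as np

def per_gene_wasserstein(real, gen):
    """1-D Wasserstein distance for each gene, assuming equal sample sizes:
    mean absolute difference between sorted per-gene values."""
    return np.abs(np.sort(real, axis=0) - np.sort(gen, axis=0)).mean(axis=0)

rng = np.random.default_rng(1)
real = rng.normal(0.0, 1.0, size=(300, 20))
gen = rng.normal(0.5, 1.0, size=(300, 20))  # generator with a 0.5 mean shift
d = per_gene_wasserstein(real, gen)         # one distance per gene -> shape (20,)
mean_dist = float(d.mean())                 # aggregate: close to the true shift
```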

Mixed-Space Evaluation (Paper API)

For maximum flexibility, configure each metric with its own computation space:

from gge import evaluate_lazy
from gge.metrics import PearsonCorrelation, Wasserstein2Distance, MMDDistance

metrics = [
    # DEG space: genes selected by log-fold-change and p-value thresholds
    PearsonCorrelation(space="deg", deg_lfc=0.25, deg_pval=0.1),
    # PCA space: metrics computed on a 50-component projection
    Wasserstein2Distance(space="pca", n_components=50),
    MMDDistance(space="pca", n_components=50),
]

results = evaluate_lazy(
    real_path="real.h5ad",
    generated_path="generated.h5ad",
    condition_columns="perturbation",
    control_key="ctrl",
    metrics=metrics,
)
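
Each computation space determines the representation a metric sees: raw gene values, a shared PCA projection, or the subset of differentially expressed genes. As a rough sketch of the PCA case (an illustrative stand-in, not GGE internals), principal axes can be fitted on the real data and both datasets projected into the same space before a metric is applied:

```python
import numpy as np

def pca_project(real, gen, n_components=10):
    """Fit PCA on the real data via SVD; project both datasets into the
    same low-dimensional space before computing distributional metrics."""
    mu = real.mean(axis=0)
    _, _, vt = np.linalg.svd(real - mu, full_matrices=False)
    comps = vt[:n_components]           # top principal axes (components x genes)
    return (real - mu) @ comps.T, (gen - mu) @ comps.T

rng = np.random.default_rng(2)
real = rng.normal(size=(100, 200))      # 100 cells x 200 genes
gen = rng.normal(size=(120, 200))
r_p, g_p = pca_project(real, gen, n_components=10)
```

Fitting the projection on the real data only keeps the evaluation space fixed across models, so metric values remain comparable.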

Citation

If you use GGE in your research, please cite our paper:

@inproceedings{rubbi2026gge,
  title = {A Standardized Framework for Evaluating Gene Expression Generative Models},
  author = {Rubbi, Andrea and [CO-AUTHORS]},
  booktitle = {Gen2 Workshop at the International Conference on Learning Representations (ICLR)},
  year = {2026},
  note = {[PROCEEDINGS DETAILS TO BE ADDED]},
  url = {https://github.com/AndreaRubbi/GGE}
}

License

This project is licensed under the MIT License.

Acknowledgments

We would like to thank the contributors and the community for their support in developing this project.