A Stationary (and Therefore Compatible) Representation is All You Need

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026

Niccolò Biondi ·  Federico Pernici ·  Simone Ricci ·  Alberto Del Bimbo
When a model is updated, features learned with a d-Simplex fixed classifier stay stationary — new query features remain directly comparable to old gallery features, satisfying both compatibility inequalities in expectation.
TL;DR
1 sentence: Training with a d-Simplex fixed classifier makes features provably backward-compatible, so an updated model's queries can be matched against any old gallery without re-indexing.
~100 words: We establish the first rigorous proof that stationarity implies backward compatibility: features learned with a d-Simplex fixed classifier satisfy both compatibility inequalities (Definition 1) without approximation. For sequential fine-tuning, plain cross-entropy aligns only first-order statistics, so we introduce the HOC loss, a convex combination of cross-entropy and a contrastive objective that captures higher-order representation dependencies while being provably equivalent to training under the compatibility constraints. Experiments on the CL²R benchmark and a new IAM-CL²R benchmark, where pre-trained models are periodically replaced by stronger ones, show state-of-the-art compatibility and accuracy across all datasets and task lengths.
Abstract

Learning compatible representations aims to learn feature representations that can be used interchangeably over time whenever a model undergoes updates. In this paper, we demonstrate that stationary representations learned by d-Simplex fixed classifiers imply compatibility as in its formal definition. This result establishes a foundation for future works and can be directly exploited in practical learning scenarios. We address the challenge of learning compatibility using d-Simplex fixed classifiers when the model is sequentially fine-tuned. Learning according to a d-Simplex fixed classifier with the cross-entropy loss aligns feature distributions at the first-order statistics. Consequently, it may not fully capture higher-order dependencies in the representation between model updates. To address this issue, we demonstrate that training the model using a d-Simplex fixed classifier through a convex combination of the cross-entropy loss and a contrastive loss not only captures higher-order dependencies, but is also equivalent to learning with the cross-entropy under the compatibility constraints. We confirm our findings with extensive experiments also considering a new scenario where a pre-trained model is sequentially fine-tuned and occasionally replaced with an improved model. We show that stationary representations enable uninterrupted retrieval services (without reprocessing gallery images) while improving performance during model updates and replacements, achieving state-of-the-art.
Key Contributions
1. Stationarity → Compatibility (Theorem 1)

First proof — without approximation — that d-Simplex fixed classifiers satisfy both compatibility inequalities in expectation. Prior work only verified the same-class case; we close the gap using cosine distance in hyperspherical space.

2. Higher-Order Compatibility (HOC) Loss

HOC = λ·ℒSCE + (1−λ)·ℒiNCE captures higher-order representation dependencies between updates and is provably equivalent to optimising cross-entropy under the compatibility constraints (Proposition 1).

3. New IAM-CL²R Benchmark

A realistic scenario where a fine-tuned model is periodically replaced by a stronger one — even a different architecture. The d-Simplex classifier matrix acts as a common interface, enabling seamless replacement without re-indexing.

4. State-of-the-Art Results

Outperforms 7 baselines on CIFAR100, TinyImageNet, CUB, and CelebA — the only method that maintains high compatibility through architecture changes across 31-task sequences and model replacements.

Method
d-Simplex Fixed Classifier

Class prototypes w1, …, wK are fixed at the vertices of the regular d-Simplex polytope — maximally separated, equiangular prototypes on the unit hypersphere. Because the classifier is frozen, features remain stationary across model updates: each class's hyperspherical cap shares the same central axis before and after fine-tuning, only shrinking as the model improves (Theorem 1).
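
The geometry of the fixed prototypes can be illustrated with a small dependency-free sketch. It is not the authors' code: the function names `simplex_prototypes` and `cosine` are hypothetical, and the vertices are embedded in K dimensions (centering the standard basis) rather than in the K−1 dimensions a d-Simplex strictly needs. This is an assumption made only to keep the construction short.

```python
import math

def simplex_prototypes(num_classes):
    """Return num_classes unit vectors at the vertices of a regular simplex.

    Assumption: vertices live in R^K (K = num_classes), obtained by centering
    the standard basis; they span a (K-1)-dimensional subspace.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    k = num_classes
    centroid = 1.0 / k                 # every coordinate of the basis centroid
    norm = math.sqrt((k - 1) / k)      # length of (e_i - centroid)
    return [
        [((1.0 if i == j else 0.0) - centroid) / norm for j in range(k)]
        for i in range(k)
    ]

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# The prototypes are computed once and never trained.
W = simplex_prototypes(5)
```

Every pair of distinct prototypes built this way has cosine similarity −1/(K−1), the most negative value that K unit vectors can share pairwise, which is what "maximally separated, equiangular" refers to. Because `W` is never updated, each class keeps the same target direction across model updates.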

HOC Loss for Sequential Fine-Tuning

When fine-tuning sequentially, cross-entropy with the d-Simplex classifier aligns features only at their first-order statistics: the class mean moves toward the class prototype, but higher-order structure is ignored. This limits back-propagation and reduces compatibility. We address this with the Higher-Order Compatibility (HOC) loss:

ℒHOC = λ·ℒSCE + (1−λ)·ℒiNCE, with λ ∈ [0, 1]

The contrastive term ℒiNCE approximates the KL divergence between the joint distribution of φt and φt−1 and the product of their marginals, that is, their mutual information, and thereby captures higher-order dependencies between successive representations. By Proposition 1, minimising ℒHOC is equivalent to minimising ℒSCE subject to the compatibility constraints of Definition 1.
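
A minimal sketch of how the convex combination could be computed on a batch is given below. Several details are assumptions rather than the paper's formulation: the positive pair for sample i is the same sample encoded by the previous model, the temperature is 0.1, logits are plain dot products with the frozen prototypes, and all function names are hypothetical. The paper's exact ℒSCE and ℒiNCE may differ.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _log_softmax_at(scores, index):
    """Numerically stable log-softmax of scores, evaluated at one index."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[index] - log_z

def fixed_classifier_ce(features, labels, prototypes):
    """Mean cross-entropy with logits = feature . frozen prototype."""
    losses = []
    for f, y in zip(features, labels):
        logits = [_dot(f, w) for w in prototypes]
        losses.append(-_log_softmax_at(logits, y))
    return sum(losses) / len(losses)

def info_nce(new_features, old_features, temperature=0.1):
    """InfoNCE: sample i under the new model should match sample i under
    the old model; the other old features in the batch act as negatives."""
    def unit(v):
        n = math.sqrt(_dot(v, v))
        return [a / n for a in v]
    new_u = [unit(v) for v in new_features]
    old_u = [unit(v) for v in old_features]
    losses = []
    for i, q in enumerate(new_u):
        sims = [_dot(q, k) / temperature for k in old_u]
        losses.append(-_log_softmax_at(sims, i))
    return sum(losses) / len(losses)

def hoc_loss(new_features, old_features, labels, prototypes, lam=0.5):
    """Convex combination: lam * CE + (1 - lam) * InfoNCE, lam in [0, 1]."""
    ce = fixed_classifier_ce(new_features, labels, prototypes)
    nce = info_nce(new_features, old_features)
    return lam * ce + (1.0 - lam) * nce
```

With `lam=1.0` the sketch reduces to cross-entropy against the fixed classifier alone, and lowering `lam` puts more weight on agreement between the current and previous representations. In an actual training loop the old features would come from the frozen previous model and gradients would flow only through the new features.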

Results
CL²R Scenario — CIFAR100/10
Compatibility Matrices (CIFAR100/10, 7 tasks). Entries not satisfying compatibility are highlighted in red. d-Simplex-HOC achieves the most compatible entries and highest cross-test accuracy.
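
As a rough illustration of how such a matrix can be read, the sketch below flags the entries that would be highlighted. It assumes the commonly used empirical criterion, namely that a cross-test between a newer query model t and an older gallery model k should exceed the older model's self-test. The function name, the lower-triangular list layout, and the numbers are all hypothetical and are not results from the paper.

```python
def incompatible_entries(matrix):
    """Return (t, k) pairs with t > k that fail the empirical compatibility test.

    matrix[t][k] is the accuracy when model t encodes the queries and
    model k encoded the gallery (lower-triangular layout, k <= t).
    Assumed criterion: matrix[t][k] must be greater than matrix[k][k].
    """
    failures = []
    for t in range(len(matrix)):
        for k in range(t):
            if matrix[t][k] <= matrix[k][k]:
                failures.append((t, k))
    return failures

# Made-up accuracies for three successive models (not from the paper).
toy = [
    [0.40],
    [0.45, 0.50],
    [0.38, 0.52, 0.55],
]
print(incompatible_entries(toy))  # → [(2, 0)]
```

Here the third model's queries against the first model's gallery (0.38) fall below the first model's own self-test (0.40), so that single entry would be the one marked as not satisfying compatibility.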
IAM-CL²R Scenario — Model Replacement

We introduce a new benchmark where the fine-tuned model is periodically swapped for a stronger one (retrained from scratch, or with a different architecture). Only d-Simplex-based methods benefit from replacements — all others degrade.

Results shown on CIFAR100/10 with 7 and 31 tasks. Extended experiments on TinyImageNet, CUB, and CelebA are reported in the full paper.

Average Accuracy up to task τ (AAτ) for CIFAR100R/10 with 31 tasks and two model replacements (boxed indices). Only d-Simplex-HOC and d-Simplex-FD improve after each replacement; all other methods degrade.
Citation
@article{biondi2024stationary_journal,
  title   = {A Stationary (and Therefore Compatible) Representation is All You Need},
  author  = {Biondi, Niccolò and Pernici, Federico and Ricci, Simone and Del Bimbo, Alberto},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2026},
  html    = {https://www.computer.org/csdl/journal/tp/5555/01/11515089/2gpcsTtMN3i}
}