A Stationary (and Therefore Compatible) Representation is All You Need

Niccolò Biondi · Federico Pernici · Simone Ricci · Alberto Del Bimbo

IEEE TPAMI 2026

When a model is updated, features learned with a d-Simplex fixed classifier stay stationary — new query features remain directly comparable to old gallery features, satisfying both compatibility inequalities in expectation.

TL;DR

Training with a d-Simplex fixed classifier makes features provably backward-compatible: an updated model's queries can be matched against any old gallery without re-indexing.

Key Contributions

Stationarity → Compatibility (Theorem 1)

First proof — without approximation — that d-Simplex fixed classifiers satisfy both compatibility inequalities in expectation. Prior work only verified the same-class case; we close the gap using cosine distance in hyperspherical space.

Higher-Order Compatibility (HOC) Loss

ℒ_HOC = λ·ℒ_SCE + (1−λ)·ℒ_iNCE captures higher-order representation dependencies between updates and is provably equivalent to optimising cross-entropy under the compatibility constraints (Proposition 1).

New IAM-CL²R Benchmark

A realistic scenario where a fine-tuned model is periodically replaced by a stronger one — even a different architecture. The d-Simplex classifier matrix acts as a common interface, enabling seamless replacement without re-indexing.

State-of-the-Art Results

Outperforms 7 baselines on CIFAR100, TinyImageNet, CUB, and CelebA — the only method that maintains high compatibility through architecture changes across 31-task sequences and model replacements.

Method

d-Simplex Fixed Classifier

Class prototypes w₁, …, w_K are fixed at the vertices of the regular d-Simplex polytope — maximally separated, equiangular prototypes on the unit hypersphere. Because the classifier is frozen, features remain stationary across model updates: each class's hyperspherical cap shares the same central axis before and after fine-tuning, only shrinking as the model improves (Theorem 1).
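As a rough illustration of the geometry described above, the following NumPy sketch builds K equiangular unit prototypes by centring the standard basis of R^K and normalising the rows. This is an assumption made for brevity: it embeds the simplex in K coordinates spanning a (K−1)-dimensional subspace, which should match a d = K−1 construction only up to a rotation, and the function name `simplex_prototypes` is hypothetical rather than taken from the authors' code.

```python
import numpy as np

def simplex_prototypes(num_classes: int) -> np.ndarray:
    """Return a (K, K) array whose rows are unit-norm vertices of a regular simplex.

    Assumption: the simplex is embedded in R^K (not R^{K-1}) for simplicity.
    """
    eye = np.eye(num_classes)
    # Subtract the centroid so the vertices are centred at the origin.
    centred = eye - eye.mean(axis=0, keepdims=True)
    # Project each vertex onto the unit hypersphere.
    return centred / np.linalg.norm(centred, axis=1, keepdims=True)

K = 10
W = simplex_prototypes(K)            # frozen: never updated during training
cos = W @ W.T                        # pairwise cosine similarities
off_diag = cos[~np.eye(K, dtype=bool)]
print(off_diag.min(), off_diag.max())  # both equal -1/(K-1) up to rounding
```

Every pair of distinct prototypes has cosine similarity −1/(K−1), the most negative value that K equiangular unit vectors can share, which is what "maximally separated" refers to here.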

HOC Loss for Sequential Fine-Tuning

When fine-tuning sequentially, cross-entropy with the d-Simplex aligns features at their first-order statistics only: the mean moves to the class prototype, but higher-order structure is ignored. This limits back-propagation and reduces compatibility. We address this with the Higher-Order Compatibility (HOC) loss:

ℒ_HOC = λ·ℒ_SCE + (1−λ)·ℒ_iNCE

The contrastive term ℒ_iNCE approximates the KL divergence between the joint distribution of φ_t and φ_{t−1} and the product of their marginals, thereby capturing mutual information (i.e., higher-order dependencies) between successive representations. By Proposition 1, minimising ℒ_HOC is equivalent to minimising ℒ_SCE subject to the compatibility constraints of Definition 1.
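To make the structure of ℒ_HOC = λ·ℒ_SCE + (1−λ)·ℒ_iNCE concrete, here is a minimal NumPy sketch. Several choices are assumptions not stated on this page: ℒ_SCE is treated as softmax cross-entropy against the frozen prototypes, ℒ_iNCE is a standard InfoNCE in which the positive pair is the same sample seen by the new and old models, and the temperature of 0.1 and λ = 0.5 are arbitrary. All function names are hypothetical.

```python
import numpy as np

def log_softmax(z):
    """Row-wise log-softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def l2_normalise(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def cross_entropy_fixed(feats, labels, prototypes):
    """Cross-entropy where the classifier weights are the frozen prototypes."""
    logits = feats @ prototypes.T
    return -log_softmax(logits)[np.arange(len(labels)), labels].mean()

def info_nce(feats_new, feats_old, temperature=0.1):
    """InfoNCE: row i of feats_new is the positive for row i of feats_old."""
    sim = l2_normalise(feats_new) @ l2_normalise(feats_old).T / temperature
    return -np.diag(log_softmax(sim)).mean()

def hoc_loss(feats_new, feats_old, labels, prototypes, lam=0.5, temperature=0.1):
    sce = cross_entropy_fixed(feats_new, labels, prototypes)
    nce = info_nce(feats_new, feats_old, temperature)
    return lam * sce + (1.0 - lam) * nce

# Toy usage with random features (illustrative only, not real model outputs).
rng = np.random.default_rng(0)
K = 5
prototypes = l2_normalise(np.eye(K) - 1.0 / K)   # equiangular, kept frozen
labels = rng.integers(0, K, size=8)
feats_old = rng.normal(size=(8, K))              # stands in for the previous model's features
feats_new = feats_old + 0.1 * rng.normal(size=(8, K))  # stands in for the updated model's features
loss = hoc_loss(feats_new, feats_old, labels, prototypes)
```

In a real training loop the old model's features would be computed without gradients and only the new model would be updated; the sketch omits this because NumPy has no autograd.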

Results

CL²R Scenario — CIFAR100/10

Compatibility Matrices (CIFAR100/10, 7 tasks). Entries not satisfying compatibility are highlighted in red. d-Simplex-HOC achieves the most compatible entries and highest cross-test accuracy.
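A compatibility matrix of this kind can be read programmatically. The sketch below assumes the empirical criterion commonly used in the backward-compatibility literature: entry (i, j), obtained with queries from model i and a gallery from an earlier model j, counts as compatible when it exceeds the self-test entry (j, j). The exact criterion and averaging used in the paper may differ, the helper names are hypothetical, and the accuracy values are invented for illustration rather than taken from the reported results.

```python
import numpy as np

def compatibility_flags(acc: np.ndarray) -> np.ndarray:
    """flags[i, j] is True when cross-test acc[i, j] beats self-test acc[j, j] (j < i)."""
    num_tasks = acc.shape[0]
    flags = np.zeros((num_tasks, num_tasks), dtype=bool)
    for i in range(1, num_tasks):
        for j in range(i):
            flags[i, j] = acc[i, j] > acc[j, j]
    return flags

def average_compatibility(acc: np.ndarray) -> float:
    """Fraction of lower-triangular (cross-test) entries that are compatible."""
    num_tasks = acc.shape[0]
    num_pairs = num_tasks * (num_tasks - 1) / 2
    return float(compatibility_flags(acc).sum() / num_pairs)

# Made-up 3-task matrix: rows index the query model, columns the gallery model.
toy_acc = np.array([
    [0.50, 0.00, 0.00],
    [0.55, 0.60, 0.00],
    [0.48, 0.62, 0.65],
])
flags = compatibility_flags(toy_acc)
ac = average_compatibility(toy_acc)
```

In this toy matrix the entry for model 3 against gallery 1 (0.48) falls below the self-test of model 1 (0.50), so it is the one that would be highlighted as incompatible.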

IAM-CL²R Scenario — Model Replacement

We introduce a new benchmark where the fine-tuned model is periodically swapped for a stronger one (retrained from scratch, or with a different architecture). Only d-Simplex-based methods benefit from replacements — all others degrade.

Results shown on CIFAR100/10 with 7 and 31 tasks. Extended experiments on TinyImageNet, CUB, and CelebA are reported in the full paper.

Average Accuracy up to task τ (AAτ) for CIFAR100R/10 with 31 tasks and two model replacements (boxed indices). Only d-Simplex-HOC and d-Simplex-FD improve after each replacement; all other methods degrade.

Citation

@article{biondi2024stationary_journal,
  title   = {A Stationary (and Therefore Compatible) Representation is All You Need},
  author  = {Biondi, Niccolò and Pernici, Federico and Ricci, Simone and Del Bimbo, Alberto},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2026},
  html    = {https://www.computer.org/csdl/journal/tp/5555/01/11515089/2gpcsTtMN3i}
}