A Stationary (and Therefore Compatible) Representation is All You Need
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026
Stationarity → Compatibility (Theorem 1)
First proof — without approximation — that d-Simplex fixed classifiers satisfy both compatibility inequalities in expectation. Prior work only verified the same-class case; we close the gap using cosine distance in hyperspherical space.
Higher-Order Compatibility (HOC) Loss
ℒ_HOC = λ·ℒ_SCE + (1−λ)·ℒ_iNCE captures higher-order representation dependencies between updates and is provably equivalent to optimising cross-entropy under the compatibility constraints (Proposition 1).
New IAM-CL²R Benchmark
A realistic scenario where a fine-tuned model is periodically replaced by a stronger one — even a different architecture. The d-Simplex classifier matrix acts as a common interface, enabling seamless replacement without re-indexing.
State-of-the-Art Results
Outperforms 7 baselines on CIFAR100, TinyImageNet, CUB, and CelebA, and is the only method that maintains high compatibility across 31-task sequences, model replacements, and architecture changes.
d-Simplex Fixed Classifier
Class prototypes w_1, …, w_K are fixed at the vertices of the regular d-Simplex polytope, which gives maximally separated, equiangular prototypes on the unit hypersphere. Because the classifier is frozen, features remain stationary across model updates: each class's hyperspherical cap shares the same central axis before and after fine-tuning, only shrinking as the model improves (Theorem 1).
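As a rough illustration of the construction described above, the following NumPy sketch builds K equiangular unit prototypes by centring the standard basis and normalising each row. The function name `simplex_prototypes` is hypothetical, and the sketch embeds the simplex in K coordinates (the rows span a (K−1)-dimensional subspace) for simplicity; the paper's own parameterisation may differ.

```python
import numpy as np


def simplex_prototypes(num_classes: int) -> np.ndarray:
    """Return a (K, K) matrix whose rows are unit-norm regular-simplex vertices.

    Assumption: the simplex is embedded in K coordinates; its rows lie in the
    (K-1)-dimensional subspace orthogonal to the all-ones vector.
    """
    k = num_classes
    # Subtract the centroid of the standard basis from every basis vector.
    centered = np.eye(k) - np.full((k, k), 1.0 / k)
    # Project each vertex onto the unit hypersphere.
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)


# Frozen classifier weights for a 10-class problem (never updated in training).
W = simplex_prototypes(10)
gram = W @ W.T  # pairwise cosine similarities between prototypes
```

For this construction every off-diagonal entry of `gram` equals −1/(K−1), which is the equiangular, maximally separated arrangement the text refers to.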
HOC Loss for Sequential Fine-Tuning
When fine-tuning sequentially, cross-entropy with the d-Simplex classifier aligns features only at their first-order statistics: the mean moves to the class prototype, but higher-order structure is ignored. This limits back-propagation and reduces compatibility. We address this with the Higher-Order Compatibility (HOC) loss:
ℒ_HOC = λ·ℒ_SCE + (1−λ)·ℒ_iNCE
The contrastive term ℒ_iNCE approximates the KL divergence between the joint and marginal distributions of φ_t and φ_{t−1}, thereby capturing mutual information (i.e., higher-order dependencies) between successive representations. By Proposition 1, minimising ℒ_HOC is equivalent to minimising ℒ_SCE subject to the compatibility constraints of Definition 1.
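The weighted combination can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the cross-entropy term uses inner products with frozen prototypes as logits, the contrastive term is a standard in-batch InfoNCE in which the positive pair is the same sample seen by φ_t and φ_{t−1}, and the `temperature` and `lam` defaults are placeholders. All function names are hypothetical.

```python
import numpy as np


def log_softmax(z: np.ndarray) -> np.ndarray:
    """Row-wise log-softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def simplex_ce(feats, labels, prototypes):
    """Cross-entropy whose logits are inner products with frozen prototypes."""
    logits = feats @ prototypes.T
    return -log_softmax(logits)[np.arange(len(labels)), labels].mean()


def info_nce(feats_new, feats_old, temperature=0.1):
    """In-batch InfoNCE between current and previous-model features.

    Assumption: row i of feats_new and row i of feats_old come from the same
    input, so the positives sit on the diagonal of the similarity matrix.
    """
    sim = l2_normalize(feats_new) @ l2_normalize(feats_old).T / temperature
    return -np.diag(log_softmax(sim)).mean()


def hoc_loss(feats_new, feats_old, labels, prototypes, lam=0.5, temperature=0.1):
    """lam * L_SCE + (1 - lam) * L_iNCE, mirroring the weighting in the text."""
    return (lam * simplex_ce(feats_new, labels, prototypes)
            + (1.0 - lam) * info_nce(feats_new, feats_old, temperature))
```

In a training loop, `feats_old` would come from the frozen previous model φ_{t−1} and only φ_t would receive gradients; an autodiff framework would replace NumPy for that purpose.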
CL²R Scenario — CIFAR100/10
IAM-CL²R Scenario — Model Replacement
We introduce a new benchmark where the fine-tuned model is periodically swapped for a stronger one (retrained from scratch, or with a different architecture). Only d-Simplex-based methods benefit from replacements — all others degrade.
Results shown on CIFAR100/10 with 7 and 31 tasks. Extended experiments on TinyImageNet, CUB, and CelebA are reported in the full paper.
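To make "replacement without re-indexing" concrete, the toy sketch below searches queries embedded by a newer model against a gallery that was indexed by an older model and never recomputed. Everything here is synthetic and hypothetical: both "models" are simulated as noisy points around the same fixed simplex prototypes, with the newer one given less noise, and `cross_model_top1` is an illustrative helper rather than the paper's evaluation protocol.

```python
import numpy as np


def unit(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cross_model_top1(query_feats, query_labels, gallery_feats, gallery_labels):
    """Top-1 accuracy of cosine nearest-neighbour search, queries vs. gallery."""
    sims = unit(query_feats) @ unit(gallery_feats).T
    nearest = sims.argmax(axis=1)
    return float((gallery_labels[nearest] == query_labels).mean())


rng = np.random.default_rng(0)
K = 5
# Shared, frozen simplex prototypes act as the common interface.
prototypes = unit(np.eye(K) - 1.0 / K)
labels = np.repeat(np.arange(K), 20)

# Synthetic stand-ins: the old model is noisier than its replacement.
old_feats = prototypes[labels] + 0.10 * rng.normal(size=(labels.size, K))
new_feats = prototypes[labels] + 0.05 * rng.normal(size=(labels.size, K))

# New-model queries against the old-model gallery, with no re-indexing.
acc = cross_model_top1(new_feats, labels, old_feats, labels)
```

Because both feature sets cluster around the same prototypes in this toy setup, the new queries still land on gallery items of the correct class; real models would of course need the training procedure described above to achieve this.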
@article{biondi2024stationary_journal,
title = {A Stationary (and Therefore Compatible) Representation is All You Need},
author = {Biondi, Niccolò and Pernici, Federico and Ricci, Simone and Del Bimbo, Alberto},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2026},
html = {https://www.computer.org/csdl/journal/tp/5555/01/11515089/2gpcsTtMN3i}
}