Author
Listed:
- Donghyeon Kim
(Department of Defense Acquisition Program, Kwangwoon University, Seoul 01897, Republic of Korea
These authors contributed equally to this work.)
- Chae-Bong Sohn
(Department of Electronics and Communications Engineering, Kwangwoon University, Seoul 01897, Republic of Korea
These authors contributed equally to this work.)
- Do-Yup Kim
(Department of Information and Telecommunication Engineering, Incheon National University, Incheon 22012, Republic of Korea)
- Dae-Yeol Kim
(Department of Artificial Intelligence, Kyungnam University, Changwon 51767, Republic of Korea)
Abstract
Unsupervised representation learning has emerged as a promising paradigm in machine learning, owing to its capacity to extract semantically meaningful features from unlabeled data. Despite recent progress, however, such methods remain vulnerable to collapse phenomena, wherein the expressiveness and diversity of learned representations are severely degraded. This phenomenon poses significant challenges to both model performance and generalizability. This paper presents a systematic investigation into two distinct forms of collapse: complete collapse and dimensional collapse. Complete collapse typically arises in non-contrastive frameworks, where all learned representations converge to trivial constants, thereby rendering the learned feature space non-informative. While contrastive learning has been introduced as a principled remedy, recent empirical findings indicate that it fails to prevent collapse entirely. In particular, contrastive methods are still susceptible to dimensional collapse, where representations are confined to a narrow subspace, thus restricting both the information content and the effective dimensionality. To address these concerns, we conduct a comprehensive literature analysis encompassing theoretical definitions, underlying causes, and mitigation strategies for each collapse type. We further categorize recent approaches to collapse prevention, including feature decorrelation techniques, eigenvalue distribution regularization, and batch-level statistical constraints, and assess their effectiveness through a comparative framework. This work aims to establish a unified conceptual foundation for understanding collapse in unsupervised learning and to guide the design of more robust representation learning algorithms.
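As background to the abstract's distinction between complete and dimensional collapse: a common way to diagnose dimensional collapse in practice is to examine the eigenvalue spectrum of the embedding covariance matrix. The sketch below (an illustrative example, not code from the paper) computes an entropy-based effective rank; values far below the ambient embedding dimension indicate that representations occupy only a narrow subspace.

```python
import numpy as np

def effective_rank(embeddings: np.ndarray) -> float:
    """Entropy-based effective rank of the embedding covariance.

    An effective rank far below the embedding dimension signals
    dimensional collapse: the representations are confined to a
    low-dimensional subspace of the feature space.
    """
    centered = embeddings - embeddings.mean(axis=0)
    cov = centered.T @ centered / len(embeddings)
    eigvals = np.linalg.eigvalsh(cov)          # ascending, real
    eigvals = np.clip(eigvals, 1e-12, None)    # guard against tiny negatives
    p = eigvals / eigvals.sum()                # normalized spectrum
    return float(np.exp(-(p * np.log(p)).sum()))  # exp of spectral entropy

rng = np.random.default_rng(0)
# Healthy embeddings: isotropic Gaussian, spectrum spread over all 64 dims.
healthy = rng.normal(size=(1000, 64))
# Collapsed embeddings: rank-2 mixture projected into 64 dims.
collapsed = rng.normal(size=(1000, 2)) @ rng.normal(size=(2, 64))
```

Complete collapse is the extreme case of this measure: if all representations converge to a constant, the covariance is (numerically) zero and the spectrum carries no information at all.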
Suggested Citation
Donghyeon Kim & Chae-Bong Sohn & Do-Yup Kim & Dae-Yeol Kim, 2025.
"A Taxonomy and Theoretical Analysis of Collapse Phenomena in Unsupervised Representation Learning,"
Mathematics, MDPI, vol. 13(18), pages 1-28, September.
Handle:
RePEc:gam:jmathe:v:13:y:2025:i:18:p:2986-:d:1750256