Printed from https://ideas.repec.org/p/arx/papers/2604.07744.html

The Condition-Number Principle for Prototype Clustering

Author

Listed:
  • Romano Li
  • Jianfei Cao

Abstract

We develop a geometric framework that links objective accuracy to structural recovery in prototype-based clustering. The analysis is algorithm-agnostic and applies to a broad class of admissible loss functions. We define a clustering condition number that compares within-cluster scale to the minimum loss increase required to move a point across a cluster boundary. When this quantity is small, any solution with a small suboptimality gap must also have a small misclassification error relative to a benchmark partition. The framework also clarifies a fundamental trade-off between robustness and sensitivity to cluster imbalance, leading to sharp phase transitions for exact recovery under different objectives. The guarantees are deterministic and non-asymptotic, and they separate the role of algorithmic accuracy from the intrinsic geometric difficulty of the instance. We further show that errors concentrate near cluster boundaries and that sufficiently deep cluster cores are recovered exactly under strengthened local margins. Together, these results provide a geometric principle for interpreting low objective values as reliable evidence of meaningful clustering structure.
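The abstract describes the clustering condition number only in words and does not give its formula. As a rough illustration of the idea, the sketch below computes a toy proxy for the squared Euclidean loss: the average within-cluster loss divided by the smallest loss increase incurred by moving any point to its second-nearest prototype. The function name `illustrative_condition_number`, the choice of loss, and the choice of scale and margin are all assumptions made for this example and are not the paper's definition.

```python
import numpy as np


def illustrative_condition_number(X, centers):
    """Toy proxy (an assumption, not the paper's definition):
    within-cluster scale divided by the minimum reassignment margin."""
    # Squared Euclidean loss of every point to every prototype, shape (n, k).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Sort losses per point so column 0 is the nearest prototype, column 1 the runner-up.
    sorted_d2 = np.sort(d2, axis=1)
    nearest, second = sorted_d2[:, 0], sorted_d2[:, 1]
    scale = nearest.mean()              # average within-cluster loss
    margin = (second - nearest).min()   # smallest loss increase from moving one point
    return scale / margin               # undefined if some point is equidistant (margin == 0)


# Two tight clusters far apart: small proxy value (an "easy" instance).
X_far = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
c_far = np.array([[0.0, 0.5], [10.0, 0.5]])
kappa_far = illustrative_condition_number(X_far, c_far)

# Same cluster shapes, much closer together: larger proxy value (a "harder" instance).
X_close = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
c_close = np.array([[0.0, 0.5], [2.0, 0.5]])
kappa_close = illustrative_condition_number(X_close, c_close)
```

In this toy setting, the proxy shrinks as the clusters separate, which matches the abstract's qualitative claim that a small condition number corresponds to an instance where a small suboptimality gap implies a small misclassification error. The sketch needs at least two prototypes.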

Suggested Citation

  • Romano Li & Jianfei Cao, 2026. "The Condition-Number Principle for Prototype Clustering," Papers 2604.07744, arXiv.org.
  • Handle: RePEc:arx:papers:2604.07744

    Download full text from publisher

    File URL: https://arxiv.org/pdf/2604.07744
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Oguzhan Akgun & Alain Pirotte & Giovanni Urga & Zhenlin Yang, 2025. "Testing Clustered Equal Predictive Ability with Unknown Clusters," Papers 2507.14621, arXiv.org, revised Jul 2025.
    2. Christis Katsouris, 2023. "Optimal Estimation Methodologies for Panel Data Regression Models," Papers 2311.03471, arXiv.org, revised Nov 2023.
    3. Shahbaz, Muhammad & Eti, Serkan & Yüksel, Serhat & Dinçer, Hasan & Çırak, Ayşe Nur, 2026. "A multi-criteria decision-making framework for enhancing renewable energy productivity," Renewable Energy, Elsevier, vol. 258(C).
    4. Zihan Zhang & Lianyan Fu & Dehui Wang, 2026. "Difference-in-Differences using Double Negative Controls and Graph Neural Networks for Unmeasured Network Confounding," Papers 2601.00603, arXiv.org.
    5. Oguzhan Akgun & Ryo Okui, 2025. "Robust Inference Methods for Latent Group Panel Models under Possible Group Non-Separation," Papers 2511.18550, arXiv.org.
    6. Davide Viviano & Lihua Lei & Guido Imbens & Brian Karrer & Okke Schrijvers & Liang Shi, 2023. "Causal clustering: design of cluster experiments under network interference," Papers 2310.14983, arXiv.org, revised Jun 2026.
    7. Ruoxuan Xiong & Alex Chin & Sean J. Taylor, 2024. "Data-Driven Switchback Experiments: Theoretical Tradeoffs and Empirical Bayes Designs," Papers 2406.06768, arXiv.org.
    8. Laila Messaoudi, 2025. "Cluster analysis for ethical portfolio optimization problem using fuzzy chance constrained programming," Environmental Economics and Policy Studies, Springer;Society for Environmental Economics and Policy Studies - SEEPS, vol. 27(4), pages 705-726, October.
