
Topic Identification in LLM Input-Output Pairs through the Lens of Information Bottleneck

Author

Listed:
  • Igor Halperin

Abstract

Large Language Models (LLMs) are prone to critical failure modes, including intrinsic faithfulness hallucinations (also known as confabulations), where a response deviates semantically from the provided context. Frameworks designed to detect this, such as Semantic Divergence Metrics (SDM), rely on identifying latent topics shared between prompts and responses, typically by applying geometric clustering to their sentence embeddings. This creates a disconnect, as the topics are optimized for spatial proximity, not for the downstream information-theoretic analysis. In this paper, we bridge this gap by developing a principled topic identification method grounded in the Deterministic Information Bottleneck (DIB) for geometric clustering. Our key contribution is to transform the DIB method into a practical algorithm for high-dimensional data by substituting its intractable KL divergence term with a computationally efficient upper bound. The resulting method, which we dub UDIB, can be interpreted as an entropy-regularized and robustified version of K-means that inherently favors a parsimonious number of informative clusters. By applying UDIB to the joint clustering of LLM prompt and response embeddings, we generate a shared topic representation that is not merely spatially coherent but is fundamentally structured to be maximally informative about the prompt-response relationship. This provides a superior foundation for the SDM framework and offers a novel, more sensitive tool for detecting confabulations.
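
The abstract describes UDIB as an entropy-regularized version of K-means but does not state its objective, so the following is only a rough illustration of that general idea and not the paper's algorithm. It assumes a hard-assignment rule in which each point joins the cluster minimizing beta * ||x - mu_c||^2 - log q(c), where q(c) is the current cluster weight; the function name, the parameters `k_max` and `beta`, and the synthetic "embeddings" are all hypothetical choices made for this sketch.

```python
import numpy as np


def entropy_regularized_kmeans(X, k_max=8, beta=5.0, n_iter=50, seed=0):
    """Hard-assignment K-means with a log-weight (entropy-style) penalty.

    Assumed cost for point x and cluster c: beta * ||x - mu_c||^2 - log q(c).
    Clusters that end up with no points are dropped, so the number of
    clusters can shrink below k_max.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    k = min(k_max, n)
    centers = X[rng.choice(n, size=k, replace=False)]  # distinct data points
    weights = np.full(k, 1.0 / k)                      # uniform initial q(c)
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        cost = beta * d2 - np.log(weights)[None, :]
        assigned = cost.argmin(axis=1)
        # Keep only clusters that still own points and relabel them 0..K-1.
        used, new_labels = np.unique(assigned, return_inverse=True)
        centers = np.stack(
            [X[new_labels == j].mean(axis=0) for j in range(len(used))]
        )
        weights = np.bincount(new_labels, minlength=len(used)) / n
        converged = np.array_equal(new_labels, labels)
        labels = new_labels
        if converged:
            break
    return labels, centers, weights


# Synthetic stand-ins for sentence embeddings (two well-separated groups).
rng = np.random.default_rng(1)
prompt_emb = np.vstack([rng.normal(0, 0.1, (20, 4)), rng.normal(3, 0.1, (20, 4))])
response_emb = np.vstack([rng.normal(0, 0.1, (15, 4)), rng.normal(3, 0.1, (15, 4))])

# Joint clustering: prompts and responses share one set of topics.
joint = np.vstack([prompt_emb, response_emb])
labels, centers, weights = entropy_regularized_kmeans(joint, k_max=6, beta=5.0)
prompt_topics = labels[: len(prompt_emb)]
response_topics = labels[len(prompt_emb):]

# With beta = 0 the distance term vanishes and everything merges into one topic.
_, collapsed_centers, _ = entropy_regularized_kmeans(joint, k_max=6, beta=0.0)
```

In this sketch, larger values of `beta` weight the distance term more heavily and make the behavior closer to ordinary K-means, while smaller values let the log-weight term favor fewer, larger clusters. Splitting the joint labels back into `prompt_topics` and `response_topics` gives the shared topic assignments that a downstream prompt-response analysis could compare.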

Suggested Citation

  • Igor Halperin, 2025. "Topic Identification in LLM Input-Output Pairs through the Lens of Information Bottleneck," Papers 2509.03533, arXiv.org.
  • Handle: RePEc:arx:papers:2509.03533

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2509.03533
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Sebastian Farquhar & Jannik Kossen & Lorenz Kuhn & Yarin Gal, 2024. "Detecting hallucinations in large language models using semantic entropy," Nature, Nature, vol. 630(8017), pages 625-630, June.
    2. Igor Halperin, 2025. "Prompt-Response Semantic Divergence Metrics for Faithfulness Hallucination and Misalignment Detection in Large Language Models," Papers 2508.10192, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiaoyi Luo & Aiwen Wang & Xinling Zhang & Kunda Huang & Songyu Wang & Lixin Chen & Yejia Cui, 2025. "Toward Intelligent AIoT: A Comprehensive Survey on Digital Twin and Multimodal Generative AI Integration," Mathematics, MDPI, vol. 13(21), pages 1-44, October.
    2. Abramson, Corey & Li, Zhuofan & Prendergast, Tara & Dohan, Daniel, 2025. "Qualitative Research in an Era of AI: A Pragmatic Approach to Data Analysis, Workflow, and Computation," SocArXiv 7bsgy_v1, Center for Open Science.
    3. Yusong Ke & Hongru Lin & Yuting Ruan & Junya Tang & Li Li, 2025. "Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework," Mathematics, MDPI, vol. 13(9), pages 1-17, May.
    4. Sheng Wang & Fangyuan Zhao & Dechao Bu & Yunwei Lu & Ming Gong & Hongjie Liu & Zhaohui Yang & Xiaoxi Zeng & Zhiyuan Yuan & Baoping Wan & Jingbo Sun & Yang Wu & Lianhe Zhao & Xirun Wan & Wei Huang & Ta, 2025. "LINS: A general medical Q&A framework for enhancing the quality and credibility of LLM-generated responses," Nature Communications, Nature, vol. 16(1), pages 1-20, December.
    5. Francesco Carli & Pierluigi Chiaro & Mariangela Morelli & Chakit Arora & Luisa Bisceglia & Natalia Oliveira Rosa & Alice Cortesi & Sara Franceschi & Francesca Lessi & Anna Luisa Stefano & Orazio Santo, 2025. "Learning and actioning general principles of cancer cell drug sensitivity," Nature Communications, Nature, vol. 16(1), pages 1-23, December.
    6. Li, Butong & Zhu, Junjie & Zhao, Xufeng, 2025. "A prior knowledge-guided predictive framework for LCF life and its implementation in shaft-like components under multiaxial loading," Reliability Engineering and System Safety, Elsevier, vol. 260(C).
    7. Gemma Turon & Mwila Mulubwa & Anna Montaner & Mathew Njoroge & Kelly Chibale & Miquel Duran-Frigola, 2025. "Artificial intelligence coupled to pharmacometrics modelling to tailor malaria and tuberculosis treatment in Africa," Nature Communications, Nature, vol. 16(1), pages 1-12, December.
    8. Igor Halperin, 2025. "Prompt-Response Semantic Divergence Metrics for Faithfulness Hallucination and Misalignment Detection in Large Language Models," Papers 2508.10192, arXiv.org.
    9. Zhou, Zhen & Gu, Ziyuan & Qu, Xiaobo & Liu, Pan & Liu, Zhiyuan & Yu, Wenwu, 2024. "Urban mobility foundation model: A literature review and hierarchical perspective," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 192(C).
    10. Xiaohan Lin & Yijie Xia & Yanheng Li & Yu-Peng Huang & Shuo Liu & Jun Zhang & Yi Qin Gao, 2025. "In-silico 3D molecular editing through physics-informed and preference-aligned generative foundation models," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    11. Sanchaita Hazra & Marta Serra-Garcia, 2025. "Understanding Trust in AI as an Information Source: Cross-Country Evidence," CESifo Working Paper Series 11954, CESifo.
    12. Beining Xu & Yongming Lu, 2025. "TECP: Token-Entropy Conformal Prediction for LLMs," Mathematics, MDPI, vol. 13(20), pages 1-14, October.
    13. Peng Zhang & Jiayu Shi & Maged N. Kamel Boulos, 2024. "Generative AI in Medicine and Healthcare: Moving Beyond the ‘Peak of Inflated Expectations’," Future Internet, MDPI, vol. 16(12), pages 1-21, December.
    14. Hui-Hung Yu & Wei-Tsun Lin & Chih-Wei Kuan & Chao-Chi Yang & Kuan-Min Liao, 2025. "GraphRAG-Enhanced Dialogue Engine for Domain-Specific Question Answering: A Case Study on the Civil IoT Taiwan Platform," Future Internet, MDPI, vol. 17(9), pages 1-22, September.
