Printed from https://ideas.repec.org/a/plo/pcbi00/1011570.html

Detection of disease-specific signatures in B cell repertoires of lymphomas using machine learning

Author

Listed:
  • Paul Schmidt-Barbo
  • Gabriel Kalweit
  • Mehdi Naouar
  • Lisa Paschold
  • Edith Willscher
  • Christoph Schultheiß
  • Bruno Märkl
  • Stefan Dirnhofer
  • Alexandar Tzankov
  • Mascha Binder
  • Maria Kalweit

Abstract

The classification of B cell lymphomas, which relies mainly on light microscopy evaluation by a pathologist, requires many years of training. Because the B cell receptor (BCR) of the lymphoma clonotype and the microenvironmental immune architecture are important features that discriminate between lymphoma subsets, we asked whether BCR repertoire next-generation sequencing (NGS) of lymphoma-infiltrated tissues, in conjunction with machine learning algorithms, could have diagnostic utility in the subclassification of these cancers. We trained a random forest and a linear classifier (logistic regression) on patterns of clonal distribution, VDJ gene usage and physico-chemical properties of the top-n most frequently represented clonotypes in the BCR repertoires of 620 paradigmatic lymphoma samples, comprising nodular lymphocyte predominant B cell lymphoma (NLPBL), diffuse large B cell lymphoma (DLBCL) and chronic lymphocytic leukemia (CLL), alongside 291 control samples. For DLBCL and CLL, the models performed best when using only the most prevalent clonotype for classification, whereas in NLPBL, which has a dominant background of non-malignant bystander cells, a broader array of clonotypes improved model accuracy. Surprisingly, the straightforward logistic regression model performed best in this seemingly complex classification problem, suggesting linear separability in our chosen dimensions. It achieved a weighted F1-score of 0.84 on a test cohort comprising 125 samples from all three lymphoma entities and 58 samples from healthy individuals. Together, we provide proof of concept that at least the three studied lymphoma entities can be differentiated from each other by a trained machine learning model using BCR repertoire NGS of lymphoma-infiltrated tissues.

Author summary: Lymphoma, a complex group of malignant blood cancers, poses a significant diagnostic challenge due to its diverse subtypes, yet precise classification is crucial for tailored treatment. In our research, we developed a machine learning algorithm and validated it comprehensively to distinguish B cell lymphoma subtypes. To this end, we leveraged B cell repertoires of lymphoma-infiltrated tissue, as determined by next-generation sequencing. Our work offers three key insights. First, we detail the creation and training of our machine learning algorithm, explaining how we selected features and designed the model. Second, we demonstrate the algorithm's diagnostic precision using sequencing data from a test set of patient samples. Third, by examining the most distinguishing features of our algorithm, we reveal disease-related patterns present within the malignant B cell and its surrounding environment. This analysis showed that both the malignant lymphoma cell and healthy bystander immune cells contribute to the distinctive architecture that characterizes a specific lymphoma subtype. We hope our work will contribute towards creating tools to diagnose lymphoma more easily and accurately, ultimately leading to better outcomes for patients with this type of cancer.
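The abstract reports performance as a weighted F1-score. As a minimal sketch of how that metric is commonly defined (per-class F1 averaged with weights proportional to each class's share of the true labels), the following may help. The function name `weighted_f1` and the toy labels are illustrative assumptions; they are not the authors' code or data, and the source does not state which implementation was used.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        score += f1 * n_true / total
    return score


# Hypothetical toy labels, NOT data from the study.
y_true = ["CLL", "CLL", "DLBCL", "DLBCL", "NLPBL", "control", "control", "control"]
y_pred = ["CLL", "CLL", "DLBCL", "NLPBL", "NLPBL", "control", "control", "DLBCL"]
toy_score = weighted_f1(y_true, y_pred)
print(f"weighted F1 on toy labels: {toy_score:.3f}")
```

Weighting by class support means that larger classes (for example a large control group) contribute more to the final score than small ones, which is worth keeping in mind when comparing a single weighted figure across cohorts of different composition.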

Suggested Citation

  • Paul Schmidt-Barbo & Gabriel Kalweit & Mehdi Naouar & Lisa Paschold & Edith Willscher & Christoph Schultheiß & Bruno Märkl & Stefan Dirnhofer & Alexandar Tzankov & Mascha Binder & Maria Kalweit, 2024. "Detection of disease-specific signatures in B cell repertoires of lymphomas using machine learning," PLOS Computational Biology, Public Library of Science, vol. 20(7), pages 1-17, July.
  • Handle: RePEc:plo:pcbi00:1011570
    DOI: 10.1371/journal.pcbi.1011570

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011570
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011570&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Florian Märkl & Christoph Schultheiß & Murtaza Ali & Shih-Shih Chen & Marina Zintchenko & Lukas Egli & Juliane Mietz & Obinna Chijioke & Lisa Paschold & Sebastijan Spajic & Anne Holtermann & Janina Dö, 2024. "Mutation-specific CAR T cells as precision therapy for IGLV3-21R110 expressing high-risk chronic lymphocytic leukemia," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    2. Alexey Ferapontov & Marjan Omer & Isabelle Baudrexel & Jesper Sejrup Nielsen & Daniel Miotto Dupont & Kristian Juul-Madsen & Philipp Steen & Alexandra S. Eklund & Steffen Thiel & Thomas Vorup-Jensen &, 2023. "Antigen footprint governs activation of the B cell receptor," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    3. Sam Daly & João Ferreira Fernandes & Ezra Bruggeman & Anoushka Handa & Ruby Peters & Sarah Benaissa & Boya Zhang & Joseph S. Beckwith & Edward W. Sanders & Ruth R. Sims & David Klenerman & Simon J. Da, 2024. "High-density volumetric super-resolution microscopy," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
