Machine learning evaluation for identification of M-proteins in human serum

Author

Listed:
  • Alexandros Sopasakis
  • Maria Nilsson
  • Mattias Askenmo
  • Fredrik Nyholm
  • Lillemor Mattsson Hultén
  • Victoria Rotter Sopasakis

Abstract

Serum protein electrophoresis (SPEP) is a method used to analyze the distribution of the most important proteins in the blood. The major clinical question is the presence of monoclonal fraction(s) of antibodies (M-protein/paraprotein), which is essential for the diagnosis and follow-up of hematological diseases such as multiple myeloma. Recent studies have shown that machine learning can be used to assess protein electrophoresis by, for example, examining protein glycan patterns to follow up tumor surgery. In this study we compared 26 different decision tree algorithms for identifying the presence of M-proteins in human serum, using numerical data from serum protein capillary electrophoresis. For the automated detection and clustering of data, we used an anonymized data set consisting of 67,073 samples. We found five methods with superior ability to detect M-proteins: Extra Trees (ET), Random Forest (RF), Histogram Gradient Boosting Regressor (HGBR), Light Gradient Boosting Machine (LGBM), and Extreme Gradient Boosting (XGB). Additionally, we implemented a game theoretic approach to disclose which features in the data set were indicative of the resulting M-protein diagnosis. The results verified the gamma globulin fraction and part of the beta globulin fraction as the most important features of the electrophoresis analysis, thereby further strengthening the reliability of our approach. Finally, we tested the algorithms for classifying the M-protein isotypes, where ET and XGB showed the best performance out of the five algorithms tested. Our results show that serum capillary electrophoresis combined with decision tree algorithms has great potential for rapid and accurate identification of M-proteins. Moreover, these methods would be applicable to a variety of blood analyses, such as hemoglobinopathies, indicating wide-ranging diagnostic use. However, for M-protein isotype classification, combining machine learning solutions for numerical data from capillary electrophoresis with gel electrophoresis image data would be most advantageous.
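
The abstract mentions a game theoretic approach to feature importance without naming the method. As a rough illustration only, the sketch below assumes Shapley values, one common game-theoretic attribution, and computes them exactly for a toy scoring function. The feature names, the numbers, and `toy_model` are hypothetical stand-ins, not the study's data or trained models.

```python
from itertools import permutations
from math import factorial

# Hypothetical reference profile and patient sample (made-up fraction values).
baseline = {"albumin": 40.0, "alpha1": 3.0, "alpha2": 7.0, "beta": 8.0, "gamma": 10.0}
sample = {"albumin": 40.0, "alpha1": 3.0, "alpha2": 7.0, "beta": 10.0, "gamma": 25.0}


def toy_model(x):
    """Made-up M-protein score: driven by gamma, beta, and their interaction.

    This is a stand-in for a trained tree ensemble, not the study's model.
    """
    return 0.05 * x["beta"] + 0.10 * x["gamma"] + 0.01 * x["beta"] * x["gamma"]


def shapley_values(model, sample, baseline):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    A feature that is "absent" from a coalition takes its baseline value.
    The cost grows factorially with the number of features, so this is only
    practical for a handful of features.
    """
    names = list(sample)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)      # start with every feature at baseline
        previous = model(current)
        for name in order:
            current[name] = sample[name]   # reveal one feature at a time
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


phi = shapley_values(toy_model, sample, baseline)
for name, value in sorted(phi.items(), key=lambda item: -abs(item[1])):
    print(f"{name:8s} {value:+.3f}")
```

In this toy setup the attributions sum to the difference between the sample's score and the baseline score, and features the model ignores receive zero. Libraries aimed at tree ensembles use faster algorithms than this brute-force enumeration.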

Suggested Citation

  • Alexandros Sopasakis & Maria Nilsson & Mattias Askenmo & Fredrik Nyholm & Lillemor Mattsson Hultén & Victoria Rotter Sopasakis, 2024. "Machine learning evaluation for identification of M-proteins in human serum," PLOS ONE, Public Library of Science, vol. 19(4), pages 1-13, April.
  • Handle: RePEc:plo:pone00:0299600
    DOI: 10.1371/journal.pone.0299600

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0299600
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0299600&type=printable
    Download Restriction: no

