Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-025-58428-8.html

Evolutionary sparse learning reveals the shared genetic basis of convergent traits

Author

Listed:
  • John B. Allard

    (Temple University)

  • Sudip Sharma

    (Temple University)

  • Ravi Patel

    (Temple University)

  • Maxwell Sanderford

    (Temple University)

  • Koichiro Tamura

    (Tokyo Metropolitan University)

  • Slobodan Vucetic

    (Temple University)

  • Glenn S. Gerhard

    (Lewis Katz School of Medicine at Temple University)

  • Sudhir Kumar

    (Temple University)

Abstract

Cases abound in which nearly identical traits have appeared in distant species facing similar environments. These unmistakable examples of adaptive evolution offer opportunities to gain insight into their genetic origins and mechanisms through comparative analyses. Here, we present an approach to build genetic models that underlie the independent origins of convergent traits using evolutionary sparse learning with paired species contrast (ESL-PSC). We tested the hypothesis that common genes and sites are involved in the convergent evolution of two key traits: C4 photosynthesis in grasses and echolocation in mammals. Genetic models were highly predictive of independent cases of convergent evolution of C4 photosynthesis. Genes contributing to genetic models for echolocation were highly enriched for functional categories related to hearing, sound perception, and deafness, a pattern that has eluded previous efforts applying standard molecular evolutionary approaches. These results support the involvement of sequence substitutions at common genetic loci in the evolution of convergent traits. Benchmarking on empirical and simulated datasets showed that ESL-PSC could be more sensitive in proteome-scale analyses to detect genes with convergent molecular evolution associated with the acquisition of convergent traits. We conclude that phylogeny-informed machine learning naturally excludes apparent molecular convergences due to shared species history, enhances the signal-to-noise ratio for detecting molecular convergence, and empowers the discovery of common genetic bases of trait convergences.
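
To make the idea of sparse learning on paired species concrete, here is a minimal sketch. It is not the authors' ESL-PSC implementation. The toy alignment, the function names `one_hot_sites` and `fit_l1_logistic`, and the penalty strength `lam` are all hypothetical. It also substitutes a plain L1 (lasso) penalty for whatever penalty the paper uses, since the abstract does not specify one. Each species with the trait is paired with a close relative without it, residues at variable sites are one-hot encoded, and a penalized logistic regression selects the few sites that predict the trait.

```python
import numpy as np

# Toy protein alignment (hypothetical data): three species pairs, each pairing
# a trait-positive species with a closely related trait-negative species.
alignment = [
    "MKSLT",  # pair 1, trait present
    "MKALT",  # pair 1, trait absent
    "MRSIT",  # pair 2, trait present
    "MRAIT",  # pair 2, trait absent
    "LKSLV",  # pair 3, trait present
    "LKALV",  # pair 3, trait absent
]
y = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])


def one_hot_sites(seqs):
    """Return a binary matrix with one column per (site, residue) at variable sites."""
    columns, labels = [], []
    for site in range(len(seqs[0])):
        residues = sorted({s[site] for s in seqs})
        if len(residues) < 2:
            continue  # invariant sites cannot separate the two classes
        for res in residues:
            columns.append([1.0 if s[site] == res else 0.0 for s in seqs])
            labels.append((site, res))
    return np.array(columns).T, labels


def fit_l1_logistic(X, y, lam=0.05, n_iter=2000):
    """L1-penalized logistic regression fitted by proximal gradient descent (ISTA)."""
    n, p = X.shape
    # Step size 1/L, where L = ||[X, 1]||_2^2 / (4n) bounds the curvature
    # of the mean logistic loss.
    A = np.hstack([X, np.ones((n, 1))])
    step = 4.0 * n / np.linalg.norm(A, 2) ** 2
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        resid = 1.0 / (1.0 + np.exp(-(X @ w + b))) - y
        w = w - step * (X.T @ resid) / n
        # Soft-thresholding sets weak weights exactly to zero (the sparsity step).
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
        b -= step * resid.mean()  # the intercept is not penalized
    return w, b


X, labels = one_hot_sites(alignment)
w, b = fit_l1_logistic(X, y)
# Keep only the (site, residue) features with non-zero weight.
selected = {labels[i]: round(float(w[i]), 3) for i in np.flatnonzero(w)}
print(selected)
```

In this toy alignment only the third site (index 2) differs consistently within every pair. Residues at the other variable sites are shared by both members of a pair, so they are balanced across the two trait classes and the penalty should leave their weights at zero. This illustrates, on a small scale, how paired contrasts can keep substitutions that merely reflect shared species history out of the model.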

Suggested Citation

  • John B. Allard & Sudip Sharma & Ravi Patel & Maxwell Sanderford & Koichiro Tamura & Slobodan Vucetic & Glenn S. Gerhard & Sudhir Kumar, 2025. "Evolutionary sparse learning reveals the shared genetic basis of convergent traits," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-58428-8
    DOI: 10.1038/s41467-025-58428-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-58428-8
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    2. repec:jss:jstsof:33:i01 is not listed on IDEAS
    3. Bilin Zeng & Xuerong Meggie Wen & Lixing Zhu, 2017. "A link-free sparse group variable selection method for single-index model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(13), pages 2388-2400, October.
    4. Olga Klopp & Marianna Pensky, 2013. "Sparse High-dimensional Varying Coefficient Model : Non-asymptotic Minimax Study," Working Papers 2013-30, Center for Research in Economics and Statistics.
    5. Li, Peili & Jiao, Yuling & Lu, Xiliang & Kang, Lican, 2022. "A data-driven line search rule for support recovery in high-dimensional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    6. Osamu Komori & Shinto Eguchi & John B. Copas, 2015. "Generalized t-statistic for two-group classification," Biometrics, The International Biometric Society, vol. 71(2), pages 404-416, June.
    7. Luu, Tung Duy & Fadili, Jalal & Chesneau, Christophe, 2019. "PAC-Bayesian risk bounds for group-analysis sparse regression by exponential weighting," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 209-233.
    8. Yang, Yanlin & Hu, Xuemei & Jiang, Huifeng, 2022. "Group penalized logistic regressions predict up and down trends for stock prices," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
    9. Lee, In Gyu & Yoon, Sang Won & Won, Daehan, 2022. "A Mixed Integer Linear Programming Support Vector Machine for Cost-Effective Group Feature Selection: Branch-Cut-and-Price Approach," European Journal of Operational Research, Elsevier, vol. 299(3), pages 1055-1068.
    10. Ruidi Chen & Ioannis Ch. Paschalidis, 2022. "Robust Grouped Variable Selection Using Distributionally Robust Optimization," Journal of Optimization Theory and Applications, Springer, vol. 194(3), pages 1042-1071, September.
    11. Paul Blanc-Durand & Axel Van Der Gucht & Eric Guedj & Mukedaisi Abulizi & Mehdi Aoun-Sebaiti & Lionel Lerman & Antoine Verger & François-Jérôme Authier & Emmanuel Itti, 2017. "Cerebral 18F-FDG PET in macrophagic myofasciitis: An individual SVM-based approach," PLOS ONE, Public Library of Science, vol. 12(7), pages 1-11, July.
    12. Hamsa Bastani, 2021. "Predicting with Proxies: Transfer Learning in High Dimension," Management Science, INFORMS, vol. 67(5), pages 2964-2984, May.
    13. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    14. A. Karagrigoriou & C. Koukouvinos & K. Mylona, 2010. "On the advantages of the non-concave penalized likelihood model selection method with minimum prediction errors in large-scale medical studies," Journal of Applied Statistics, Taylor & Francis Journals, vol. 37(1), pages 13-24.
    15. Liu, Jianyu & Yu, Guan & Liu, Yufeng, 2019. "Graph-based sparse linear discriminant analysis for high-dimensional classification," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 250-269.
    16. Shota Yamanaka & Nobuo Yamashita, 2018. "Duality of nonconvex optimization with positively homogeneous functions," Computational Optimization and Applications, Springer, vol. 71(2), pages 435-456, November.
    17. Ciarleglio, Adam & Todd Ogden, R., 2016. "Wavelet-based scalar-on-function finite mixture regression models," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 86-96.
    18. Jenna Marie Reps & M Soledad Cepeda & Patrick B Ryan, 2020. "Wisdom of the CROUD: Development and validation of a patient-level prediction model for opioid use disorder using population-level claims data," PLOS ONE, Public Library of Science, vol. 15(2), pages 1-12, February.
    19. Lichun Wang & Yuan You & Heng Lian, 2015. "Convergence and sparsity of Lasso and group Lasso in high-dimensional generalized linear models," Statistical Papers, Springer, vol. 56(3), pages 819-828, August.
    20. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    21. Haowen Bao & Zongwu Cai & Yuying Sun & Shouyang Wang, 2023. "Penalized Model Averaging for High Dimensional Quantile Regressions," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202302, University of Kansas, Department of Economics.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.