IDEAS home Printed from https://ideas.repec.org/p/upf/upfgen/1915.html
   My bibliography  Save this paper

Interpretable Kernels

Author

Listed:
  • Patrick J.F. Groenen
  • Michael Greenacre

Abstract

The use of kernels for nonlinear prediction is widespread in machine learning. They have been popularized in support vector machines and used in kernel ridge regression, amongst others. Kernel methods share three aspects. First, instead of the original matrix of predictor variables or features, each observation is mapped into an enlarged feature space. Second, a ridge penalty term is used to shrink the coefficients on the features in the enlarged feature space. Third, the solution is not obtained in this enlarged feature space, but through solving a dual problem in the observation space. A major drawback in the present use of kernels is that the interpretation in terms of the original features is lost. In this paper, we argue that in the case of a wide matrix of features, where there are more features than observations, the kernel solution can be re-expressed in terms of a linear combination of the original matrix of features and a ridge penalty that involves a special metric. Consequently, the exact same predicted values can be obtained as a weighted linear combination of the features in the usual manner and thus can be interpreted. In the case where the number of features is less than the number of observations, we discuss a least-squares approximation of the kernel matrix that still allows the interpretation in terms of a linear combination. It is shown that these results hold for any function of a linear combination that minimizes the coefficients and has a ridge penalty on these coefficients, such as in kernel logistic regression and kernel Poisson regression. This work makes a contribution to interpretable artificial intelligence.

Suggested Citation

  • Patrick J.F. Groenen & Michael Greenacre, 2025. "Interpretable Kernels," Economics Working Papers 1915, Department of Economics and Business, Universitat Pompeu Fabra.
  • Handle: RePEc:upf:upfgen:1915
    as

    Download full text from publisher

    File URL: https://econ-papers.upf.edu/papers/1915.pdf
    File Function: Whole Paper
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    2. Lorin Crawford & Kris C. Wood & Xiang Zhou & Sayan Mukherjee, 2018. "Bayesian Approximate Kernel Regression With Variable Selection," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(524), pages 1710-1721, October.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Winn-Nuñez, Emily T. & Griffin, Maryclare & Crawford, Lorin, 2024. "A simple approach for local and global variable importance in nonlinear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    2. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    3. Rui Wang & Naihua Xiu & Kim-Chuan Toh, 2021. "Subspace quadratic regularization method for group sparse multinomial logistic regression," Computational Optimization and Applications, Springer, vol. 79(3), pages 531-559, July.
    4. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    5. Chen, Le-Yu & Lee, Sokbae, 2018. "Best subset binary prediction," Journal of Econometrics, Elsevier, vol. 206(1), pages 39-56.
    6. Chuliá, Helena & Garrón, Ignacio & Uribe, Jorge M., 2024. "Daily growth at risk: Financial or real drivers? The answer is not always the same," International Journal of Forecasting, Elsevier, vol. 40(2), pages 762-776.
    7. Sung Jae Jun & Sokbae Lee, 2024. "Causal Inference Under Outcome-Based Sampling with Monotonicity Assumptions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 42(3), pages 998-1009, July.
    8. Xiangwei Li & Thomas Delerue & Ben Schöttker & Bernd Holleczek & Eva Grill & Annette Peters & Melanie Waldenberger & Barbara Thorand & Hermann Brenner, 2022. "Derivation and validation of an epigenetic frailty risk score in population-based cohorts of older adults," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    9. Christopher J Greenwood & George J Youssef & Primrose Letcher & Jacqui A Macdonald & Lauryn J Hagg & Ann Sanson & Jenn Mcintosh & Delyse M Hutchinson & John W Toumbourou & Matthew Fuller-Tyszkiewicz &, 2020. "A comparison of penalised regression methods for informing the selection of predictive markers," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-14, November.
    10. Mai Li & Ying Lin & Qianmei Feng & Wenjiang Fu & Shenglin Peng & Siwei Chen & Mahesh Paidpilli & Chirag Goel & Eduard Galstyan & Venkat Selvamanickam, 2025. "Quantile regression-enriched event modeling framework for dropout analysis in high-temperature superconductor manufacturing," Journal of Intelligent Manufacturing, Springer, vol. 36(5), pages 3009-3030, June.
    11. Heng Chen & Daniel F. Heitjan, 2022. "Analysis of local sensitivity to nonignorability with missing outcomes and predictors," Biometrics, The International Biometric Society, vol. 78(4), pages 1342-1352, December.
    12. S Ariane Christie & Amanda S Conroy & Rachael A Callcut & Alan E Hubbard & Mitchell J Cohen, 2019. "Dynamic multi-outcome prediction after injury: Applying adaptive machine learning for precision medicine in trauma," PLOS ONE, Public Library of Science, vol. 14(4), pages 1-13, April.
    13. Zhu Wang, 2022. "MM for penalized estimation," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 54-75, March.
    14. Ida Kubiszewski & Kenneth Mulder & Diane Jarvis & Robert Costanza, 2022. "Toward better measurement of sustainable development and wellbeing: A small number of SDG indicators reliably predict life satisfaction," Sustainable Development, John Wiley & Sons, Ltd., vol. 30(1), pages 139-148, February.
    15. Gustavo A. Alonso-Silverio & Víctor Francisco-García & Iris P. Guzmán-Guzmán & Elías Ventura-Molina & Antonio Alarcón-Paredes, 2021. "Toward Non-Invasive Estimation of Blood Glucose Concentration: A Comparative Performance," Mathematics, MDPI, vol. 9(20), pages 1-13, October.
    16. Christopher Kath & Florian Ziel, 2018. "The value of forecasts: Quantifying the economic gains of accurate quarter-hourly electricity price forecasts," Papers 1811.08604, arXiv.org.
    17. Naimoli, Antonio, 2022. "Modelling the persistence of Covid-19 positivity rate in Italy," Socio-Economic Planning Sciences, Elsevier, vol. 82(PA).
    18. Gurgul Henryk & Machno Artur, 2017. "Trade Pattern on Warsaw Stock Exchange and Prediction of Number of Trades," Statistics in Transition New Series, Statistics Poland, vol. 18(1), pages 91-114, March.
    19. Ahmed Ismaïl & Hartikainen Anna-Liisa & Järvelin Marjo-Riitta & Richardson Sylvia, 2011. "False Discovery Rate Estimation for Stability Selection: Application to Genome-Wide Association Studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-20, November.
    20. Vitaly Meursault & Daniel Moulton & Larry Santucci & Nathan Schor, 2025. "One threshold doesn't fit all: Tailoring machine learning predictions of consumer default for lower‐income areas," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 44(3), pages 792-815, June.

    More about this item

    JEL classification:

    • C19 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Other
    • C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:upf:upfgen:1915. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: the person in charge The email address of this maintainer does not seem to be valid anymore. Please ask the person in charge to update the entry or send us the correct address (email available below). General contact details of provider: http://www.upf.edu/en/web/econ/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.