Printed from https://ideas.repec.org/p/arx/papers/2502.13438.html

Balancing Flexibility and Interpretability: A Conditional Linear Model Estimation via Random Forest

Author

Listed:
  • Ricardo Masini
  • Marcelo Medeiros

Abstract

Traditional parametric econometric models often rely on rigid functional forms, while nonparametric techniques, despite their flexibility, frequently lack interpretability. This paper proposes a parsimonious alternative by modeling the outcome $Y$ as a linear function of a vector of variables of interest $\boldsymbol{X}$, conditional on additional covariates $\boldsymbol{Z}$. Specifically, the conditional expectation is expressed as $\mathbb{E}[Y|\boldsymbol{X},\boldsymbol{Z}]=\boldsymbol{X}^{T}\boldsymbol{\beta}(\boldsymbol{Z})$, where $\boldsymbol{\beta}(\cdot)$ is an unknown Lipschitz-continuous function. We introduce an adaptation of the Random Forest (RF) algorithm to estimate this model, balancing the flexibility of machine learning methods with the interpretability of traditional linear models. This approach addresses a key challenge in applied econometrics by accommodating heterogeneity in the relationship between covariates and outcomes. Furthermore, the heterogeneous partial effects of $\boldsymbol{X}$ on $Y$ are represented by $\boldsymbol{\beta}(\cdot)$ and can be directly estimated using our proposed method. Our framework effectively unifies established parametric and nonparametric models, including varying-coefficient, switching regression, and additive models. We provide theoretical guarantees, such as pointwise and $L^p$-norm rates of convergence for the estimator, and establish a pointwise central limit theorem through subsampling, aiding inference on the function $\boldsymbol\beta(\cdot)$. We present Monte Carlo simulation results to assess the finite-sample performance of the method.
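To make the model $\mathbb{E}[Y|\boldsymbol{X},\boldsymbol{Z}]=\boldsymbol{X}^{T}\boldsymbol{\beta}(\boldsymbol{Z})$ concrete, the following is a minimal sketch of a forest-style estimator of $\boldsymbol{\beta}(\cdot)$, not the authors' algorithm. It assumes a scalar $Z$ on $[0,1]$, $\boldsymbol{X}=(1,x)$, purely random split points on $Z$, subsampling without replacement, and ordinary least squares inside each leaf. The coefficient functions `beta0` and `beta1`, the helper names, and all tuning values are hypothetical and chosen only for illustration; the paper's splitting rule and inference procedure are not reproduced here.

```python
import math
import random

def beta0(z):
    """Hypothetical true intercept function (assumption for this sketch)."""
    return z

def beta1(z):
    """Hypothetical true slope function, Lipschitz in z (assumption)."""
    return 1.0 + math.sin(2.0 * math.pi * z)

def ols(xs, ys):
    """Closed-form OLS of y on (1, x); returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope

def grow_tree(data, min_leaf, rng):
    """Split on z at a random observed value; fit OLS of y on (1, x) per leaf.

    data is a list of (z, x, y) tuples.
    """
    leaf = ("leaf", ols([d[1] for d in data], [d[2] for d in data]))
    if len(data) < 2 * min_leaf:
        return leaf
    zs = sorted(d[0] for d in data)
    # Random cut that leaves at least min_leaf observations on each side.
    cut = zs[rng.randrange(min_leaf, len(zs) - min_leaf + 1)]
    left = [d for d in data if d[0] < cut]
    right = [d for d in data if d[0] >= cut]
    if not left or not right:  # guard against ties in z
        return leaf
    return ("split", cut, grow_tree(left, min_leaf, rng),
            grow_tree(right, min_leaf, rng))

def tree_beta(tree, z):
    """Return the leaf coefficients (intercept, slope) for the leaf containing z."""
    while tree[0] == "split":
        _, cut, left, right = tree
        tree = left if z < cut else right
    return tree[1]

def fit_forest(data, n_trees=100, subsample=0.5, min_leaf=20, seed=0):
    """Grow each tree on a random subsample drawn without replacement."""
    rng = random.Random(seed)
    m = int(subsample * len(data))
    return [grow_tree(rng.sample(data, m), min_leaf, rng)
            for _ in range(n_trees)]

def forest_beta(forest, z):
    """Average the leaf coefficients at z across all trees."""
    betas = [tree_beta(t, z) for t in forest]
    b0 = sum(b[0] for b in betas) / len(betas)
    b1 = sum(b[1] for b in betas) / len(betas)
    return b0, b1

# Simulate Y = beta0(Z) + beta1(Z) * x + noise and estimate beta at z = 0.25.
sim = random.Random(42)
data = []
for _ in range(2000):
    z, x = sim.random(), sim.gauss(0.0, 1.0)
    y = beta0(z) + beta1(z) * x + sim.gauss(0.0, 0.5)
    data.append((z, x, y))

forest = fit_forest(data)
b0_hat, b1_hat = forest_beta(forest, 0.25)
print(f"beta0(0.25): true {beta0(0.25):.2f}, estimate {b0_hat:.2f}")
print(f"beta1(0.25): true {beta1(0.25):.2f}, estimate {b1_hat:.2f}")
```

In this sketch the estimated slope at a given $z$ plays the role of the heterogeneous partial effect of $x$ on $Y$: each tree contributes the within-leaf linear fit for the cell containing $z$, and the forest averages those fits. Random split points are used only to keep the example short; a data-driven splitting criterion would normally replace them.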

Suggested Citation

  • Ricardo Masini & Marcelo Medeiros, 2025. "Balancing Flexibility and Interpretability: A Conditional Linear Model Estimation via Random Forest," Papers 2502.13438, arXiv.org.
  • Handle: RePEc:arx:papers:2502.13438

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2502.13438
    File Function: Latest version
    Download Restriction: no


