Printed from https://ideas.repec.org/p/arx/papers/2504.09663.html

Ordinary Least Squares as an Attention Mechanism

Author

Listed:
  • Philippe Goulet Coulombe

Abstract

I show that ordinary least squares (OLS) predictions can be rewritten as the output of a restricted attention module, akin to those forming the backbone of large language models. This connection offers an alternative perspective on attention beyond the conventional information retrieval framework, making it more accessible to researchers and analysts with a background in traditional statistics. It falls into place when OLS is framed as a similarity-based method in a transformed regressor space, distinct from the standard view based on partial correlations. In fact, the OLS solution can be recast as the outcome of an alternative problem: minimizing squared prediction errors by optimizing the embedding space in which training and test vectors are compared via inner products. Rather than estimating coefficients directly, we equivalently learn optimal encoding and decoding operations for predictors. From this vantage point, OLS maps naturally onto the query-key-value structure of attention mechanisms. Building on this foundation, I discuss key elements of Transformer-style attention and draw connections to classic ideas from time series econometrics.
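The equivalence described in the abstract can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own code. It relies on the standard identity that OLS predictions equal X_test (X'X)^{-1} X_train' y, and factors (X'X)^{-1} as W W' so that both test rows (queries) and training rows (keys) pass through the same encoder W, with the training targets acting as values and no softmax applied. The eigendecomposition used to build W is one convenient choice; any W with W W' = (X'X)^{-1} would give the same predictions. The function name `ols_as_attention`, the variable names, and the simulated data are all hypothetical.

```python
import numpy as np


def ols_as_attention(X_train, y_train, X_test):
    """OLS predictions written as linear (softmax-free) attention.

    Illustrative sketch: names and the choice of factorization are assumptions.
    """
    # Eigendecomposition of the Gram matrix: X'X = U diag(lam) U'.
    lam, U = np.linalg.eigh(X_train.T @ X_train)
    # Shared encoder W = U diag(lam^{-1/2}), so that W W' = (X'X)^{-1}.
    W = U / np.sqrt(lam)
    keys = X_train @ W     # training rows in the transformed space
    queries = X_test @ W   # test rows in the same space
    # Attention weights are plain inner products between queries and keys.
    weights = queries @ keys.T
    # Each prediction is a weighted sum of the training targets (the values).
    return weights @ y_train, weights


# Simulated data (assumption: full-rank regressors, small Gaussian noise).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))
y_train = X_train @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
X_test = rng.normal(size=(5, 3))

y_hat_attention, weights = ols_as_attention(X_train, y_train, X_test)

# Reference: estimate coefficients directly, then predict.
beta_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
y_hat_ols = X_test @ beta_hat

# Both routes should agree up to floating-point error.
print(np.allclose(y_hat_attention, y_hat_ols))
```

Each row of `weights` shows how much every training observation contributes to one test prediction, which is the similarity-based reading of OLS that the abstract contrasts with the partial-correlation view. Unlike softmax attention, these weights can be negative and need not sum to one.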

Suggested Citation

  • Philippe Goulet Coulombe, 2025. "Ordinary Least Squares as an Attention Mechanism," Papers 2504.09663, arXiv.org.
  • Handle: RePEc:arx:papers:2504.09663

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2504.09663
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Borup & Philippe Goulet Coulombe & Erik Christian Montes Schütte & David E. Rapach & Sander Schwenk-Nebbe, 2022. "The Anatomy of Out-of-Sample Forecasting Accuracy," FRB Atlanta Working Paper 2022-16, Federal Reserve Bank of Atlanta.
    2. Philippe Goulet Coulombe & Maximilian Gobel, 2023. "Maximally Machine-Learnable Portfolios," Working Papers 23-01, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Apr 2023.
    3. Huang, Dashan & Li, Jiangyuan & Wang, Liyao, 2021. "Are disagreements agreeable? Evidence from information aggregation," Journal of Financial Economics, Elsevier, vol. 141(1), pages 83-101.
    4. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    5. Philippe Goulet Coulombe & Karin Klieber, 2025. "An Adaptive Moving Average for Macroeconomic Monitoring," Papers 2501.13222, arXiv.org.
    6. Michalski, Lachlan & Low, Rand Kwong Yew, 2024. "Determinants of corporate credit ratings: Does ESG matter?," International Review of Financial Analysis, Elsevier, vol. 94(C).
    7. Shuangshuang Fan & Yichao Li & William Mbanyele & Xiufeng Lai, 2025. "Determinants and Pathways for Inclusive Growth in China: Investigation Based on Artificial Intelligence (AI) Algorithm," Computational Economics, Springer;Society for Computational Economics, vol. 65(3), pages 1231-1264, March.
    8. Andrii Babii & Ryan T. Ball & Eric Ghysels & Jonas Striaukas, 2024. "Panel data nowcasting: The case of price–earnings ratios," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(2), pages 292-307, March.
    9. Bakalli, Gaetan & Guerrier, Stéphane & Scaillet, Olivier, 2023. "A penalized two-pass regression to predict stock returns with time-varying risk premia," Journal of Econometrics, Elsevier, vol. 237(2).
    10. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.
    11. Wang, Yudong & Hao, Xianfeng, 2022. "Forecasting the real prices of crude oil: A robust weighted least squares approach," Energy Economics, Elsevier, vol. 116(C).
    12. Tobias Götze & Marc Gürtler & Eileen Witowski, 2020. "Improving CAT bond pricing models via machine learning," Journal of Asset Management, Palgrave Macmillan, vol. 21(5), pages 428-446, September.
    13. Wen, Danyan & Liu, Li & Wang, Yudong & Zhang, Yaojie, 2022. "Forecasting crude oil market returns: Enhanced moving average technical indicators," Resources Policy, Elsevier, vol. 76(C).
    14. Malakhov, Alexey & Riley, Timothy B. & Yan, Qing, 2024. "Do hedge funds bet against beta?," International Review of Economics & Finance, Elsevier, vol. 93(PA), pages 1507-1525.
    15. Zhu, Haibin & Bai, Lu & He, Lidan & Liu, Zhi, 2023. "Forecasting realized volatility with machine learning: Panel data perspective," Journal of Empirical Finance, Elsevier, vol. 73(C), pages 251-271.
    16. Eghbal Rahimikia & Stefan Zohren & Ser-Huang Poon, 2021. "Realised Volatility Forecasting: Machine Learning via Financial Word Embedding," Papers 2108.00480, arXiv.org, revised Nov 2024.
    17. Daníelsson, Jón & Macrae, Robert & Uthemann, Andreas, 2022. "Artificial intelligence and systemic risk," Journal of Banking & Finance, Elsevier, vol. 140(C).
    18. Cong Wang, 2024. "Stock return prediction with multiple measures using neural network models," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 10(1), pages 1-34, December.
    19. Liu, Yunting & Zhu, Yandi, 2025. "Good idiosyncratic volatility, bad idiosyncratic volatility, and the cross-section of stock returns," Journal of Banking & Finance, Elsevier, vol. 170(C).
    20. Guo, Li & Sang, Bo & Tu, Jun & Wang, Yu, 2024. "Cross-cryptocurrency return predictability," Journal of Economic Dynamics and Control, Elsevier, vol. 163(C).
