Aligning Language Models with Observational Data: Opportunities and Risks from a Causal Perspective

Author

Listed:

Erfan Loghmani

Abstract

Large language models are being widely used across industries to generate content that contributes directly to key performance metrics, such as conversion rates. Pretrained models, however, often fall short when it comes to aligning with human preferences or optimizing for business objectives. As a result, fine-tuning with good-quality labeled data is essential to guide models to generate content that achieves better results. Controlled experiments, like A/B tests, can provide such data, but they are often expensive and come with significant engineering and logistical challenges. Meanwhile, companies have access to a vast amount of historical (observational) data that remains underutilized. In this work, we study the challenges and opportunities of fine-tuning LLMs using observational data. We show that while observational outcomes can provide valuable supervision, directly fine-tuning models on such data can lead them to learn spurious correlations. We present empirical evidence of this issue using various real-world datasets and propose DeconfoundLM, a method that explicitly removes the effect of known confounders from reward signals. Using simulation experiments, we demonstrate that DeconfoundLM improves the recovery of causal relationships and mitigates failure modes found in fine-tuning methods that ignore or naively incorporate confounding variables. Our findings highlight that while observational data presents risks, with the right causal corrections, it can be a powerful source of signal for LLM alignment. Please refer to the project page for code and related resources.
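The confounding problem described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's DeconfoundLM implementation; it is a minimal linear example under stated assumptions: a single known confounder `z` (for instance, seasonality) shifts both a content feature `x` and the observed outcome `y`, and the estimated confounder effect is subtracted from the outcome before it is used as a reward. All names (`simulate`, `ols`, `deconfounded_reward`) and all parameter values are hypothetical.

```python
import numpy as np


def simulate(n=20000, beta=0.5, gamma=2.0, seed=0):
    """Generate observational data with one known confounder (assumed setup)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                          # known confounder
    x = 0.8 * z + rng.normal(size=n)                # content feature, correlated with z
    y = beta * x + gamma * z + rng.normal(size=n)   # observed outcome
    return x, z, y


def ols(features, target):
    """Least-squares coefficients; the intercept is the last entry."""
    design = np.column_stack([features, np.ones(len(target))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def deconfounded_reward(x, z, y):
    """Subtract the estimated effect of the confounder from the outcome."""
    coef = ols(np.column_stack([x, z]), y)  # joint fit: y ~ x + z + intercept
    gamma_hat = coef[1]                     # estimated confounder effect
    return y - gamma_hat * z


x, z, y = simulate()

# Naive supervision: the raw outcome is treated as the reward for x.
naive_slope = ols(x, y)[0]

# Corrected supervision: the confounder's estimated contribution is removed first.
clean_slope = ols(x, deconfounded_reward(x, z, y))[0]

print(f"true effect of x:      0.50")
print(f"naive estimate:        {naive_slope:.2f}")
print(f"deconfounded estimate: {clean_slope:.2f}")
```

In this toy setting the naive slope absorbs part of the confounder's effect and overstates how much `x` drives the outcome, while the slope fitted on the adjusted reward stays close to the simulated effect. A model fine-tuned on the raw outcome would be rewarded for the spurious association in the same way.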

Suggested Citation

Erfan Loghmani, 2025. "Aligning Language Models with Observational Data: Opportunities and Risks from a Causal Perspective," Papers 2506.00152, arXiv.org.

Handle: RePEc:arx:papers:2506.00152



More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2025-06-16 (Computational Economics)
