Proximal algorithms and temporal difference methods for solving fixed point problems

Author

Dimitri P. Bertsekas
(M.I.T.)

Abstract

In this paper we consider large fixed point problems and solution with proximal algorithms. We show that for linear problems there is a close connection between proximal iterations, which are prominent in numerical analysis and optimization, and multistep methods of the temporal difference type such as TD(λ), LSTD(λ), and LSPE(λ), which are central in simulation-based exact and approximate dynamic programming. One benefit of this connection is a new and simple way to accelerate the standard proximal algorithm by extrapolation towards a multistep iteration, which generically has a faster convergence rate. Another benefit is the potential for integration into the proximal algorithmic context of several new ideas that have emerged in the approximate dynamic programming context, including simulation-based implementations. Conversely, the analytical and algorithmic insights from proximal algorithms can be brought to bear on the analysis and the enhancement of temporal difference methods. We also generalize our linear case result to nonlinear problems that involve a contractive mapping, thus providing guaranteed and potentially substantial acceleration of the proximal and forward backward splitting algorithms at no extra cost. Moreover, under certain monotonicity assumptions, we extend the connection with temporal difference methods to nonlinear problems through a linearization approach.
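The connection described in the abstract can be illustrated numerically on a small linear problem. The sketch below is not taken from the paper. It assumes the fixed point problem x = Ax + b with a contractive A, and it assumes the proximal mapping P(x) is defined as the solution y of y - (Ay + b) = (x - y)/c for a parameter c > 0. Under these assumptions one can check algebraically that the extrapolated point P(x) + (P(x) - x)/c equals the multistep mapping (1 - λ) Σ λ^l T^(l+1)(x) with λ = c/(c + 1). The matrix, the value of c, and all function names are illustrative choices.

```python
import numpy as np

def proximal_step(A, b, x, c):
    """Return P(x): the solution y of  y - (A y + b) = (x - y) / c."""
    n = len(b)
    return np.linalg.solve((c + 1.0) / c * np.eye(n) - A, b + x / c)

def extrapolated_step(A, b, x, c):
    """Extrapolate the proximal step: P(x) + (P(x) - x) / c."""
    p = proximal_step(A, b, x, c)
    return p + (p - x) / c

def multistep_map(A, b, x, lam, terms=200):
    """Truncated multistep mapping (1 - lam) * sum_l lam^l T^(l+1)(x), T(x) = A x + b."""
    total = np.zeros_like(x)
    y = x.copy()
    for l in range(terms):
        y = A @ y + b                      # y is now T^(l+1)(x)
        total += (1.0 - lam) * lam**l * y
    return total

# Illustrative problem data (assumed, not from the paper).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = 0.9 * M / np.linalg.norm(M, 2)         # spectral norm 0.9, so T is a contraction
b = rng.standard_normal(5)
x_star = np.linalg.solve(np.eye(5) - A, b)  # exact fixed point

c = 4.0
lam = c / (c + 1.0)
x0 = rng.standard_normal(5)

# Run both iterations from the same starting point.
x_prox, x_extra = x0.copy(), x0.copy()
for _ in range(10):
    x_prox = proximal_step(A, b, x_prox, c)
    x_extra = extrapolated_step(A, b, x_extra, c)

err_prox = np.linalg.norm(x_prox - x_star)
err_extra = np.linalg.norm(x_extra - x_star)
print("proximal error:    ", err_prox)
print("extrapolated error:", err_extra)
```

Each extrapolated step reuses the linear solve already done for the proximal step, which is one way to read the abstract's "at no extra cost". In this example the extrapolated iterate equals T applied to the proximal iterate, so its error is smaller by at least the contraction factor of A per step.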

Suggested Citation

Dimitri P. Bertsekas, 2018. "Proximal algorithms and temporal difference methods for solving fixed point problems," Computational Optimization and Applications, Springer, vol. 70(3), pages 709-736, July.

Handle: RePEc:spr:coopap:v:70:y:2018:i:3:d:10.1007_s10589-018-9990-5
DOI: 10.1007/s10589-018-9990-5

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Dimitri P. Bertsekas, 2019. "Robust shortest path planning and semicontractive dynamic programming," Naval Research Logistics (NRL), John Wiley & Sons, vol. 66(1), pages 15-37, February.
Wellens, Arnoud P. & Udenio, Maxi & Boute, Robert N., 2022. "Transfer learning for hierarchical forecasting: Reducing computational efforts of M5 winning methods," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1482-1491.
Sang-yeon Lee & In-bok Lee & Uk-hyeon Yeo & Jun-gyu Kim & Rack-woo Kim, 2022. "Machine Learning Approach to Predict Air Temperature and Relative Humidity inside Mechanically and Naturally Ventilated Duck Houses: Application of Recurrent Neural Network," Agriculture, MDPI, vol. 12(3), pages 1-19, February.
Huizhen Yu & Dimitri P. Bertsekas, 2015. "A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies," Mathematics of Operations Research, INFORMS, vol. 40(4), pages 926-968, October.
Xiaoyue Li & John M. Mulvey, 2023. "Optimal Portfolio Execution in a Regime-switching Market with Non-linear Impact Costs: Combining Dynamic Program and Neural Network," Papers 2306.08809, arXiv.org.
Nathan Companez & Aldeida Aleti, 2016. "Can Monte-Carlo Tree Search learn to sacrifice?," Journal of Heuristics, Springer, vol. 22(6), pages 783-813, December.
Benjamin Heinbach & Peter Burggräf & Johannes Wagner, 2024. "gym-flp: A Python Package for Training Reinforcement Learning Algorithms on Facility Layout Problems," SN Operations Research Forum, Springer, vol. 5(1), pages 1-26, March.
Liang, Tao & Zhao, Qing & Lv, Qingzhao & Sun, Hexu, 2021. "A novel wind speed prediction strategy based on Bi-LSTM, MOOFADA and transfer learning for centralized control centers," Energy, Elsevier, vol. 230(C).
Ahmed, R. & Sreeram, V. & Mishra, Y. & Arif, M.D., 2020. "A review and evaluation of the state-of-the-art in PV solar power forecasting: Techniques and optimization," Renewable and Sustainable Energy Reviews, Elsevier, vol. 124(C).
Mojtaba Qolipour & Ali Mostafaeipour & Mohammad Saidi-Mehrabad & Hamid R Arabnia, 2019. "Prediction of wind speed using a new Grey-extreme learning machine hybrid algorithm: A case study," Energy & Environment, vol. 30(1), pages 44-62, February.
Zhewei Zhang & Youngjin Yoo & Kalle Lyytinen & Aron Lindberg, 2021. "The Unknowability of Autonomous Tools and the Liminal Experience of Their Use," Information Systems Research, INFORMS, vol. 32(4), pages 1192-1213, December.
Yuhong Wang & Lei Chen & Hong Zhou & Xu Zhou & Zongsheng Zheng & Qi Zeng & Li Jiang & Liang Lu, 2021. "Flexible Transmission Network Expansion Planning Based on DQN Algorithm," Energies, MDPI, vol. 14(7), pages 1-21, April.
Gokhale, Gargya & Claessens, Bert & Develder, Chris, 2022. "Physics informed neural networks for control oriented thermal modeling of buildings," Applied Energy, Elsevier, vol. 314(C).
Li Xia, 2020. "Risk‐Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2808-2827, December.
Dongxiao Niu & Yi Liang & Wei-Chiang Hong, 2017. "Wind Speed Forecasting Based on EMD and GRNN Optimized by FOA," Energies, MDPI, vol. 10(12), pages 1-18, December.
Lu, Yakai & Tian, Zhe & Zhou, Ruoyu & Liu, Wenjing, 2021. "A general transfer learning-based framework for thermal load prediction in regional energy system," Energy, Elsevier, vol. 217(C).
Neha Soni & Enakshi Khular Sharma & Narotam Singh & Amita Kapoor, 2019. "Impact of Artificial Intelligence on Businesses: from Research, Innovation, Market Deployment to Future Shifts in Business Models," Papers 1905.02092, arXiv.org.
Shengli Liao & Xudong Tian & Benxi Liu & Tian Liu & Huaying Su & Binbin Zhou, 2022. "Short-Term Wind Power Prediction Based on LightGBM and Meteorological Reanalysis," Energies, MDPI, vol. 15(17), pages 1-21, August.
Yin, Linfei & He, Xiaoyu, 2023. "Artificial emotional deep Q learning for real-time smart voltage control of cyber-physical social power systems," Energy, Elsevier, vol. 273(C).
Wang, Kejun & Qi, Xiaoxia & Liu, Hongda & Song, Jiakang, 2018. "Deep belief network based k-means cluster approach for short-term wind power forecasting," Energy, Elsevier, vol. 165(PA), pages 840-852.
