
Effects analysis of reward functions on reinforcement learning for traffic signal control

Authors
  • Hyosun Lee
  • Yohee Han
  • Youngchan Kim
  • Yong Hoon Kim

Abstract

Increasing traffic demand in urban areas frequently causes congestion, which can be managed only through intelligent traffic signal control. Although many recent studies have addressed reinforcement learning for traffic signal control (RL-TSC), most have aimed to improve intersection-level performance in virtual simulation. Intersection-level performance indexes are averages weighted by traffic flow; therefore, if the balance among movements is not considered, green time may be overly concentrated on movements with heavy flow rates. Furthermore, because the ultimate purpose of traffic signal control research is to apply these controls at real-world intersections, real-world constraints must be considered. Hence, this study aims to design an RL-TSC model with real-world applicability in mind and to identify an appropriate reward function design. To facilitate real-world application, the model design takes into account the limitations of real-world detectors and the dual-ring traffic signal system. To design a reward that balances traffic movements, we define the average delay weighted by traffic volume per lane and the entropy of delay in the reward function. The model is trained at a prototype intersection to ensure scalability to multiple intersections. After prototype pre-training, the model is evaluated by applying it to a network with two intersections without additional training. The reward function that considers equality among traffic movements shows the best performance. Compared with the existing real-time adaptive signal control, the proposed model reduces the average delay by more than 7.4% and 15.0% at the two intersections, respectively.
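
The abstract names two reward ingredients, a volume-weighted average delay per lane and an entropy of delay, but does not give their formulas. The following is only a minimal sketch of how such a reward might be assembled, not the authors' formulation. Everything in it is an assumption made for illustration: the function names, the normalization of the entropy by log(n), the linear combination, and the hypothetical parameters `balance_weight` and `delay_scale`.

```python
import math
from typing import Sequence


def weighted_average_delay(delays: Sequence[float], volumes: Sequence[float]) -> float:
    """Average delay per lane, weighted by each lane's traffic volume."""
    total_volume = sum(volumes)
    if total_volume == 0:
        return 0.0
    return sum(d * v for d, v in zip(delays, volumes)) / total_volume


def delay_entropy(delays: Sequence[float]) -> float:
    """Normalized Shannon entropy of the delay shares across lanes.

    Returns 1.0 when delay is spread evenly and 0.0 when a single lane
    carries all of it. Normalizing by log(n) is an assumption of this sketch.
    """
    total = sum(delays)
    n = len(delays)
    if total == 0 or n < 2:
        return 1.0  # assumption: no delay at all counts as perfectly balanced
    h = 0.0
    for d in delays:
        if d > 0:
            p = d / total
            h -= p * math.log(p)
    return h / math.log(n)


def reward(
    delays: Sequence[float],
    volumes: Sequence[float],
    balance_weight: float = 1.0,  # hypothetical trade-off parameter
    delay_scale: float = 60.0,    # hypothetical scale (seconds) for the delay term
) -> float:
    """Penalize weighted delay and reward balance among lanes (illustrative only)."""
    delay_term = weighted_average_delay(delays, volumes) / delay_scale
    return -delay_term + balance_weight * delay_entropy(delays)
```

Under these assumptions, two signal states with the same volume-weighted delay are ranked by how evenly that delay is shared: `reward([20, 20], [100, 100])` comes out higher than `reward([40, 0], [100, 100])`, which is the kind of balance among movements the abstract describes.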

Suggested Citation

  • Hyosun Lee & Yohee Han & Youngchan Kim & Yong Hoon Kim, 2022. "Effects analysis of reward functions on reinforcement learning for traffic signal control," PLOS ONE, Public Library of Science, vol. 17(11), pages 1-18, November.
  • Handle: RePEc:plo:pone00:0277813
    DOI: 10.1371/journal.pone.0277813

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0277813
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0277813&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhu, Chenqiang & Zhong, Shiquan & Li, Guangyu & Ma, Shoufeng, 2017. "New control strategy for the lattice hydrodynamic model of traffic flow," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 468(C), pages 445-453.
    2. Huanping Li & Jian Wang & Guopeng Bai & Xiaowei Hu, 2021. "Exploring the Distribution of Traffic Flow for Shared Human and Autonomous Vehicle Roads," Energies, MDPI, vol. 14(12), pages 1-21, June.
    3. Georgia Perakis & Guillaume Roels, 2006. "An Analytical Model for Traffic Delays and the Dynamic User Equilibrium Problem," Operations Research, INFORMS, vol. 54(6), pages 1151-1171, December.
    4. Y. W. Xu & J. H. Wu & M. Florian & P. Marcotte & D. L. Zhu, 1999. "Advances in the Continuous Dynamic Network Loading Problem," Transportation Science, INFORMS, vol. 33(4), pages 341-353, November.
    5. Yannis Pavlis & Will Recker, 2009. "A Mathematical Logic Approach for the Transformation of the Linear Conditional Piecewise Functions of Dispersion-and-Store and Cell Transmission Traffic Flow Models into Linear Mixed-Integer Form," Transportation Science, INFORMS, vol. 43(1), pages 98-116, February.
    6. Martínez, Irene & Jin, Wen-Long, 2020. "Optimal location problem for variable speed limit application areas," Transportation Research Part B: Methodological, Elsevier, vol. 138(C), pages 221-246.
    7. Malachy Carey & Paul Humphreys & Marie McHugh & Ronan McIvor, 2018. "Consistency and Inconsistency Between the Fundamental Relationships on Which Different Traffic Assignment Models Are Based," Transportation Science, INFORMS, vol. 52(6), pages 1548-1569, December.
    8. Flötteröd, G. & Osorio, C., 2017. "Stochastic network link transmission model," Transportation Research Part B: Methodological, Elsevier, vol. 102(C), pages 180-209.
    9. Flötteröd, Gunnar & Rohde, Jannis, 2011. "Operational macroscopic modeling of complex urban road intersections," Transportation Research Part B: Methodological, Elsevier, vol. 45(6), pages 903-922, July.
    10. Cayford, Randall & Lin, Wei-Hua & Daganzo, Carlos F., 1997. "The Netcell Simulation Package: Technical Description," Institute of Transportation Studies, Research Reports, Working Papers, Proceedings qt4j27j106, Institute of Transportation Studies, UC Berkeley.
    11. Boel, René & Mihaylova, Lyudmila, 2006. "A compositional stochastic model for real time freeway traffic simulation," Transportation Research Part B: Methodological, Elsevier, vol. 40(4), pages 319-334, May.
    12. Nicholas Molyneaux & Riccardo Scarinci & Michel Bierlaire, 0. "Design and analysis of control strategies for pedestrian flows," Transportation, Springer, vol. 0, pages 1-41.
    13. Carey, Malachy & Humphreys, Paul & McHugh, Marie & McIvor, Ronan, 2014. "Extending travel-time based models for dynamic network loading and assignment, to achieve adherence to first-in-first-out and link capacities," Transportation Research Part B: Methodological, Elsevier, vol. 65(C), pages 90-104.
    14. Wong, S. C. & Wong, G. C. K., 2002. "An analytical shock-fitting algorithm for LWR kinematic wave model embedded with linear speed-density relationship," Transportation Research Part B: Methodological, Elsevier, vol. 36(8), pages 683-706, September.
    15. Lin, Wei-Hua & Lo, Hong K., 2003. "A theoretical probe of a German experiment on stationary moving traffic jams," Transportation Research Part B: Methodological, Elsevier, vol. 37(3), pages 251-261, March.
    16. Jabari, Saif Eddin & Liu, Henry X., 2013. "A stochastic model of traffic flow: Gaussian approximation and estimation," Transportation Research Part B: Methodological, Elsevier, vol. 47(C), pages 15-41.
    17. Chi-kwong Wong & Yiu-yin Lee, 2020. "Lane-Based Traffic Signal Simulation and Optimization for Preventing Overflow," Mathematics, MDPI, vol. 8(8), pages 1-28, August.
    18. Bar-Gera, Hillel & Carey, Malachy, 2022. "Constructing a cell transmission model solution adhering fully to first-in-first-out conditions," Transportation Research Part B: Methodological, Elsevier, vol. 161(C), pages 247-267.
    19. Garcia-Rodenas, Ricardo & Lopez-Garcia, Maria Luz & Nino-Arbelaez, Alejandro & Verastegui-Rayo, Doroteo, 2006. "A continuous whole-link travel time model with occupancy constraint," European Journal of Operational Research, Elsevier, vol. 175(3), pages 1455-1471, December.
    20. Nicholas Molyneaux & Riccardo Scarinci & Michel Bierlaire, 2021. "Design and analysis of control strategies for pedestrian flows," Transportation, Springer, vol. 48(4), pages 1767-1807, August.
