
Effects analysis of reward functions on reinforcement learning for traffic signal control

Authors

  • Hyosun Lee
  • Yohee Han
  • Youngchan Kim
  • Yong Hoon Kim

Abstract

Increasing traffic demand in urban areas frequently causes congestion, which can be managed only through intelligent traffic signal control. Although many recent studies have addressed reinforcement learning for traffic signal control (RL-TSC), most have aimed to improve intersection-level performance in virtual simulation. Intersection-level performance indexes are averages weighted by traffic flow; therefore, if the balance among movements is not considered, green time may be overly concentrated on movements with heavy flow rates. Furthermore, because the ultimate purpose of traffic signal control research is to apply these controls at real-world intersections, real-world constraints must be considered. Hence, this study aims to design an RL-TSC model with real-world applicability in mind and to identify an appropriate design for the reward function. The model design accounts for the limitations of real-world detectors and for the dual-ring traffic signal system. To balance traffic movements, we define the reward function in terms of the average delay weighted by traffic volume per lane and the entropy of delay. The model is trained at a prototype intersection to ensure scalability to multiple intersections. After this pre-training, the model is evaluated on a network with two intersections without additional training. The reward function that considers equality among traffic movements shows the best performance. Compared with the existing real-time adaptive signal control, the proposed model reduces the average delay by more than 7.4% and 15.0% at the two intersections, respectively.
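
The abstract names two ingredients of the reward, a delay average weighted by traffic volume per lane and an entropy of delay, but does not give their formulas. The following is a minimal sketch of how such a reward could be assembled, not the authors' formulation. The function names, the normalized Shannon entropy over per-movement delay shares, the additive combination, and the `balance_weight` parameter are all assumptions made for illustration.

```python
import math


def weighted_average_delay(delays, volumes, lanes):
    """Average delay over movements, weighted by volume per lane.

    delays  -- mean delay of each movement (seconds per vehicle)
    volumes -- traffic volume of each movement (vehicles per hour)
    lanes   -- number of lanes serving each movement
    The weighting scheme is an assumption; the abstract gives no formula.
    """
    weights = [v / n for v, n in zip(volumes, lanes)]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * d for w, d in zip(weights, delays)) / total


def delay_entropy(delays):
    """Normalized Shannon entropy of delay shares, in [0, 1].

    A value of 1 means delay is spread evenly across movements;
    lower values mean delay is concentrated on a few movements.
    """
    total = sum(delays)
    k = len(delays)
    if total == 0 or k < 2:
        return 1.0  # no delay, or a single movement: treat as balanced
    shares = [d / total for d in delays]
    h = -sum(p * math.log(p) for p in shares if p > 0)
    return h / math.log(k)


def reward(delays, volumes, lanes, balance_weight=10.0):
    """Hypothetical reward: penalize delay, reward balance.

    balance_weight is an illustrative tuning parameter, not a value
    taken from the paper.
    """
    return (-weighted_average_delay(delays, volumes, lanes)
            + balance_weight * delay_entropy(delays))
```

Under these assumptions, two signal plans with the same weighted average delay are ranked by how evenly that delay is shared: with equal volumes and lane counts, `reward([30, 30, 30, 30], ...)` exceeds `reward([90, 10, 10, 10], ...)`, which is the behavior the abstract attributes to a reward that considers equality among movements.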

Suggested Citation

  • Hyosun Lee & Yohee Han & Youngchan Kim & Yong Hoon Kim, 2022. "Effects analysis of reward functions on reinforcement learning for traffic signal control," PLOS ONE, Public Library of Science, vol. 17(11), pages 1-18, November.
  • Handle: RePEc:plo:pone00:0277813
    DOI: 10.1371/journal.pone.0277813

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0277813
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0277813&type=printable
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Huanping Li & Jian Wang & Guopeng Bai & Xiaowei Hu, 2021. "Exploring the Distribution of Traffic Flow for Shared Human and Autonomous Vehicle Roads," Energies, MDPI, vol. 14(12), pages 1-21, June.
    2. Georgia Perakis & Guillaume Roels, 2006. "An Analytical Model for Traffic Delays and the Dynamic User Equilibrium Problem," Operations Research, INFORMS, vol. 54(6), pages 1151-1171, December.
    3. Malachy Carey & Paul Humphreys & Marie McHugh & Ronan McIvor, 2018. "Consistency and Inconsistency Between the Fundamental Relationships on Which Different Traffic Assignment Models Are Based," Transportation Science, INFORMS, vol. 52(6), pages 1548-1569, December.
    4. Flötteröd, G. & Osorio, C., 2017. "Stochastic network link transmission model," Transportation Research Part B: Methodological, Elsevier, vol. 102(C), pages 180-209.
    5. Flötteröd, Gunnar & Rohde, Jannis, 2011. "Operational macroscopic modeling of complex urban road intersections," Transportation Research Part B: Methodological, Elsevier, vol. 45(6), pages 903-922, July.
    6. Boel, René & Mihaylova, Lyudmila, 2006. "A compositional stochastic model for real time freeway traffic simulation," Transportation Research Part B: Methodological, Elsevier, vol. 40(4), pages 319-334, May.
    7. Nicholas Molyneaux & Riccardo Scarinci & Michel Bierlaire, 0. "Design and analysis of control strategies for pedestrian flows," Transportation, Springer, vol. 0, pages 1-41.
    8. Carey, Malachy & Humphreys, Paul & McHugh, Marie & McIvor, Ronan, 2014. "Extending travel-time based models for dynamic network loading and assignment, to achieve adherence to first-in-first-out and link capacities," Transportation Research Part B: Methodological, Elsevier, vol. 65(C), pages 90-104.
    9. Jabari, Saif Eddin & Liu, Henry X., 2013. "A stochastic model of traffic flow: Gaussian approximation and estimation," Transportation Research Part B: Methodological, Elsevier, vol. 47(C), pages 15-41.
    10. Bar-Gera, Hillel & Carey, Malachy, 2022. "Constructing a cell transmission model solution adhering fully to first-in-first-out conditions," Transportation Research Part B: Methodological, Elsevier, vol. 161(C), pages 247-267.
    11. Li, Bing & Wang, Xudong & Feng, Yue & Yin, Juyuan & Gao, Jiandong & Li, Jielong & Bai, Wenqiang, 2025. "Heterogeneous bicycle platoon evolution state estimation model," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 661(C).
    12. Nicholas Molyneaux & Riccardo Scarinci & Michel Bierlaire, 2021. "Design and analysis of control strategies for pedestrian flows," Transportation, Springer, vol. 48(4), pages 1767-1807, August.
    13. Carey, Malachy & Watling, David, 2012. "Dynamic traffic assignment approximating the kinematic wave model: System optimum, marginal costs, externalities and tolls," Transportation Research Part B: Methodological, Elsevier, vol. 46(5), pages 634-648.
    14. Carey, Malachy, 2021. "The cell transmission model with free-flow speeds varying over time or space," Transportation Research Part B: Methodological, Elsevier, vol. 147(C), pages 245-257.
    15. Yi, Jingang & Lin, Hao & Alvarez, Luis & Horowitz, Roberto, 2003. "Stability of macroscopic traffic flow modeling through wavefront expansion," Transportation Research Part B: Methodological, Elsevier, vol. 37(7), pages 661-679, August.
    16. Logghe, S. & Immers, L.H., 2008. "Multi-class kinematic wave theory of traffic flow," Transportation Research Part B: Methodological, Elsevier, vol. 42(6), pages 523-541, July.
    17. Daganzo, Carlos F., 2002. "On the Stability of Supply Chains," Institute of Transportation Studies, Research Reports, Working Papers, Proceedings qt1642g95c, Institute of Transportation Studies, UC Berkeley.
    18. Carolina Osorio & Gunnar Flötteröd, 2015. "Capturing Dependency Among Link Boundaries in a Stochastic Dynamic Network Loading Model," Transportation Science, INFORMS, vol. 49(2), pages 420-431, May.
    19. Carey, Malachy & Bar-Gera, Hillel & Watling, David & Balijepalli, Chandra, 2014. "Implementing first-in–first-out in the cell transmission model for networks," Transportation Research Part B: Methodological, Elsevier, vol. 65(C), pages 105-118.
    20. Kang, Chengjun & Qian, Yongsheng & Zeng, Junwei & Wei, Xuting & Zhang, Futao, 2024. "Analysis of stability, energy consumption and CO2 emissions in novel discrete-time car-following model with time delay under V2V environment," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 634(C).
