Deep Reinforcement Learning Transit Signal Priority Algorithm for Optimizing Transit Reliability and Speed

Abstract

Transit Signal Priority (TSP) is a widely used traffic signal control strategy designed to reduce transit delays at signalized intersections. Conventional active TSP strategies, such as the one used in the City of Toronto, Canada, are usually unconditional, granting priority to every detected transit vehicle. Although unconditional TSP effectively reduces signal delays, it does not guarantee an improvement in transit reliability. Moreover, a simple first-come-first-served (FCFS) logic is commonly used to handle multiple priority requests at the same intersection, which grants priority to one direction without considering transit performance in the opposite direction. Instead of focusing solely on alleviating delays in a single direction, this paper proposes a dual-objective two-way TSP algorithm (D2 TSP) that uses deep reinforcement learning (DRL) to optimize transit delay and reliability (headway adherence) concurrently. The DRL algorithms are complemented by a coordination algorithm that produces solutions balanced between the two opposite directions, and D2 TSP adapts to real-time bus performance. The algorithm is trained and tested in a microsimulation environment that models a transit route segment with reliability issues in the City of Toronto. The performance of D2 TSP is compared with three baseline scenarios: one without TSP, one with the TSP algorithm currently used in the field in the City of Toronto, and one using DRL models with FCFS logic. The results show that D2 TSP provides an efficient and balanced solution that reduces headway variability and travel time in both directions.
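
To make the reliability objective concrete, the sketch below computes one common measure of headway variability, the coefficient of variation of headways at a single stop. It is an illustration only: the abstract does not specify which headway metric D2 TSP uses, and the function names `headways` and `headway_cv` are hypothetical, not taken from the paper.

```python
from statistics import mean, pstdev


def headways(arrival_times):
    """Return successive headways (seconds) between bus arrivals at one stop."""
    times = sorted(arrival_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def headway_cv(arrival_times):
    """Coefficient of variation of headways (assumed metric, not the paper's).

    0.0 means perfectly even service; larger values indicate bunching and gaps.
    """
    hs = headways(arrival_times)
    if len(hs) < 2:
        raise ValueError("need at least three arrivals to measure variability")
    return pstdev(hs) / mean(hs)


# Evenly spaced buses versus bunched buses over a similar time span.
even = [0, 300, 600, 900]
bunched = [0, 100, 600, 700]
print(headway_cv(even))
print(headway_cv(bunched))
```

Under this measure, a priority strategy that only shortens signal delay can leave the value unchanged or worse, which is why the abstract treats delay and headway adherence as separate objectives.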

Hirotaka Ishihara
Incoming MEng EECS student