This project seeks to uncover, characterize, and mitigate the biases of AI models in connected vehicle-infrastructure-pedestrian systems by leveraging advanced statistical machine learning, representation learning techniques, and real-world video data.

Pedestrian safety remains a critical challenge in current transportation systems. In the US, fatal pedestrian crashes have increased by nearly 50% over the past decade. Data show that children, the elderly, men, people with low income, the homeless, and people of color are involved in far more pedestrian-vehicle crashes than the general population.

Autonomous Vehicles (AVs) are expected to detect pedestrians effectively and react to potential accidents. However, because vulnerable road users have constrained mobility, data on vulnerable pedestrians, such as the elderly and children, are often limited. Their behavior also differs from that of other pedestrians: for example, children are more likely to behave unpredictably, and elderly pedestrians on average walk more slowly than the general population, as shown conceptually in Figure 1. Because of this data scarcity and these distinct distributions, data from underrepresented and disadvantaged pedestrian groups form a ``minor mode" or are even ``out-of-distribution" relative to the large amount of training data from other pedestrian groups. This will lead to larger prediction errors for disadvantaged groups at test time. Error-prone detection and trajectory prediction for vulnerable pedestrians may cause discriminatory decision-making against these groups, compromising their safety.
Figure 1. Conceptual illustration of out-of-distribution behavior patterns of vulnerable pedestrians. (a) The majority of pedestrian data come from normal pedestrians. (b) Vulnerable pedestrians, such as elderly or disabled pedestrians (red) and children (orange), often have distinct crossing behavior patterns, resulting in different trajectories. (c) Compared with the data distribution of the majority of normal pedestrians, the data of vulnerable groups are often ``minor mode" or ``out-of-distribution".