Application of machine learning models for PM2.5 prediction in Bengaluru using precursor air pollutants and meteorological data
Abstract
Air pollution is a significant threat to public health and the environment, especially in rapidly urbanizing areas. This study used five machine learning models to predict PM2.5 concentrations in Bengaluru City, India, from precursor air pollutants and meteorological parameters, using data from 2018 to 2023. PM2.5 concentrations exceeded the limits prescribed by the World Health Organization (WHO). Station B4 (Bapuji Nagar) was identified as a hotspot for PM2.5 and NH3, with consistently high concentrations of both pollutants. PM2.5 showed a linear relationship with both SO2 and NO2. Among all input parameters, NO2 showed the strongest correlation with PM2.5, followed by wind direction, and NO2 concentration had a strong effect on PM2.5 prediction. PM2.5 and SO2 shared similar long-term sources or processes, while PM2.5 and NO2 were strongly correlated at longer periods. The study also applied time–frequency domain techniques, including wavelet transforms and spectral analysis. Among the machine learning models evaluated, the ANN (R2 = 0.89) outperformed all other approaches, and performance ranked in the order ANN > RF > SVR > KNN > MLR. The methodology can be adapted for PM2.5 prediction in other regions, offering a framework for similar studies elsewhere. © The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2025.
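The model ranking reported above rests on comparing R2 scores. The sketch below shows one way such a comparison could be computed, assuming R2 denotes the usual coefficient of determination, 1 - SS_res / SS_tot. The function names, model labels, and numeric values are hypothetical and chosen only for illustration; they are not the study's code or data.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    obs_mean = mean(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variability of the observations around their mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


def rank_models(observed, predictions_by_model):
    """Return (model name, R2) pairs sorted from best to worst R2."""
    scores = {
        name: r_squared(observed, predicted)
        for name, predicted in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical PM2.5 values (µg/m3) for illustration only; not study data.
observed = [30.0, 45.0, 28.0, 60.0, 52.0]
predictions = {
    "model_a": [32.0, 44.0, 30.0, 57.0, 50.0],
    "model_b": [40.0, 40.0, 40.0, 45.0, 45.0],
}

ranking = rank_models(observed, predictions)
for name, score in ranking:
    print(f"{name}: R2 = {score:.3f}")
```

In practice the predictions would come from models evaluated on held-out data, and a library routine such as scikit-learn's `r2_score` computes the same quantity.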