Published April 1, 2022 | Supplemental Material
Journal Article Open

Rapid assessments of light-duty gasoline vehicle emissions using on-road remote sensing and machine learning


Timely and accurate assessments of on-road vehicle emissions play a central role in urban air quality and health policymaking. However, official insight is limited by the Inspection/Maintenance (I/M) procedure, which is conducted annually in the laboratory. This procedure not only differs substantially from real-world situations (e.g., meteorological conditions) but also cannot support regular supervision. Here we build a unique dataset of 103,831 light-duty gasoline vehicles, in which on-road remote sensing (ORRS) measurements are linked to the I/M records through vehicle identification numbers and license plates. On this basis, we develop an ensemble model framework that integrates three machine learning algorithms: neural network (NN), extreme gradient boosting (XGBoost), and random forest (RF). We demonstrate that this ensemble model can rapidly assess vehicle-specific emissions (i.e., CO, HC, and NO). In particular, the model performs well for passing vehicles under normal conditions (i.e., lower vehicle specific power (VSP, <18 kW/t), temperature (6–32 °C), relative humidity (<80%), and wind speed (<5 m/s)). Combined with the current emission standard, the model identifies a large number of 'dirty' (2.33%) and 'clean' (74.92%) vehicles in the real world. Our results show that ORRS measurements, assisted by the machine-learning-based ensemble model developed here, can enable day-to-day supervision of on-road vehicle-specific emissions. This framework provides a valuable opportunity to reform I/M procedures globally and to substantially mitigate urban air pollution.
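The "normal conditions" quoted in the abstract, and the general idea of combining several base models, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Observation` field names are hypothetical, the temperature bounds are treated as inclusive, and the unweighted mean in `ensemble_predict` is a stand-in, since the abstract does not state how the NN, XGBoost, and RF outputs are actually combined.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Callable, Sequence


@dataclass(frozen=True)
class Observation:
    """One remote-sensing record; all field names are hypothetical."""
    vsp_kw_per_t: float           # vehicle specific power, kW/t
    temperature_c: float          # ambient temperature, degrees Celsius
    relative_humidity_pct: float  # relative humidity, percent
    wind_speed_m_s: float         # wind speed, m/s


def is_normal_condition(obs: Observation) -> bool:
    """Check the thresholds listed in the abstract.

    Assumption: the 6-32 degC range is inclusive at both ends.
    """
    return (
        obs.vsp_kw_per_t < 18.0
        and 6.0 <= obs.temperature_c <= 32.0
        and obs.relative_humidity_pct < 80.0
        and obs.wind_speed_m_s < 5.0
    )


# A base model is any callable mapping an observation to a predicted emission value.
Model = Callable[[Observation], float]


def ensemble_predict(models: Sequence[Model], obs: Observation) -> float:
    """Unweighted mean of base-model predictions (assumed combination rule)."""
    if not models:
        raise ValueError("at least one base model is required")
    return fmean(model(obs) for model in models)
```

In practice, each entry in `models` would wrap a trained regressor (for example, one per library named in the data availability statement); the stubs used for testing this sketch are plain functions returning constants.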

Additional Information

© 2022 Published by Elsevier B.V. Received 21 November 2021, Revised 14 December 2021, Accepted 25 December 2021, Available online 4 January 2022.

This study is supported by the Department of Science and Technology of China (No. 2018YFC0213506 and 2018YFC0213503), National Research Program for Key Issues in Air Pollution Control in China (No. DQGG0107) and National Natural Science Foundation of China (No. 21577126 and 41561144004). Pengfei Li is supported by National Natural Science Foundation of China (No. 22006030), Initiation Fund for Introducing Talents of Hebei Agricultural University (412201904), and Hebei Youth Top Fund (BJ2020032).

Data availability: Three base machine-learning models are used in this study, including NN, RF, and XGBoost, which are obtained from the TensorFlow library, the Scikit Learn library, and the XGBoost library, respectively. Additional information and help are available by contacting the authors.

CRediT authorship contribution statement: S.Y., P.L., and Y.X. designed this research, developed the model, performed the analysis, and wrote the paper. L.J., L.W., X.C., T.H., L.W., Y.Z., M.L., Z.L., Z.S., Y.J., W.L., D.R., and J.H.S. made contributions to discussing and improving this research.

The authors declare that they have no conflict of interest.

Attached Files

Supplemental Material - 1-s2.0-S0048969721078505-mmc1.docx



Additional details

August 22, 2023
October 23, 2023