Published December 2025 | Version Published
Journal Article Open

Prediction of ambient PM2.5 chemical components in Southern California using machine learning

  • 1. California Institute of Technology
  • 2. Jet Propulsion Lab
  • 3. University of Toronto
  • 4. Brown University
  • 5. University of North Carolina at Chapel Hill

Abstract

Fine particulate matter (PM2.5, particulate matter with an aerodynamic diameter ≤2.5 μm) poses major public health and environmental risks, yet the toxicity of its chemical components remains poorly understood due to limited chemical speciation data. In this study, we apply an extreme gradient boosting (XGBoost) machine learning framework to predict key PM2.5 components, including organic carbon, elemental carbon, nitrate, sulfate, ammonium, and metals, using readily available predictors: total PM2.5 mass concentrations, meteorological variables, trace gas measurements, and indicators of exceptional events (e.g., wildfires, fireworks). Leveraging a decade of data from two monitoring sites in Southern California (Los Angeles and Rubidoux), the models achieved strong predictive performance, particularly for nitrate, ammonium, and elemental carbon. Among the most influential predictors across components were total PM2.5 mass, relative humidity, and boundary layer height. This approach has promise for enhancing satellite remote sensing applications, improving chemical transport model inputs, and generating cost-effective estimates of PM2.5 components during sampling gaps and in regions lacking frequent monitoring. Further research is needed to assess the generalizability of this framework across diverse geographic and climatic settings.

Copyright and License

© 2025 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)

Acknowledgement

Y.L.Y. is supported in part by NASA grant 80NM0018D0004. The contributions of S.H. and D.J.D. were carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration (80NM0018D0004).

Data Availability

All code and data are available at https://doi.org/10.5281/zenodo.15758208.

Supplemental Material

Supplementary data (DOCX)

Files

1-s2.0-S2590162125000620-main.pdf

Files (5.6 MB)

md5:42b45ded53db418c3e7a85bbf5fecbac
5.6 MB

Additional details

Funding

National Aeronautics and Space Administration
80NM0018D0004

Dates

Accepted
2025-09-18
Available
2025-09-19 (available online)
Available
2025-09-25 (version of record)

Caltech Custom Metadata

Caltech groups
Division of Geological and Planetary Sciences (GPS)
Publication Status
Published