
Latest Research Achievement by the Atmospheric Group Published in Environmental Science & Technology

Release date: April 13, 2026   Author: Xing Peng, Hao-Nan Ma


Recently, the atmospheric group at the School of Environment and Energy, Peking University Shenzhen Graduate School, published a research article entitled “Machine-Learning Source Apportionment of Particulate Pollution Aids Urban Emission Regulations” in Environmental Science & Technology. The study introduces a machine learning (ML)-based source apportionment model that uses multiscale aerosol composition data to track PM2.5 sources in near real time and quantify them accurately. Using nearly two decades of observations from the Pearl River Delta (PRD) region in China and from California in the United States, the study shows that the model efficiently replicates receptor-model-based source apportionment results. It identifies secondary sulfate and vehicle emissions as the dominant sources in the PRD, and vehicle emissions, secondary nitrate, and biomass burning as the dominant sources in California. It further reveals contrasting trends in two megacities: PM2.5 in Shenzhen has declined significantly over the past decade owing to effective control of anthropogenic sources, whereas PM2.5 in Los Angeles has remained flat, a pattern the study attributes to intensified wildfire pollution. The proposed ML framework supports near-real-time source analysis and can inform policy-making.


Research Background

Reducing urban particulate pollution is essential for improving air quality and reducing health risks, and achieving this objective requires accurate identification of the sources contributing to PM2.5. The paper points out that existing source apportionment approaches are often limited by large data requirements, technical complexity, and computational burdens. Although traditional receptor models are widely used, they still face limitations such as the high cost of long-term observations, intricate parameter optimization, reliance on expert judgment, and insufficient timeliness, making it difficult to provide timely support for rapid responses during dynamic pollution processes.

Research Methods

The study constructs an ML-based source apportionment framework. It first prepares a training data set by applying a positive matrix factorization (PMF) model to a PM2.5 chemical composition measurement data set. It then trains one XGBoost model for each source identified by PMF, using the chemical components as input features and that source's contribution as the target variable. The trained models thus map PM2.5 chemical composition directly to source contributions, so that once component observations are available, the framework can rapidly apportion the contributions of individual sources. The study systematically validates the framework using Shenzhen’s long-term offline observations from 2014 to 2024, high-time-resolution online observations from 2023, and long-term multi-site observations from the PRD region and California.
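The two-step structure described above (one regressor per source, with components as features and source contributions as targets) can be sketched roughly as follows. This is not the paper's code and rests on several assumptions. To stay within the Python standard library, a small hand-written gradient-boosted stump regressor stands in for XGBoost. The PMF step is replaced by synthetic source contributions. The component names, source names, and coefficients are all hypothetical.

```python
import random

COMPONENTS = ["sulfate", "elemental_carbon", "potassium"]  # hypothetical inputs
# Hypothetical stand-in for PMF output: each source contribution is tied to
# one tracer component (by column index) through an invented coefficient.
TRACERS = {"secondary_sulfate": (0, 1.4), "vehicle_emissions": (1, 3.0)}

def make_synthetic_data(n=60, seed=0):
    """Return synthetic component samples X and per-source contributions Y."""
    rng = random.Random(seed)
    X = [[rng.uniform(1.0, 10.0) for _ in COMPONENTS] for _ in range(n)]
    Y = {name: [coef * row[j] + rng.gauss(0.0, 0.1) for row in X]
         for name, (j, coef) in TRACERS.items()}
    return X, Y

def fit_stump(X, residuals):
    """Find the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]

def fit_boosted(X, y, rounds=40, lr=0.3):
    """Gradient boosting on squared error: each stump fits the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        j, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((j, thr, lm, rm))
        pred = [p + lr * (lm if row[j] <= thr else rm)
                for row, p in zip(X, pred)]
    return base, lr, stumps

def predict(model, row):
    """Sum the base value and the shrunken output of every stump."""
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= thr else rm)
                      for j, thr, lm, rm in stumps)

def mean_abs_error(model, X, y):
    return sum(abs(predict(model, row) - t) for row, t in zip(X, y)) / len(y)

# Train one model per source, then apportion a new component observation.
X, Y = make_synthetic_data()
models = {source: fit_boosted(X, y) for source, y in Y.items()}
new_sample = [5.0, 2.0, 4.0]  # hypothetical component concentrations
apportioned = {source: predict(m, new_sample) for source, m in models.items()}
print(apportioned)
```

Once the per-source models are trained, apportioning a new observation only requires calling `predict` for each source, which is why results for new data can be produced quickly. In practice the stump regressor would presumably be replaced by a real XGBoost model, and the synthetic targets by PMF-resolved contributions.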

Research Results

This approach can effectively reproduce traditional PMF source apportionment results and can be applied to both long-term time series and high-time-resolution online measurement data. The paper shows that, for Shenzhen’s long-term observations, the model maintains high accuracy; for online observations, it achieves near-real-time source apportionment and reflects the diurnal and seasonal variations of different sources. In regional applications, the study shows that the major PM2.5 sources in the PRD are secondary sulfate and vehicle emissions, whereas California is mainly influenced by vehicle emissions, secondary nitrate, and biomass burning.

Figure 1. Source apportionment results of PM2.5 in the PRD and California based on the ML model.

Further analysis of trends in Shenzhen and Los Angeles from 2014 to 2024 shows that PM2.5 in Shenzhen declined at a rate of 1.47 μg/m³ per year, with most major sources exhibiting downward trends. By contrast, PM2.5 in Los Angeles remained relatively stable overall, while the contribution of biomass burning showed a significant upward trend. The decline in Shenzhen over the past decade is related to effective control of anthropogenic sources, whereas the flattened PM2.5 trend in Los Angeles is influenced by intensified wildfire pollution.

Figure 2. Interannual variations and decadal trends of PM2.5 and its anthropogenic and natural sources in Shenzhen and Los Angeles based on the ML model.
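A decadal trend such as the annual rate of change reported above can be estimated in several ways, and this summary does not say which method the paper uses. As one illustration, the sketch below applies the Theil-Sen estimator (the median of all pairwise slopes), which is a common choice for environmental time series because it is robust to anomalous years. The annual values are synthetic and are not data from the paper.

```python
from itertools import combinations
from statistics import median

def theil_sen_slope(years, values):
    """Median of the slopes between every pair of points (Theil-Sen)."""
    points = list(zip(years, values))
    slopes = [(v2 - v1) / (y2 - y1)
              for (y1, v1), (y2, v2) in combinations(points, 2)
              if y2 != y1]
    return median(slopes)

# Synthetic annual means (ug/m3), NOT data from the paper: a steady decline
# of 1.5 per year, plus one anomalously high year to show robustness.
years = list(range(2014, 2025))
pm25 = [30.0 - 1.5 * i for i in range(len(years))]
pm25[5] += 4.0

slope = theil_sen_slope(years, pm25)
print(f"trend: {slope:.2f} ug/m3 per year")  # prints: trend: -1.50 ug/m3 per year
```

The single high year does not shift the estimate, because most pairwise slopes are unaffected by it and the median ignores the minority that are.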

Conclusions and Significance

The paper shows that this ML framework maintains high consistency with traditional receptor models while offering greater computational efficiency and more flexible input requirements. Once trained, the ML model can produce source apportionment results for new data within seconds. The approach can be integrated with high-time-resolution online monitoring techniques to enable rapid identification and real-time tracking of pollution sources, and can support both routine particulate pollution prevention and control and emergency management during severe pollution episodes. The framework also demonstrates strong portability across long time spans and wide geographical scales, providing technical support for refined urban air quality management and pollution response.

Xing Peng, Associate Researcher at the School of Environment and Energy, Peking University Shenzhen Graduate School, PhD candidate Hao-Nan Ma, and Prof. Ling-Yan He are co-first authors of the paper. Prof. Xiao-Feng Huang of Peking University Shenzhen Graduate School and Prof. Yuan Wang of the Department of Earth System Science at Stanford University are co-corresponding authors. This research was supported by the National Key Research and Development Program of China (2023YFC3709203), the National Natural Science Foundation of China (42407132), and the IER Foundation 2024 (IERF202404).

The original article is available at: https://doi.org/10.1021/acs.est.5c14501


