Abstract
Severe smog has recently struck many cities in China, including the capital, Beijing. PM2.5, the chief culprit of China's smog, is affected by various factors including air pollutants, weather, climate, geographical location, and urbanization. To analyze these factors, we collect about 35,000,000 air quality records and about 30,000,000 weather records from sensors in 77 Chinese cities in 2013. Two further big data sets, Geoname and DBPedia, supply data on climate, geographical location, and urbanization. To handle big spatio-temporal data for smog analysis, we propose a MapReduce-based framework named BigSmog. It mainly conducts parallel correlation analysis of the factors and scalable training of artificial neural networks (ANNs) for spatio-temporal approximation of the PM2.5 concentration. In the experiments, BigSmog displays high scalability for smog analysis with big spatio-temporal data. The analysis shows that air pollutants influence the short-term concentration of PM2.5 more than weather does, and that geographical location and climate, rather than urbanization, play the major role in determining a city's long-term PM2.5 pollution level. Moreover, the trained ANNs can accurately approximate the concentration of PM2.5.
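The abstract does not spell out how the parallel correlation analysis is computed. As a rough illustration only, the sketch below assumes a Pearson correlation between two factors, computed MapReduce-style: each chunk of records is mapped to partial sums, and the partial sums are reduced by addition. The function names (`map_chunk`, `combine`, `pearson`, `parallel_pearson`) and the sample values are hypothetical and are not taken from BigSmog.

```python
from functools import reduce
from math import sqrt

def map_chunk(records):
    """Map step: collapse one chunk of (x, y) pairs into partial sums."""
    n = sx = sy = sxx = syy = sxy = 0.0
    for x, y in records:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
    return (n, sx, sy, sxx, syy, sxy)

def combine(a, b):
    """Reduce step: partial sums are additive, so chunks merge in any order."""
    return tuple(u + v for u, v in zip(a, b))

def pearson(stats):
    """Final step: Pearson r from the aggregated sums."""
    n, sx, sy, sxx, syy, sxy = stats
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    if var_x <= 0 or var_y <= 0:
        return float("nan")  # undefined when a factor is constant or data is empty
    return cov / sqrt(var_x * var_y)

def parallel_pearson(chunks):
    """Run map over all chunks, then reduce, then compute r."""
    zero = (0.0,) * 6
    return pearson(reduce(combine, map(map_chunk, chunks), zero))

# Hypothetical (pollutant, PM2.5) readings split across two chunks.
chunks = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
r = parallel_pearson(chunks)
```

Because the partial sums are associative and commutative, the map step can run independently on each data split, which is the property a MapReduce job relies on. A production job would likely prefer a numerically stabler formulation for very large sums.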
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013 |
Editors | Varun Chandola, Ranga Raju Vatsavai |
Publisher | Association for Computing Machinery |
Pages | 13-22 |
Number of pages | 10 |
ISBN (Print) | 9781450325349 |
DOIs | 10.1145/2534921.2534924 |
Publication status | Published - 2013 |
Event | 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013 - Orlando, FL, United States; Duration: 4 Nov 2013 → 4 Nov 2013 |
Conference
Conference | 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013 |
---|---|
Country | United States |
City | Orlando, FL |
Period | 4 Nov 2013 → 4 Nov 2013 |
Keywords
- artificial neural network
- China smog
- correlation analysis
- MapReduce
- spatio-temporal
ASJC Scopus subject areas
- Computer Graphics and Computer-Aided Design
- Information Systems
Cite this
When big data meets big smog: A big spatio-temporal data framework for China severe smog analysis. / Chen, Jiaoyan; Chen, Huajun; Pan, Jeff Z.; Wu, Ming; Zhang, Ningyu; Zheng, Guozhou.
Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013. ed. / Varun Chandola; Ranga Raju Vatsavai. Association for Computing Machinery, 2013. p. 13-22.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - When big data meets big smog
T2 - A big spatio-temporal data framework for China severe smog analysis
AU - Chen, Jiaoyan
AU - Chen, Huajun
AU - Pan, Jeff Z.
AU - Wu, Ming
AU - Zhang, Ningyu
AU - Zheng, Guozhou
PY - 2013
Y1 - 2013
AB - Severe smog has recently struck many cities in China, including the capital, Beijing. PM2.5, the chief culprit of China's smog, is affected by various factors including air pollutants, weather, climate, geographical location, and urbanization. To analyze these factors, we collect about 35,000,000 air quality records and about 30,000,000 weather records from sensors in 77 Chinese cities in 2013. Two further big data sets, Geoname and DBPedia, supply data on climate, geographical location, and urbanization. To handle big spatio-temporal data for smog analysis, we propose a MapReduce-based framework named BigSmog. It mainly conducts parallel correlation analysis of the factors and scalable training of artificial neural networks (ANNs) for spatio-temporal approximation of the PM2.5 concentration. In the experiments, BigSmog displays high scalability for smog analysis with big spatio-temporal data. The analysis shows that air pollutants influence the short-term concentration of PM2.5 more than weather does, and that geographical location and climate, rather than urbanization, play the major role in determining a city's long-term PM2.5 pollution level. Moreover, the trained ANNs can accurately approximate the concentration of PM2.5.
KW - artificial neural network
KW - China smog
KW - correlation analysis
KW - MapReduce
KW - spatio-temporal
UR - http://www.scopus.com/inward/record.url?scp=84896366717&partnerID=8YFLogxK
U2 - 10.1145/2534921.2534924
DO - 10.1145/2534921.2534924
M3 - Conference contribution
SN - 9781450325349
SP - 13
EP - 22
BT - Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013
A2 - Chandola, Varun
A2 - Vatsavai, Ranga Raju
PB - Association for Computing Machinery
ER -