When big data meets big smog

A big spatio-temporal data framework for China severe smog analysis

Jiaoyan Chen, Huajun Chen, Jeff Z. Pan, Ming Wu, Ningyu Zhang, Guozhou Zheng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Severe smog has recently struck many cities in China, including the capital, Beijing. PM2.5, the chief culprit of China's smog, is affected by various factors, including air pollutants, weather, climate, geographical location, and urbanization. To analyze these factors, we collected about 35,000,000 air quality records and about 30,000,000 weather records from sensors in 77 Chinese cities in 2013. In addition, two big data sets, Geoname and DBPedia, supply data on climate, geographical location, and urbanization. To handle big spatio-temporal data for smog analysis, we propose a MapReduce-based framework named BigSmog. It mainly conducts parallel correlation analysis of the factors and scalable training of artificial neural networks (ANNs) for spatio-temporal approximation of the PM2.5 concentration. In the experiments, BigSmog displays high scalability for smog analysis with big spatio-temporal data. The analysis shows that air pollutants influence the short-term concentration of PM2.5 more than weather does, and that geographical location and climate, rather than urbanization, play the major role in determining a city's long-term PM2.5 pollution level. Moreover, the trained ANNs can accurately approximate the concentration of PM2.5.
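
The abstract does not say how BigSmog computes its parallel correlation analysis. As a rough illustration only, the sketch below shows one common way to fit a Pearson correlation into a map/reduce pattern: each record is mapped to a tuple of partial sums, and the tuples are added together in the reduce step. All function names and sample readings here are hypothetical and are not taken from the paper.

```python
from functools import reduce
from math import sqrt


def map_record(record):
    """Map step: turn one (x, y) reading into a tuple of partial sums."""
    x, y = record
    return (1, x, y, x * x, y * y, x * y)


def combine(a, b):
    """Reduce step: add two partial-sum tuples element-wise.

    Addition is associative and commutative, so partitions of the data
    can be reduced independently and merged in any order.
    """
    return tuple(p + q for p, q in zip(a, b))


def pearson(stats):
    """Compute the Pearson correlation from the accumulated sums."""
    n, sx, sy, sxx, syy, sxy = stats
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / sqrt(var_x * var_y)


def correlation(records):
    """Run the map and reduce steps over an iterable of (x, y) pairs."""
    return pearson(reduce(combine, map(map_record, records)))


# Hypothetical readings: (pollutant concentration, PM2.5 concentration).
sample = [(10.0, 20.0), (20.0, 40.0), (30.0, 60.0), (40.0, 80.0)]

if __name__ == "__main__":
    print(correlation(sample))
```

Because the reduce step only adds tuples, the same sums could in principle be computed per city or per time window on separate workers and merged afterwards, which is the property a MapReduce-style job relies on.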

Original language: English
Title of host publication: Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013
Editors: Varun Chandola, Ranga Raju Vatsavai
Publisher: Association for Computing Machinery
Pages: 13-22
Number of pages: 10
ISBN (Print): 9781450325349
DOIs: https://doi.org/10.1145/2534921.2534924
Publication status: Published - 2013
Event: 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013 - Orlando, FL, United States
Duration: 4 Nov 2013 → 4 Nov 2013

Conference

Conference: 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013
Country: United States
City: Orlando, FL
Period: 4/11/13 → 4/11/13

Fingerprint

Air
Air quality
Disasters
Scalability
Pollution
Neural networks
Sensors
Big data
Experiments

Keywords

  • artificial neural network
  • China smog
  • correlation analysis
  • MapReduce
  • spatio-temporal

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Chen, J., Chen, H., Pan, J. Z., Wu, M., Zhang, N., & Zheng, G. (2013). When big data meets big smog: A big spatio-temporal data framework for China severe smog analysis. In V. Chandola, & R. R. Vatsavai (Eds.), Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013 (pp. 13-22). Association for Computing Machinery. https://doi.org/10.1145/2534921.2534924

@inproceedings{bfb2426296014842be92b80021d3d691,
title = "When big data meets big smog: A big spatio-temporal data framework for China severe smog analysis",
keywords = "artificial neural network, China smog, correlation analysis, MapReduce, spatio-temporal",
author = "Jiaoyan Chen and Huajun Chen and Pan, {Jeff Z.} and Ming Wu and Ningyu Zhang and Guozhou Zheng",
year = "2013",
doi = "10.1145/2534921.2534924",
language = "English",
isbn = "9781450325349",
pages = "13--22",
editor = "Varun Chandola and Vatsavai, {Ranga Raju}",
booktitle = "Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - When big data meets big smog

T2 - A big spatio-temporal data framework for China severe smog analysis

AU - Chen, Jiaoyan

AU - Chen, Huajun

AU - Pan, Jeff Z.

AU - Wu, Ming

AU - Zhang, Ningyu

AU - Zheng, Guozhou

PY - 2013

Y1 - 2013

KW - artificial neural network

KW - China smog

KW - correlation analysis

KW - MapReduce

KW - spatio-temporal

UR - http://www.scopus.com/inward/record.url?scp=84896366717&partnerID=8YFLogxK

U2 - 10.1145/2534921.2534924

DO - 10.1145/2534921.2534924

M3 - Conference contribution

SN - 9781450325349

SP - 13

EP - 22

BT - Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2013

A2 - Chandola, Varun

A2 - Vatsavai, Ranga Raju

PB - Association for Computing Machinery

ER -