AirInsight: Visual Exploration and Interpretation of Latent Patterns and Anomalies in Air Quality Data

Authors

  • Huijie Zhang

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun 130024, China
    These authors contributed equally to this work.)

  • Ke Ren

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun 130024, China
    These authors contributed equally to this work.)

  • Yiming Lin

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun 130024, China)

  • Dezhan Qu

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Library, Northeast Normal University, Changchun 130024, China)

  • Zhenxin Li

    (State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun 130024, China)

Abstract

Today, the huge volume of air quality data provides unprecedented opportunities for analyzing pollution. However, because the data are highly complex, most traditional analytical methods focus on abstracting the data; these techniques therefore discard the original structure and limit understanding of the results. Visual analysis is a powerful technique for exploring unknown patterns because it retains the details of the original data and gives users visual feedback. In this paper, we focus on air quality data and propose AirInsight, an interactive visual analytics system for recognizing, exploring, and summarizing regular patterns, as well as detecting, classifying, and interpreting abnormal cases. Based on the time-varying and multivariate features of air quality data, we propose a dimension reduction method, Composite Least Square Projection (CLSP), which allows users to appreciate and interpret data patterns in the context of attributes. On the basis of the observed regular patterns, several kinds of abnormal cases are then detected: multivariate anomalies, by the proposed Noise Hierarchical Clustering (NHC) method; abruptly changing timestamps, by the Time Diversity (TD) indicator; and cities with unique patterns, by the Geographical Surprise (GS) measure. Moreover, we combine TD and GS to group anomalies based on their underlying spatiotemporal correlations. AirInsight includes multiple coordinated views and rich interactive functions that provide contextual information from different aspects and facilitate a comprehensive understanding. In particular, a pair of glyphs is designed to provide a visual representation of the temporal variation in air quality conditions for a user-selected city. Experiments show that CLSP improves on the accuracy of Least Square Projection (LSP) and that NHC is able to separate noise. In addition, several case studies and a task-based user evaluation demonstrate that our system is effective and practical for exploring and interpreting multivariate spatiotemporal patterns and anomalies in air quality data.

Suggested Citation

  • Huijie Zhang & Ke Ren & Yiming Lin & Dezhan Qu & Zhenxin Li, 2019. "AirInsight: Visual Exploration and Interpretation of Latent Patterns and Anomalies in Air Quality Data," Sustainability, MDPI, vol. 11(10), pages 1-28, May.
  • Handle: RePEc:gam:jsusta:v:11:y:2019:i:10:p:2944-:d:233775

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/11/10/2944/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/11/10/2944/
    Download Restriction: no

    Citations

    Cited by:

    1. Miltiadis D. Lytras & Anna Visvizi, 2021. "Artificial Intelligence and Cognitive Computing: Methods, Technologies, Systems, Applications and Policy Making," Sustainability, MDPI, vol. 13(7), pages 1-3, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rémi Yin & Anthony Lepinteur & Andrew E Clark & Conchita d'Ambrosio, 2021. "Life Satisfaction and the Human Development Index Across the World," Working Papers halshs-03174513, HAL.
    2. Anos-Casero, Paloma & Udomsaph, Charles, 2009. "What drives firm productivity growth ?," Policy Research Working Paper Series 4841, The World Bank.
    3. Menta, Giorgia & Lepinteur, Anthony & Clark, Andrew E. & Ghislandi, Simone & D'Ambrosio, Conchita, 2023. "Maternal genetic risk for depression and child human capital," Journal of Health Economics, Elsevier, vol. 87(C).
    4. Hyslop, Dean & Stillman, Steven, 2007. "Youth minimum wage reform and the labour market in New Zealand," Labour Economics, Elsevier, vol. 14(2), pages 201-230, April.
    5. Clementi, Fabio & Molini, Vasco & Schettino, Francesco, 2018. "All that Glitters is not Gold: Polarization Amid Poverty Reduction in Ghana," World Development, Elsevier, vol. 102(C), pages 275-291.
    6. Capozzoli, Alfonso & Piscitelli, Marco Savino & Brandi, Silvio & Grassi, Daniele & Chicco, Gianfranco, 2018. "Automated load pattern learning and anomaly detection for enhancing energy management in smart buildings," Energy, Elsevier, vol. 157(C), pages 336-352.
    7. Roth, Jonathan & Martin, Amory & Miller, Clayton & Jain, Rishee K., 2020. "SynCity: Using open data to create a synthetic city of hourly building energy estimates by integrating data-driven and physics-based methods," Applied Energy, Elsevier, vol. 280(C).
    8. Maksim Yemelyanau, 2008. "Inequality in Belarus from 1995 to 2005," CERGE-EI Working Papers wp356, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    9. Fabio Clementi & Francesco Schettino, 2013. "Income polarization in Brazil, 2001-2011: A distributional analysis using PNAD data," Economics Bulletin, AccessEcon, vol. 33(3), pages 1796-1815.
    10. Falko Juessen, 2009. "A distribution dynamics approach to regional GDP convergence in unified Germany," Empirical Economics, Springer, vol. 37(3), pages 627-652, December.
    11. Ganguli, Ina & Terrell, Katherine, 2006. "Institutions, markets and men's and women's wage inequality: Evidence from Ukraine," Journal of Comparative Economics, Elsevier, vol. 34(2), pages 200-227, June.
    12. Jonathan Roth & Jayashree Chadalawada & Rishee K. Jain & Clayton Miller, 2021. "Uncertainty Matters: Bayesian Probabilistic Forecasting for Residential Smart Meter Prediction, Segmentation, and Behavioral Measurement and Verification," Energies, MDPI, vol. 14(5), pages 1-22, March.
    13. Fan, Cheng & Xiao, Fu & Yan, Chengchu & Liu, Chengliang & Li, Zhengdao & Wang, Jiayuan, 2019. "A novel methodology to explain and evaluate data-driven building energy performance models based on interpretable machine learning," Applied Energy, Elsevier, vol. 235(C), pages 1551-1560.
    14. Pacheco, Gail, 2009. "Revisiting the link between minimum wage and wage inequality: Empirical evidence from New Zealand," Economics Letters, Elsevier, vol. 105(3), pages 336-339, December.
    15. Andrew E. Clark & Conchita D'Ambrosio & Simone Ghislandi & Anthony Lepinteur & Giorgia Menta, 2021. "Maternal depression and child human capital: a genetic instrumental-variable approach," CEP Discussion Papers dp1749, Centre for Economic Performance, LSE.
    16. Shimshack, Jay P. & Ward, Michael B., 2008. "Enforcement and over-compliance," Journal of Environmental Economics and Management, Elsevier, vol. 55(1), pages 90-105, January.
    17. Yu, Xinran & Ergan, Semiha & Dedemen, Gokmen, 2019. "A data-driven approach to extract operational signatures of HVAC systems and analyze impact on electricity consumption," Applied Energy, Elsevier, vol. 253(C), pages 1-1.
    18. Camerlenghi, Federico & Lijoi, Antonio & Prünster, Igor, 2017. "Bayesian prediction with multiple-samples information," Journal of Multivariate Analysis, Elsevier, vol. 156(C), pages 18-28.
    19. Vangelis Marinakis, 2020. "Big Data for Energy Management and Energy-Efficient Buildings," Energies, MDPI, vol. 13(7), pages 1-18, March.
    20. Yunbo Yang & Rongling Li & Tao Huang, 2020. "Smart Meter Data Analysis of a Building Cluster for Heating Load Profile Quantification and Peak Load Shifting," Energies, MDPI, vol. 13(17), pages 1-20, August.
