
AirInsight: Visual Exploration and Interpretation of Latent Patterns and Anomalies in Air Quality Data

Author

Listed:
  • Huijie Zhang

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun 130024, China
    These authors contributed equally to this work.)

  • Ke Ren

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun 130024, China
    These authors contributed equally to this work.)

  • Yiming Lin

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun 130024, China)

  • Dezhan Qu

    (School of Information Science and Technology, Northeast Normal University, Changchun 130024, China
    Library, Northeast Normal University, Changchun 130024, China)

  • Zhenxin Li

    (State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun 130024, China)

Abstract

The huge volume of air quality data now available provides unprecedented opportunities for analyzing pollution. However, because the data are highly complex, most traditional analytical methods focus on abstracting them, which discards the original structure and limits understanding of the results. Visual analysis is a powerful technique for exploring unknown patterns because it retains the details of the original data and gives users visual feedback. In this paper, we focus on air quality data and propose AirInsight, an interactive visual analytics system for recognizing, exploring, and summarizing regular patterns, as well as detecting, classifying, and interpreting abnormal cases. Based on the time-varying and multivariate features of air quality data, we propose a dimension reduction method, Composite Least Square Projection (CLSP), which allows data patterns to be appreciated and interpreted in the context of attributes. On the basis of the observed regular patterns, multiple kinds of abnormal cases are then detected: multivariate anomalies, by the proposed Noise Hierarchical Clustering (NHC) method; abruptly changing timestamps, by the Time Diversity (TD) indicator; and cities with unique patterns, by the Geographical Surprise (GS) measure. Moreover, we combine TD and GS to group anomalies based on their underlying spatiotemporal correlations. AirInsight includes multiple coordinated views and rich interactive functions that provide contextual information from different aspects and facilitate a comprehensive understanding. In particular, a pair of glyphs is designed to visually represent the temporal variation in air quality conditions for a user-selected city. Experiments show that CLSP improves on the accuracy of Least Square Projection (LSP) and that NHC is able to separate noise. Meanwhile, several case studies and a task-based user evaluation demonstrate that our system is effective and practical for exploring and interpreting multivariate spatiotemporal patterns and anomalies in air quality data.

Suggested Citation

  • Huijie Zhang & Ke Ren & Yiming Lin & Dezhan Qu & Zhenxin Li, 2019. "AirInsight: Visual Exploration and Interpretation of Latent Patterns and Anomalies in Air Quality Data," Sustainability, MDPI, vol. 11(10), pages 1-28, May.
  • Handle: RePEc:gam:jsusta:v:11:y:2019:i:10:p:2944-:d:233775

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/11/10/2944/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/11/10/2944/
    Download Restriction: no


    Citations



    Cited by:

    1. Miltiadis D. Lytras & Anna Visvizi, 2021. "Artificial Intelligence and Cognitive Computing: Methods, Technologies, Systems, Applications and Policy Making," Sustainability, MDPI, vol. 13(7), pages 1-3, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anos-Casero, Paloma & Udomsaph, Charles, 2009. "What drives firm productivity growth ?," Policy Research Working Paper Series 4841, The World Bank.
    2. Menta, Giorgia & Lepinteur, Anthony & Clark, Andrew E. & Ghislandi, Simone & D'Ambrosio, Conchita, 2023. "Maternal genetic risk for depression and child human capital," Journal of Health Economics, Elsevier, vol. 87(C).
    3. Roth, Jonathan & Martin, Amory & Miller, Clayton & Jain, Rishee K., 2020. "SynCity: Using open data to create a synthetic city of hourly building energy estimates by integrating data-driven and physics-based methods," Applied Energy, Elsevier, vol. 280(C).
    4. Maksim Yemelyanau, 2008. "Inequality in Belarus from 1995 to 2005," CERGE-EI Working Papers wp356, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    5. Fabio Clementi & Francesco Schettino, 2013. "Income polarization in Brazil, 2001-2011: A distributional analysis using PNAD data," Economics Bulletin, AccessEcon, vol. 33(3), pages 1796-1815.
    6. Jonathan Roth & Jayashree Chadalawada & Rishee K. Jain & Clayton Miller, 2021. "Uncertainty Matters: Bayesian Probabilistic Forecasting for Residential Smart Meter Prediction, Segmentation, and Behavioral Measurement and Verification," Energies, MDPI, vol. 14(5), pages 1-22, March.
    7. Pacheco, Gail, 2009. "Revisiting the link between minimum wage and wage inequality: Empirical evidence from New Zealand," Economics Letters, Elsevier, vol. 105(3), pages 336-339, December.
    8. Shimshack, Jay P. & Ward, Michael B., 2008. "Enforcement and over-compliance," Journal of Environmental Economics and Management, Elsevier, vol. 55(1), pages 90-105, January.
    9. Yu, Xinran & Ergan, Semiha & Dedemen, Gokmen, 2019. "A data-driven approach to extract operational signatures of HVAC systems and analyze impact on electricity consumption," Applied Energy, Elsevier, vol. 253(C), pages 1-1.
    10. Nicu Sprincean, 2019. "Early Warning Indicators For Macrofinancial Activity In Romania," Review of Economic and Business Studies, Alexandru Ioan Cuza University, Faculty of Economics and Business Administration, issue 23, pages 137-162, June.
    11. Rémi Yin & Anthony Lepinteur & Andrew E Clark & Conchita d'Ambrosio, 2021. "Life Satisfaction and the Human Development Index Across the World," Working Papers halshs-03174513, HAL.
    12. Zhan, Sicheng & Liu, Zhaoru & Chong, Adrian & Yan, Da, 2020. "Building categorization revisited: A clustering-based approach to using smart meter data for building energy benchmarking," Applied Energy, Elsevier, vol. 269(C).
    13. Philippe Van Kerm, 2012. "Kernel-smoothed cumulative distribution function estimation with akdensity," Stata Journal, StataCorp LP, vol. 12(3), pages 543-548, September.
    14. Lordan-Perret, Rebecca & Bärenbold, Rebekka & Weigt, Hannes & Rosner, Robert, 2022. "An Ex-Ante Method to Verify Commercial U.S. Nuclear Power Plant Decommissioning Cost Estimates," Working papers 2022/08, Faculty of Business and Economics - University of Basel.
    15. Camerlenghi, F. & Capasso, V. & Villa, E., 2014. "On the estimation of the mean density of random closed sets," Journal of Multivariate Analysis, Elsevier, vol. 125(C), pages 65-88.
    16. Zahra Barzegar & Firoozeh Rivaz, 2020. "A scalable Bayesian nonparametric model for large spatio-temporal data," Computational Statistics, Springer, vol. 35(1), pages 153-173, March.
    17. Giorgia Menta, 2021. "Poverty in the COVID-19 Era: Real-time Data Analysis on Five European Countries," Research on Economic Inequality, in: Research on Economic Inequality: Poverty, Inequality and Shocks, volume 29, pages 209-247, Emerald Group Publishing Limited.
    18. Deborah A. Cobb-Clark & Vincent A. Hildebrand, 2006. "The Wealth of Mexican Americans," Journal of Human Resources, University of Wisconsin Press, vol. 41(4).
    19. Nicholas J. Cox, 2004. "Speaking Stata: Graphing distributions," Stata Journal, StataCorp LP, vol. 4(1), pages 66-88, March.
    20. Stephen Jenkins & Philippe Kerm, 2005. "Accounting for income distribution trends: A density function decomposition approach," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 3(1), pages 43-61, April.
