
A new method based on physical patterns to impute aerobiological datasets

Author

Listed:
  • Sofia Tagliaferro
  • Adrián Corrochano
  • Pierpaolo Marchetti
  • Alessandro Marcon
  • Soledad Le Clainche

Abstract

Limited research has assessed the accuracy of imputation methods in aerobiological datasets. We conducted a simulation study to evaluate, for the first time, the effectiveness of Gappy Singular Value Decomposition (GSVD), a data-driven approach, by comparing it with moving mean interpolation, a statistical approach. Using complete pollen data from two monitoring stations in northeastern Italy for 2022, we randomly generated missing data by combining various proportions of missing values (5%, 10%, 25%) and gap lengths (3, 5, 7, 10 days). We imputed 4800 time series using the GSVD algorithm, implemented specifically for this study, and the moving mean algorithm of the “AeRobiology” R package. We assessed imputation accuracy by calculating the Root Mean Square Error (RMSE) and used multiple linear regression models to identify factors independently affecting the error (e.g. pollen variability, simulation settings). The results showed that GSVD performed as well as the well-established moving mean method and generalized well across different data types. However, the imputation error was primarily influenced by pollen characteristics and location, regardless of the imputation method used. High variability in pollen concentrations and the distribution of missing data negatively affected imputation accuracy. In conclusion, we introduced and tested a novel imputation method that performed comparably to the statistical approach in aerobiological data reconstruction. These findings contribute to advancing aerobiological data analysis and highlight the need to improve imputation methods.
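
The simulation design described in the abstract (remove blocks of days, impute them, score the result with the Root Mean Square Error) can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the “AeRobiology” R package: the function names, the synthetic bell-shaped series, the window width, the rank, and the choice to compute the error only over the removed days are all assumptions made for illustration. The gappy SVD step shown is the generic iterative low-rank fill, which may differ from the algorithm used in the study.

```python
import numpy as np

def make_gaps(n_days, proportion, gap_len, rng):
    """Boolean mask (True = missing) made of separated blocks of gap_len days."""
    n_gaps = max(1, round(n_days * proportion / gap_len))
    mask = np.zeros(n_days, dtype=bool)
    placed = attempts = 0
    while placed < n_gaps and attempts < 10_000:
        attempts += 1
        start = rng.integers(1, n_days - gap_len)  # first and last day stay observed
        # Accept the block only if it and its two neighbouring days are free.
        if not mask[start - 1:start + gap_len + 1].any():
            mask[start:start + gap_len] = True
            placed += 1
    return mask

def moving_mean_impute(x, mask, half_window=5):
    """Simplified stand-in for moving mean interpolation (hypothetical helper)."""
    filled = x.astype(float).copy()
    observed = ~mask
    for i in np.flatnonzero(mask):
        w = half_window
        while True:
            lo, hi = max(0, i - w), min(len(x), i + w + 1)
            sel = observed[lo:hi]
            if sel.any():  # mean of the observed values inside the window
                filled[i] = x[lo:hi][sel].mean()
                break
            w += 1  # widen the window until an observed value is found
    return filled

def gappy_svd_impute(X, mask, rank=3, n_iter=50):
    """Generic iterative gappy SVD: refill gaps from a truncated-rank reconstruction."""
    filled = X.astype(float).copy()
    col_means = np.nanmean(np.where(mask, np.nan, filled), axis=0)
    filled[mask] = np.broadcast_to(col_means, X.shape)[mask]  # initial guess
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled[mask] = low_rank[mask]  # observed entries are never overwritten
    return filled

def rmse(truth, imputed, mask):
    """RMSE over the artificially removed entries only (an assumption)."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))

# Synthetic stand-in data: six hypothetical series with bell-shaped seasons.
rng = np.random.default_rng(0)
days = np.arange(365)
peaks = np.array([90, 110, 130, 150, 170, 190])
truth = 100 * np.exp(-0.5 * ((days[:, None] - peaks) / 20.0) ** 2)
truth += rng.normal(0, 2, truth.shape).clip(min=0)

# One simulation setting: 10% missing data in gaps of 5 days.
mask = np.column_stack([make_gaps(365, 0.10, 5, rng) for _ in range(6)])
gsvd = gappy_svd_impute(truth, mask)
mm = np.column_stack(
    [moving_mean_impute(truth[:, j], mask[:, j]) for j in range(6)]
)
print("RMSE gappy SVD  :", rmse(truth, gsvd, mask))
print("RMSE moving mean:", rmse(truth, mm, mask))
```

Looping this over the proportions and gap lengths listed in the abstract, with repeated random masks, would give a grid of RMSE values per method. The numbers printed here come from synthetic data and say nothing about the study's results.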

Suggested Citation

  • Sofia Tagliaferro & Adrián Corrochano & Pierpaolo Marchetti & Alessandro Marcon & Soledad Le Clainche, 2024. "A new method based on physical patterns to impute aerobiological datasets," PLOS ONE, Public Library of Science, vol. 19(11), pages 1-14, November.
  • Handle: RePEc:plo:pone00:0314005
    DOI: 10.1371/journal.pone.0314005

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0314005
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0314005&type=printable
    Download Restriction: no
