
Algorithm for error-free determination of the variance of all contiguous subsequences and fixed-length contiguous subsequences for a sequence of industrial measurement data

Author

Listed:
  • Andrzej Chmielowiec

    (Rzeszow University of Technology)

Abstract

The article presents an algorithm for fast and error-free determination of statistics such as the arithmetic mean and variance of all contiguous subsequences and fixed-length contiguous subsequences for a sequence of industrial measurement data. Additionally, it shows that both floating-point and integer representations can be used to perform this kind of statistical calculation. The author proves a theorem on the number of bits of precision that an arithmetic type must have to guarantee error-free determination of the arithmetic mean and variance. The article also presents an extension of Welford’s formula for determining variance to the sliding window method, that is, to determining the variance of fixed-length contiguous subsequences. The section dedicated to implementation tests shows the running times of the individual algorithms depending on the arithmetic type used. The research shows that the use of integers in calculations makes the determination of the aforementioned statistics much faster.
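
The sliding-window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the article's algorithm, whose exact formulas are not reproduced on this page. It is a standard Welford-style window update written in scaled integer form, under the assumption that the measurements are integers (for example, raw sensor counts). The function name `sliding_mean_variance` and the choice of the population variance (divisor n) are assumptions made for illustration. Python integers have arbitrary precision, so the error-free property comes for free here; with a fixed-width type the required number of bits would have to be checked, which is the subject of the article's theorem.

```python
from fractions import Fraction


def sliding_mean_variance(data, n):
    """Return exact (mean, population variance) pairs for every window of length n.

    Hypothetical helper, assuming `data` is a sequence of Python ints.
    State kept between windows:
      s = sum of the window
      t = n * sum(x^2) - s^2, which equals n^2 times the population variance
    Both are integers, so every update is exact.
    """
    if n <= 0 or n > len(data):
        raise ValueError("window length must satisfy 1 <= n <= len(data)")

    window = data[:n]
    s = sum(window)
    t = n * sum(x * x for x in window) - s * s

    results = [(Fraction(s, n), Fraction(t, n * n))]
    for i in range(n, len(data)):
        x_old, x_new = data[i - n], data[i]
        d = x_new - x_old
        s_next = s + d
        # Integer analogue of the Welford-style window update
        #   M2' = M2 + d * ((x_new - mean') + (x_old - mean)),
        # multiplied through by n so that no division is needed.
        t += d * (n * (x_new + x_old) - s - s_next)
        s = s_next
        results.append((Fraction(s, n), Fraction(t, n * n)))
    return results


if __name__ == "__main__":
    for mean, var in sliding_mean_variance([1, 2, 3, 4, 10], 3):
        print(mean, var)
```

Each window costs a constant number of integer operations regardless of n, and the division is deferred to the final `Fraction`, so no rounding occurs at any step.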

Suggested Citation

  • Andrzej Chmielowiec, 2021. "Algorithm for error-free determination of the variance of all contiguous subsequences and fixed-length contiguous subsequences for a sequence of industrial measurement data," Computational Statistics, Springer, vol. 36(4), pages 2813-2840, December.
  • Handle: RePEc:spr:compst:v:36:y:2021:i:4:d:10.1007_s00180-021-01096-1
    DOI: 10.1007/s00180-021-01096-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-021-01096-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Katie Evans & Tanzy Love & Sally Thurston, 2015. "Outlier Identification in Model-Based Cluster Analysis," Journal of Classification, Springer;The Classification Society, vol. 32(1), pages 63-84, April.
    2. Douglas M. Hawkins, 1980. "Critical Values for Identifying Outliers," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 29(1), pages 95-96, March.
    3. Philippe Pébay & Timothy B. Terriberry & Hemanth Kolla & Janine Bennett, 2016. "Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights," Computational Statistics, Springer, vol. 31(4), pages 1305-1325, December.

    Citations

    Cited by:

    1. Denys Baranovskyi & Maryna Bulakh & Adam Michajłyszyn & Sergey Myamlin & Leonty Muradian, 2023. "Determination of the Risk of Failures of Locomotive Diesel Engines in Maintenance," Energies, MDPI, vol. 16(13), pages 1-14, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Durgesh Samariya & Amit Thakkar, 2023. "A Comprehensive Survey of Anomaly Detection Algorithms," Annals of Data Science, Springer, vol. 10(3), pages 829-850, June.
    2. Sisman, S. & Aydinoglu, A.C., 2022. "Improving performance of mass real estate valuation through application of the dataset optimization and Spatially Constrained Multivariate Clustering Analysis," Land Use Policy, Elsevier, vol. 119(C).
    3. Katharine M. Clark & Paul D. McNicholas, 2024. "Finding Outliers in Gaussian Model-based Clustering," Journal of Classification, Springer;The Classification Society, vol. 41(2), pages 313-337, July.
    4. Gasser, Patrick, 2020. "A review on energy security indices to compare country performances," Energy Policy, Elsevier, vol. 139(C).
    5. Karol Pilot & Alicja Ganczarek-Gamrot & Krzysztof Kania, 2024. "Dealing with Anomalies in Day-Ahead Market Prediction Using Machine Learning Hybrid Model," Energies, MDPI, vol. 17(17), pages 1-20, September.
    6. Jizhang Wang & Yun Zhang & Rongrong Gu, 2020. "Research Status and Prospects on Plant Canopy Structure Measurement Using Visual Sensors Based on Three-Dimensional Reconstruction," Agriculture, MDPI, vol. 10(10), pages 1-27, October.
    7. Mehdi Jabbari Nooghabi, 2016. "Estimation of the Lomax Distribution in the Presence of Outliers," Annals of Data Science, Springer, vol. 3(4), pages 385-399, December.
    8. Marc Chataigner & Stephane Crepey & Jiang Pu, 2020. "Nowcasting Networks," Papers 2011.13687, arXiv.org.
    9. Stéphane Crépey & Lehdili Noureddine & Nisrine Madhar & Maud Thomas, 2022. "Anomaly Detection on Financial Time Series by Principal Component Analysis and Neural Networks," Papers 2209.11686, arXiv.org, revised Oct 2022.
    10. Nirpeksh Kumar, 2019. "Exact distributions of tests of outliers for exponential samples," Statistical Papers, Springer, vol. 60(6), pages 2031-2061, December.
    11. Stanley Munamato Mbiva & Fabio Mathias Correa, 2024. "Machine Learning to Enhance the Detection of Terrorist Financing and Suspicious Transactions in Migrant Remittances," JRFM, MDPI, vol. 17(5), pages 1-19, April.
    12. Cihangir Koycegiz & Meral Buyukyildiz, 2023. "Investigation of spatiotemporal variability of some precipitation indices in Seyhan Basin, Turkey: monotonic and sub-trend analysis," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 116(2), pages 2211-2244, March.
    13. Taha Yehia & Ali Wahba & Sondos Mostafa & Omar Mahmoud, 2022. "Suitability of Different Machine Learning Outlier Detection Algorithms to Improve Shale Gas Production Data for Effective Decline Curve Analysis," Energies, MDPI, vol. 15(23), pages 1-25, November.
    14. Damian Przekop, 2020. "Feature Engineering for Anti-Fraud Models Based on Anomaly Detection," Central European Journal of Economic Modelling and Econometrics, Central European Journal of Economic Modelling and Econometrics, vol. 12(3), pages 301-316, September.
    15. Francesca Ieva & Anna Maria Paganoni, 2020. "Component-wise outlier detection methods for robustifying multivariate functional samples," Statistical Papers, Springer, vol. 61(2), pages 595-614, April.
    16. Carlo Mari & Cristiano Baldassari, 2021. "Ensemble Methods for Jump-Diffusion Models of Power Prices," Energies, MDPI, vol. 14(8), pages 1-17, April.
    17. Beata Gavurova & Jaroslav Belas & Katarina Zvarikova & Martin Rigelsky & Viera Ivankova, 2021. "The Effect of Education and R&D on Tourism Spending in OECD Countries: An Empirical Study," The AMFITEATRU ECONOMIC journal, Academy of Economic Studies - Bucharest, Romania, vol. 23(58), pages 806-806, August.
    18. Gaucher, Solenne & Klopp, Olga & Robin, Geneviève, 2021. "Outlier detection in networks with missing links," Computational Statistics & Data Analysis, Elsevier, vol. 164(C).
    19. Marc Chataigner & Stéphane Crépey & Jiang Pu, 2020. "Nowcasting Networks," Post-Print hal-03910123, HAL.
    20. Greco, Salvatore & Ishizaka, Alessio & Tasiou, Menelaos & Torrisi, Gianpiero, 2019. "Sigma-Mu efficiency analysis: A methodology for evaluating units through composite indicators," European Journal of Operational Research, Elsevier, vol. 278(3), pages 942-960.
