IDEAS home Printed from https://ideas.repec.org/p/hal/wpaper/hal-04017151.html

Statistical error bounds for weighted mean and median, with application to robust aggregation of cryptocurrency data

Author

Listed:
  • Michaël Allouche

    (Kaiko [Paris])

  • Mnacho Echenim

    (CAPP - Calculs algorithmes programmes et preuves - LIG - Laboratoire d'Informatique de Grenoble - CNRS - Centre National de la Recherche Scientifique - UGA - Université Grenoble Alpes - Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology)

  • Emmanuel Gobet

    (CMAP - Centre de Mathématiques Appliquées - Ecole Polytechnique - X - École polytechnique - CNRS - Centre National de la Recherche Scientifique)

  • Anne-Claire Maurice

    (Kaiko [Paris])

Abstract

We study price aggregation methodologies applied to cryptocurrency prices whose quotations are fragmented across different platforms. An intrinsic difficulty is that price returns and volumes are heavy-tailed, with many outliers, which makes averaging and aggregation challenging. While conventional methods rely on Volume-Weighted Average Prices (VWAPs) or Volume-Weighted Median prices (VWMs), we develop a new Robust Weighted Median (RWM) estimator that is robust to both price and volume outliers. Our study is based on new probabilistic concentration inequalities for weighted means and weighted quantiles under different tail assumptions (heavy tails, sub-gamma tails, sub-Gaussian tails). These inequalities show that the fluctuations of VWAP and VWM are statistically substantial given the heavy-tailed properties of volumes and/or prices. We show that our RWM estimator overcomes this problem and also satisfies all the desirable properties of a price aggregator. We illustrate the behavior of RWM on synthetic data (within a parametric model close to real data): our estimator achieves a statistical accuracy twice as good as that of its competitors, and also recovers realized volatilities very accurately. Tests on real data are also performed and confirm the good behavior of the estimator in various use cases.
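
As an illustration of the two conventional aggregators named in the abstract, the following is a minimal sketch in pure Python. It is not taken from the paper and does not implement the RWM estimator, whose definition is not given in this record. The function names `vwap` and `vwm`, the tie-breaking convention for the weighted median (smallest price whose cumulative volume reaches half of the total), and the toy trades are all assumptions made for illustration.

```python
def vwap(prices, volumes):
    """Volume-weighted average price: sum(p_i * v_i) / sum(v_i)."""
    total = sum(volumes)
    return sum(p * v for p, v in zip(prices, volumes)) / total


def vwm(prices, volumes):
    """Volume-weighted median price.

    Assumed convention: the smallest price whose cumulative volume,
    taken in increasing price order, reaches half of the total volume.
    """
    half = 0.5 * sum(volumes)
    cumulative = 0.0
    for price, volume in sorted(zip(prices, volumes)):
        cumulative += volume
        if cumulative >= half:
            return price
    raise ValueError("empty input")


# Hypothetical trades on fragmented platforms (illustrative numbers only).
prices = [100.0, 100.2, 99.9, 100.1]
volumes = [10.0, 12.0, 9.0, 11.0]

# Case 1: one trade with an outlying price but a small volume.
price_outlier = (prices + [150.0], volumes + [1.0])
# Case 2: one trade with an outlying price and a very large volume.
volume_outlier = (prices + [150.0], volumes + [100.0])

for label, (p, v) in [("clean", (prices, volumes)),
                      ("price outlier", price_outlier),
                      ("volume outlier", volume_outlier)]:
    print(f"{label:>15}: VWAP={vwap(p, v):.4f}  VWM={vwm(p, v):.4f}")
```

On this toy data the weighted median stays at 100.1 when only the price is outlying, whereas the weighted average shifts by more than one unit; when the outlying trade also carries a very large volume, both aggregators move toward 150. This is the kind of sensitivity to volume outliers that motivates a more robust estimator.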

Suggested Citation

  • Michaël Allouche & Mnacho Echenim & Emmanuel Gobet & Anne-Claire Maurice, 2023. "Statistical error bounds for weighted mean and median, with application to robust aggregation of cryptocurrency data," Working Papers hal-04017151, HAL.
  • Handle: RePEc:hal:wpaper:hal-04017151
    Note: View the original document on HAL open archive server: https://hal.science/hal-04017151

    Download full text from publisher

    File URL: https://hal.science/hal-04017151/document
    Download Restriction: no



    More about this item

    Keywords

    robust aggregation; weighted mean and quantile estimation; heavy tails; concentration inequalities; outliers;
