Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i1p236-d1023270.html

Machine-Learning Methods on Noisy and Sparse Data

Author

Listed:
  • Konstantinos Poulinakis

    (University of Nicosia, Nicosia CY-2417, Cyprus)

  • Dimitris Drikakis

    (University of Nicosia, Nicosia CY-2417, Cyprus)

  • Ioannis W. Kokkinakis

    (University of Nicosia, Nicosia CY-2417, Cyprus)

  • Stephen Michael Spottswood

    (Air Force Research Laboratory, Wright Patterson AFB, Greene County, OH 45433-7402, USA)

Abstract

Experimental and computational data and field data obtained from measurements are often sparse and noisy. Consequently, interpolating unknown functions under these restrictions to provide accurate predictions is very challenging. This study compares machine-learning methods and cubic splines on the sparsity of training data they can handle, especially when training samples are noisy. We compare deviation from a true function f using the mean square error, signal-to-noise ratio and the Pearson R² coefficient. We show that, given very sparse data, cubic splines constitute a more precise interpolation method than deep neural networks and multivariate adaptive regression splines. In contrast, machine-learning models are robust to noise and can outperform splines after a training data threshold is met. Our study aims to provide a general framework for interpolating one-dimensional signals, often the result of complex scientific simulations or laboratory experiments.
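
The three error measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not give the formulas, so the definitions here are assumptions: the signal-to-noise ratio is taken as the power of the true signal over the power of the residual, in decibels, and R² is taken as the squared Pearson correlation. The function names (mse, snr_db, pearson_r2) and the toy signal are hypothetical.

```python
import math


def mse(truth, pred):
    """Mean square error between true values f(x) and model predictions."""
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)


def snr_db(truth, pred):
    """Signal-to-noise ratio in decibels (assumed definition):
    power of the true signal divided by power of the residual."""
    signal = sum(t ** 2 for t in truth)
    noise = sum((t - p) ** 2 for t, p in zip(truth, pred))
    if noise == 0:
        return math.inf  # a perfect prediction has no residual power
    return 10.0 * math.log10(signal / noise)


def pearson_r2(truth, pred):
    """Square of the Pearson correlation between truth and prediction.
    Undefined (division by zero) if either sequence is constant."""
    n = len(truth)
    mean_t = sum(truth) / n
    mean_p = sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_t * var_p)


# Toy usage: a hypothetical true function and a deliberately imperfect
# prediction standing in for any interpolator (spline, neural network, ...).
xs = [i / 50 for i in range(51)]
truth = [math.sin(2 * math.pi * x) for x in xs]
pred = [t + 0.05 * math.cos(7 * x) for t, x in zip(truth, xs)]

print("MSE:", mse(truth, pred))
print("SNR (dB):", snr_db(truth, pred))
print("R^2:", pearson_r2(truth, pred))
```

Lower MSE, higher SNR and R² closer to 1 all indicate a prediction closer to the true function, so the same three numbers can be computed for each interpolation method and compared at each training-set size.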

Suggested Citation

  • Konstantinos Poulinakis & Dimitris Drikakis & Ioannis W. Kokkinakis & Stephen Michael Spottswood, 2023. "Machine-Learning Methods on Noisy and Sparse Data," Mathematics, MDPI, vol. 11(1), pages 1-19, January.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:1:p:236-:d:1023270

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/1/236/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/1/236/
    Download Restriction: no


    Citations



    Cited by:

    1. Wu, Menglong & Xiong, Jiajie & Li, Ruoyu & Dong, Aihong & Lv, Chang & Sun, Dan & Abdelghany, Ahmed Elsayed & Zhang, Qian & Wang, Yaqiong & Siddique, Kadambot H.M. & Niu, Wenquan, 2024. "Precision forecasting of fertilizer components’ concentrations in mixed variable-rate fertigation through machine learning," Agricultural Water Management, Elsevier, vol. 298(C).
    2. Luke T. Woods & Zeeshan A. Rana, 2023. "Modelling Sign Language with Encoder-Only Transformers and Human Pose Estimation Keypoint Data," Mathematics, MDPI, vol. 11(9), pages 1-28, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Eden Xiaoying Jiao & Jason Li Chen, 2019. "Tourism forecasting: A review of methodological developments over the last decade," Tourism Economics, vol. 25(3), pages 469-492, May.
    2. Rashad Aliyev & Sara Salehi & Rafig Aliyev, 2019. "Development of Fuzzy Time Series Model for Hotel Occupancy Forecasting," Sustainability, MDPI, vol. 11(3), pages 1-13, February.
    3. Oscar Claveria & Enric Monte & Salvador Torra, 2015. "Self-organizing map analysis of agents’ expectations. Different patterns of anticipation of the 2008 financial crisis," AQR Working Papers 201508, University of Barcelona, Regional Quantitative Analysis Group, revised Mar 2015.
    4. Oscar Claveria & Enric Monte & Salvador Torra, 2015. "Regional Forecasting with Support Vector Regressions: The Case of Spain," IREA Working Papers 201507, University of Barcelona, Research Institute of Applied Economics, revised Jan 2015.
    5. Oscar Claveria & Enric Monte & Salvador Torra, 2016. "Combination forecasts of tourism demand with machine learning models," Applied Economics Letters, Taylor & Francis Journals, vol. 23(6), pages 428-431, April.
    6. Binglei Xie & Yu Sun & Xiaolong Huang & Le Yu & Gangyan Xu, 2020. "Travel Characteristics Analysis and Passenger Flow Prediction of Intercity Shuttles in the Pearl River Delta on Holidays," Sustainability, MDPI, vol. 12(18), pages 1-23, September.
    7. Tea Baldigara, 2013. "Forecasting Tourism Demand in Croatia: A Comparison of Different Extrapolative Methods," Journal of Business Administration Research, Journal of Business Administration Research, Sciedu Press, vol. 2(1), pages 84-92, April.
    8. Yi-Chung Hu, 2021. "Developing grey prediction with Fourier series using genetic algorithms for tourism demand forecasting," Quality & Quantity: International Journal of Methodology, Springer, vol. 55(1), pages 315-331, February.
    9. Oscar Claveria & Enric Monte & Salvador Torra, 2017. "Regional tourism demand forecasting with machine learning models: Gaussian process regression vs. neural network models in a multiple-input multiple-output setting," IREA Working Papers 201701, University of Barcelona, Research Institute of Applied Economics, revised Jan 2017.
    10. De Carlo, Manuela & Ferilli, Guido & d'Angella, Francesca & Buscema, Massimo, 2021. "Artificial intelligence to design collaborative strategy: An application to urban destinations," Journal of Business Research, Elsevier, vol. 129(C), pages 936-948.
    11. María Genoveva Millán & María del Pópulo Pablo-Romero & Javier Sánchez-Rivas, 2018. "Oleotourism as a Sustainable Product: An Analysis of Its Demand in the South of Spain (Andalusia)," Sustainability, MDPI, vol. 10(1), pages 1-19, January.
    12. Marcos Álvarez-Díaz & Manuel González-Gómez & María Soledad Otero-Giráldez, 2018. "Forecasting International Tourism Demand Using a Non-Linear Autoregressive Neural Network and Genetic Programming," Forecasting, MDPI, vol. 1(1), pages 1-17, September.
    13. Dr. Murat Çuhadar & Iclal Cogurcu & Ceyda Kukrer, 2014. "Modelling and Forecasting Cruise Tourism Demand to Izmir by Different Artificial Neural Network Architectures," International Journal of Business and Social Research, LAR Center Press, vol. 4(3), pages 12-28, March.
    14. Oscar Claveria & Enric Monte & Salvador Torra, 2016. "Modelling cross-dependencies between Spain’s regional tourism markets with an extension of the Gaussian process regression model," SERIEs: Journal of the Spanish Economic Association, Springer;Spanish Economic Association, vol. 7(3), pages 341-357, August.
    15. Yi-Chung Hu, 2021. "Forecasting tourism demand using fractional grey prediction models with Fourier series," Annals of Operations Research, Springer, vol. 300(2), pages 467-491, May.
    16. Peng, Bo & Song, Haiyan & Crouch, Geoffrey I., 2014. "A meta-analysis of international tourism demand forecasting and implications for practice," Tourism Management, Elsevier, vol. 45(C), pages 181-193.
    17. Yi-Chung Hu, 2017. "Predicting Foreign Tourists for the Tourism Industry Using Soft Computing-Based Grey–Markov Models," Sustainability, MDPI, vol. 9(7), pages 1-12, July.
    18. Dr. Murat Çuhadar & Iclal Cogurcu & Ceyda Kukrer, 2014. "Modelling and Forecasting Cruise Tourism Demand to Izmir by Different Artificial Neural Network Architectures," International Journal of Business and Social Research, MIR Center for Socio-Economic Research, vol. 4(3), pages 12-28, March.
    19. Oscar Claveria & Enric Monte & Salvador Torra, 2014. "A multivariate neural network approach to tourism demand forecasting," AQR Working Papers 201410, University of Barcelona, Regional Quantitative Analysis Group, revised May 2014.
    20. Claveria, Oscar & Torra, Salvador, 2014. "Forecasting tourism demand to Catalonia: Neural networks vs. time series models," Economic Modelling, Elsevier, vol. 36(C), pages 220-228.
