Printed from https://ideas.repec.org/a/eee/csdana/v52y2008i7p3583-3602.html

Maximum entropy and least square error minimizing procedures for estimating missing conditional probabilities in Bayesian networks

Author

Listed:
  • Pendharkar, Parag C.

Abstract

Conditional probability tables (CPTs) in Bayesian networks often contain missing values. The problem is common and arises when scenarios that are observed in the real world are absent from the training data. Current approaches to the problem are restrictive in that they assume certain probability distributions for estimating the missing values. Recently, maximum entropy (ME) approaches have been used to learn features of probability distribution functions from observed data. The ME approaches do not require any data distribution assumptions and are shown to work well for several non-parametric distributions. The ME and least square (LS) error minimizing approaches can be used for estimating missing values in CPTs for Bayesian networks. Applying the ME and LS approaches to missing CPT values requires researchers to solve difficult constrained non-linear optimization problems, which can be solved using genetic algorithms.
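
As a rough illustration of the maximum entropy idea in the abstract, the sketch below completes a single CPT row in the simplest possible setting: the known entries are held fixed and the only constraint is that the row sums to 1. In that special case the entropy-maximizing completion has a closed form (the leftover probability mass is split equally among the missing entries), so no genetic algorithm is needed. This is not the paper's formulation, which involves constrained non-linear problems; the function names `entropy` and `max_entropy_fill` and the example row are hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def max_entropy_fill(row):
    """Fill None entries of one CPT row so that the row sums to 1.

    Assumption: the known entries are fixed and there are no other
    constraints. Entropy is then maximized by splitting the leftover
    probability mass equally among the missing entries.
    """
    known_mass = sum(p for p in row if p is not None)
    n_missing = sum(1 for p in row if p is None)
    if n_missing == 0:
        return list(row)
    leftover = 1.0 - known_mass
    if leftover < 0:
        raise ValueError("known entries sum to more than 1")
    share = leftover / n_missing
    return [share if p is None else p for p in row]

# Hypothetical row P(X | one parent configuration) with two unknown entries.
row = [0.5, None, 0.1, None]
filled = max_entropy_fill(row)   # leftover mass 0.4 is split equally
skewed = [0.5, 0.3, 0.1, 0.1]    # another valid completion, for comparison

print(filled, entropy(filled))
print(skewed, entropy(skewed))
```

Comparing `entropy(filled)` with `entropy(skewed)` shows that the equal split scores higher than an uneven completion of the same row. Once further constraints are imposed, the closed form no longer applies in general and a numerical optimizer is needed, which is where the abstract's genetic algorithms come in.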

Suggested Citation

  • Pendharkar, Parag C., 2008. "Maximum entropy and least square error minimizing procedures for estimating missing conditional probabilities in Bayesian networks," Computational Statistics & Data Analysis, Elsevier, vol. 52(7), pages 3583-3602, March.
  • Handle: RePEc:eee:csdana:v:52:y:2008:i:7:p:3583-3602

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(07)00447-1
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Yang, Zheng & Tian, Zheng & Yuan, Zixia, 2007. "GSA-based maximum likelihood estimation for threshold vector error correction model," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 109-120, September.
    2. Kapetanios, George, 2007. "Variable selection in regression models using nonstandard optimisation of information criteria," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 4-15, September.
    3. Demirtas, Hakan & Arguelles, Lester M. & Chung, Hwan & Hedeker, Donald, 2007. "On the performance of bias-reduction techniques for variance estimation in approximate Bayesian bootstrap imputation," Computational Statistics & Data Analysis, Elsevier, vol. 51(8), pages 4064-4068, May.
    4. Pendharkar, Parag C. & Koehler, Gary J., 2007. "A general steady state distribution based stopping criteria for finite length genetic algorithms," European Journal of Operational Research, Elsevier, vol. 176(3), pages 1436-1451, February.
    5. Formann, Anton K., 2007. "Mixture analysis of multivariate categorical data with covariates and missing entries," Computational Statistics & Data Analysis, Elsevier, vol. 51(11), pages 5236-5246, July.
    6. Ximing Wu & Thanasis Stengos, 2005. "Partially adaptive estimation via the maximum entropy densities," Econometrics Journal, Royal Economic Society, vol. 8(3), pages 352-366, December.
    7. Di Zio, Marco & Guarnera, Ugo & Luzi, Orietta, 2007. "Imputation through finite Gaussian mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 51(11), pages 5305-5316, July.
    8. Krink, Thiemo & Paterlini, Sandra & Resti, Andrea, 2007. "Using differential evolution to improve the accuracy of bank rating systems," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 68-87, September.

    Citations


    Cited by:

    1. Gambelli, Danilo & Alberti, Francesca & Solfanelli, Francesco & Vairo, Daniela & Zanoli, Raffaele, 2017. "Third generation algae biofuels in Italy by 2030: A scenario analysis using Bayesian networks," Energy Policy, Elsevier, vol. 103(C), pages 165-178.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gilli, Manfred & Winker, Peter, 2007. "2nd Special Issue on Applications of Optimization Heuristics to Estimation and Modelling Problems," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 2-3, September.
    2. Manfred GILLI & Peter WINKER, 2008. "A review of heuristic optimization methods in econometrics," Swiss Finance Institute Research Paper Series 08-12, Swiss Finance Institute.
    3. Peter Winker & Marianna Lyra & Chris Sharpe, 2011. "Least median of squares estimation by optimization heuristics with an application to the CAPM and a multi-factor model," Computational Management Science, Springer, vol. 8(1), pages 103-123, April.
    4. Doumpos, M. & Marinakis, Y. & Marinaki, M. & Zopounidis, C., 2009. "An evolutionary approach to construction of outranking models for multicriteria classification: The case of the ELECTRE TRI method," European Journal of Operational Research, Elsevier, vol. 199(2), pages 496-505, December.
    5. Zak-Szatkowska, Malgorzata & Bogdan, Malgorzata, 2011. "Modified versions of the Bayesian Information Criterion for sparse Generalized Linear Models," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2908-2924, November.
    6. Thiemo Krink & Sandra Paterlini, 2011. "Multiobjective optimization using differential evolution for real-world portfolio optimization," Computational Management Science, Springer, vol. 8(1), pages 157-179, April.
    7. Ouysse, Rachida & Kohn, Robert, 2010. "Bayesian variable selection and model averaging in the arbitrage pricing theory model," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3249-3268, December.
    8. Marianna Lyra & Akwum Onwunta & Peter Winker, 2015. "Threshold accepting for credit risk assessment and validation," Journal of Banking Regulation, Palgrave Macmillan, vol. 16(2), pages 130-145, April.
    9. Siddique, Juned & Belin, Thomas R., 2008. "Using an Approximate Bayesian Bootstrap to multiply impute nonignorable missing data," Computational Statistics & Data Analysis, Elsevier, vol. 53(2), pages 405-415, December.
    10. Hong Li & Yanlin Shi, 2022. "Robust information share measures with an application on the international crude oil markets," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 42(4), pages 555-579, April.
    11. Pendharkar, Parag C., 2021. "Allocating fixed costs using multi-coalition epsilon equilibrium," International Journal of Production Economics, Elsevier, vol. 239(C).
    12. Katherine G. Yewell & Steven B. Caudill & Franklin G. Mixon, Jr., 2014. "Referee Bias and Stoppage Time in Major League Soccer: A Partially Adaptive Approach," Econometrics, MDPI, vol. 2(1), pages 1-19, February.
    13. Eklund, Jana & Kapetanios, George, 2008. "A review of forecasting techniques for large datasets," National Institute Economic Review, Cambridge University Press, vol. 203, pages 109-115, January.
    14. Wang, Wan-Lun, 2013. "Mixtures of common factor analyzers for high-dimensional data with missing information," Journal of Multivariate Analysis, Elsevier, vol. 117(C), pages 120-133.
    15. Jouni Kuha & Myrsini Katsikatsou & Irini Moustaki, 2018. "Latent variable modelling with non‐ignorable item non‐response: multigroup response propensity models for cross‐national analysis," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(4), pages 1169-1192, October.
    16. Lyra, M. & Paha, J. & Paterlini, S. & Winker, P., 2010. "Optimization heuristics for determining internal rating grading scales," Computational Statistics & Data Analysis, Elsevier, vol. 54(11), pages 2693-2706, November.
    17. Thiemo Krink & Stefan Mittnik & Sandra Paterlini, 2009. "Differential evolution and combinatorial search for constrained index-tracking," Annals of Operations Research, Springer, vol. 172(1), pages 153-176, November.
    18. Francesco Bartolucci & Giorgio E. Montanari & Silvia Pandolfi, 2018. "Latent Ignorability and Item Selection for Nursing Home Case-Mix Evaluation," Journal of Classification, Springer;The Classification Society, vol. 35(1), pages 172-193, April.
    19. Rompolis, Leonidas S., 2010. "Retrieving risk neutral densities from European option prices based on the principle of maximum entropy," Journal of Empirical Finance, Elsevier, vol. 17(5), pages 918-937, December.
    20. Yener Altunbaş & Salvatore Polizzi & Enzo Scannella & John Thornton, 2022. "European Banking Union and bank risk disclosure: the effects of the Single Supervisory Mechanism," Review of Quantitative Finance and Accounting, Springer, vol. 58(2), pages 649-683, February.
