
Mutual Information between Discrete and Continuous Data Sets

Author

Listed:
  • Brian C Ross

Abstract

Mutual information (MI) is a powerful method for detecting relationships between data sets. There are accurate methods for estimating MI that avoid problems with “binning” when both data sets are discrete or when both data sets are continuous. We present an accurate, non-binning MI estimator for the case of one discrete data set and one continuous data set. This case applies when measuring, for example, the relationship between base sequence and gene expression level, or the effect of a cancer drug on patient survival time. We also show how our method can be adapted to calculate the Jensen–Shannon divergence of two or more data sets.
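
The link between MI and the Jensen–Shannon divergence mentioned in the abstract can be illustrated in the simpler all-discrete case. The sketch below is not the paper's estimator, which targets continuous data and avoids binning. It only checks numerically the standard identity that the weighted Jensen–Shannon divergence of several distributions equals the MI between a "which data set" label and the sampled value. The function names and the example distributions are hypothetical, chosen purely for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)

def js_divergence(dists, weights):
    """Weighted Jensen-Shannon divergence: H(mixture) - sum_i w_i * H(P_i)."""
    n = len(dists[0])
    mixture = [sum(w * d[j] for w, d in zip(weights, dists)) for j in range(n)]
    return entropy(mixture) - sum(w * entropy(d) for w, d in zip(weights, dists))

def mutual_information(joint):
    """MI (in nats) from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal of the label
    py = [sum(col) for col in zip(*joint)]    # marginal of the value
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# Two made-up discrete distributions over three outcomes, equally weighted.
dists = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
weights = [0.5, 0.5]

# Joint distribution of (data set label, value): p(x, y) = w_x * P_x(y).
joint = [[w * q for q in d] for w, d in zip(weights, dists)]

jsd = js_divergence(dists, weights)
mi = mutual_information(joint)
print(jsd, mi)  # the two numbers agree up to floating-point error
```

Because the label is discrete while the values may be continuous, a discrete-versus-continuous MI estimator is a natural fit for estimating this divergence from samples.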

Suggested Citation

  • Brian C Ross, 2014. "Mutual Information between Discrete and Continuous Data Sets," PLOS ONE, Public Library of Science, vol. 9(2), pages 1-5, February.
  • Handle: RePEc:plo:pone00:0087357
    DOI: 10.1371/journal.pone.0087357

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0087357
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0087357&type=printable
    Download Restriction: no


    Citations



    Cited by:

    1. Hasan T Abbas & Lejla Alic & Madhav Erraguntla & Jim X Ji & Muhammad Abdul-Ghani & Qammer H Abbasi & Marwa K Qaraqe, 2019. "Predicting long-term type 2 diabetes with support vector machine using oral glucose tolerance test," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-11, December.
    2. Lunacek, Monte & Williams, Lindy & Severino, Joseph & Ficenec, Karen & Ugirumurera, Juliette & Eash, Matthew & Ge, Yanbo & Phillips, Caleb, 2021. "A data-driven operational model for traffic at the Dallas Fort Worth International Airport," Journal of Air Transport Management, Elsevier, vol. 94(C).
    3. Xin Dang & Dao Nguyen & Yixin Chen & Junying Zhang, 2021. "A new Gini correlation between quantitative and qualitative variables," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(4), pages 1314-1343, December.
    4. Banerjee, Ameet Kumar & Dionisio, Andreia & Pradhan, H.K. & Mahapatra, Biplab, 2021. "Hunting the quicksilver: Using textual news and causality analysis to predict market volatility," International Review of Financial Analysis, Elsevier, vol. 77(C).
    5. Philip Cammin & Jingjing Yu & Stefan Voß, 2023. "Tiered prediction models for port vessel emissions inventories," Flexible Services and Manufacturing Journal, Springer, vol. 35(1), pages 142-169, March.
    6. Trizoglou, Pavlos & Liu, Xiaolei & Lin, Zi, 2021. "Fault detection by an ensemble framework of Extreme Gradient Boosting (XGBoost) in the operation of offshore wind turbines," Renewable Energy, Elsevier, vol. 179(C), pages 945-962.
    7. Ao Kong & Robert Azencott & Hongliang Zhu & Xindan Li, 2020. "Pattern recognition in micro-trading behaviors before stock price jumps: A framework based on multivariate time series analysis," Papers 2011.04939, arXiv.org, revised Feb 2021.
    8. Kandula, Shanthan & Krishnamoorthy, Srikumar & Roy, Debjit, 2020. "A Predictive and Prescriptive Analytics Framework for Efficient E-Commerce Order Delivery," IIMA Working Papers WP 2020-11-01, Indian Institute of Management Ahmedabad, Research and Publication Department.
    9. María Isabel Arango & Edier Aristizábal & Federico Gómez, 2021. "Morphometrical analysis of torrential flows-prone catchments in tropical and mountainous terrain of the Colombian Andes by machine learning techniques," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 105(1), pages 983-1012, January.
    10. Wei, Yupeng & Wu, Dazhong, 2023. "Prediction of state of health and remaining useful life of lithium-ion battery using graph convolutional network with dual attention mechanisms," Reliability Engineering and System Safety, Elsevier, vol. 230(C).
    11. Yu-Wen Chen & Yi-Chun Li & Chien-Yu Huang & Chia-Jung Lin & Chia-Jui Tien & Wen-Shiang Chen & Chia-Ling Chen & Keh-Chung Lin, 2023. "Predicting Arm Nonuse in Individuals with Good Arm Motor Function after Stroke Rehabilitation: A Machine Learning Study," IJERPH, MDPI, vol. 20(5), pages 1-12, February.
    12. Tommaso Colombo & Massimiliano Mangone & Andrea Bernetti & Marco Paoloni & Valter Santilli & Laura Palagi, 2019. "Supervised and unsupervised learning to classify scoliosis and healthy subjects based on non-invasive rasterstereography analysis," DIAG Technical Reports 2019-08, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
    13. Wang, Weicheng & Chen, Jinglong & Zhang, Tianci & Liu, Zijun & Wang, Jun & Zhang, Xinwei & He, Shuilong, 2023. "An asymmetrical graph Siamese network for one-class anomaly detection of engine equipment with multi-source fusion," Reliability Engineering and System Safety, Elsevier, vol. 235(C).
    14. Xiaobo Yang & Zhilong Mi & Qingcai He & Binghui Guo & Zhiming Zheng, 2023. "Identification of Vital Genes for NSCLC Integrating Mutual Information and Synergy," Mathematics, MDPI, vol. 11(6), pages 1-15, March.
