IDEAS home Printed from https://ideas.repec.org/a/jss/jstsof/v066i10.html

dawai: An R Package for Discriminant Analysis with Additional Information

Authors
  • Conde, David
  • Fernández, Miguel
  • Salvador, Bonifacio
  • Rueda, Cristina

Abstract

The incorporation of additional information into discriminant rules is receiving increasing attention, as rules that include this information perform better than the usual rules. In this paper we introduce an R package called dawai, which provides functions to define rules that take into account additional information expressed as restrictions on the means, to classify samples, and to evaluate the accuracy of the results. Moreover, we extend the results and definitions given in previous papers (Fernández, Rueda, and Salvador 2006; Conde, Fernández, Rueda, and Salvador 2012; Conde, Salvador, Rueda, and Fernández 2013) to the case of unequal covariances among the populations, and consequently define the corresponding restricted quadratic discriminant rules. We also define estimators of the accuracy of the rules for the general case of more than two populations. The wide range of applications of these procedures is illustrated with two data sets from two different fields: biology and pattern recognition.
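
The idea of a rule built from means that respect known restrictions can be illustrated with a minimal sketch. It is not the dawai implementation and does not use the package's API: it assumes a single feature, three populations with a known simple order on the means (mu_1 <= mu_2 <= mu_3), a common variance, and equal priors. Under those assumptions the restricted estimates can be obtained by weighted pool-adjacent-violators isotonic regression, and the linear rule reduces to assigning an observation to the nearest mean. The function names `pava` and `classify` and the toy data are hypothetical.

```python
import numpy as np

def pava(means, weights):
    """Weighted non-decreasing isotonic regression (pool-adjacent-violators)."""
    blocks = []  # each block: [pooled mean, total weight, number of groups pooled]
    for m, w in zip(means, weights):
        blocks.append([float(m), float(w), 1])
        # Merge backwards while the order restriction is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])

def classify(x, means):
    """Nearest-mean rule (univariate linear rule, equal priors, common variance)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.argmin(np.abs(x[:, None] - means[None, :]), axis=1)

# Toy training samples (made up for illustration), one array per population.
samples = [np.array([1.0, 3.0]), np.array([0.0, 2.0]), np.array([4.0, 6.0])]
raw_means = np.array([s.mean() for s in samples])   # violates mu_1 <= mu_2
sizes = np.array([len(s) for s in samples])

restricted_means = pava(raw_means, sizes)           # order-respecting estimates
labels = classify([1.0, 4.5], restricted_means)     # 0-based population indices
print(restricted_means, labels)
```

When two adjacent sample means violate the order they are pooled into a common value, so the nearest-mean rule cannot separate those two populations and `argmin` simply returns the first of them. The rules defined in the paper cover multivariate data, more general restrictions, and unequal covariances, none of which this sketch attempts.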

Suggested Citation

  • Conde, David & Fernández, Miguel & Salvador, Bonifacio & Rueda, Cristina, 2015. "dawai: An R Package for Discriminant Analysis with Additional Information," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 66(i10).
  • Handle: RePEc:jss:jstsof:v:066:i10
    DOI: http://hdl.handle.net/10.18637/jss.v066.i10

    Download full text from publisher

    File URL: https://www.jstatsoft.org/index.php/jss/article/view/v066i10/v66i10.pdf
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v066i10/dawai_1.2.tar.gz
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v066i10/v66i10.R
    Download Restriction: no


    References listed on IDEAS

    1. Borra, Simone & Di Ciaccio, Agostino, 2010. "Measuring the prediction error. A comparison of cross-validation, bootstrap and covariance penalty methods," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 2976-2989, December.
    2. Rueda, Cristina & Fernández, Miguel A. & Peddada, Shyamal Das, 2009. "Estimation of Parameters Subject to Order Restrictions on a Circle With Application to Estimation of Phase Angles of Cell Cycle Genes," Journal of the American Statistical Association, American Statistical Association, vol. 104(485), pages 338-347.
    3. Barragán, Sandra & Fernández, Miguel & Rueda, Cristina & Peddada, Shyamal, 2013. "isocir: An R Package for Constrained Inference Using Isotonic Regression for Circular Data, with an Application to Cell Biology," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 54(i04).
    4. Fernández, Miguel A. & Rueda, Cristina & Salvador, Bonifacio, 2006. "Incorporating Additional Information to Normal Linear Discriminant Rules," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 569-577, June.
    5. El Barmi, Hammou & Johnson, Matthew & Mukerjee, Hari, 2010. "Estimating cumulative incidence functions when the life distributions are constrained," Journal of Multivariate Analysis, Elsevier, vol. 101(9), pages 1903-1909, October.
    6. Ori Davidov & Shyamal Peddada, 2013. "Testing for the Multivariate Stochastic Order among Ordered Experimental Groups with Application to Dose–Response Studies," Biometrics, The International Biometric Society, vol. 69(4), pages 982-990, December.
    7. Kim, Ji-Hyun, 2009. "Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap," Computational Statistics & Data Analysis, Elsevier, vol. 53(11), pages 3735-3745, September.

    Citations



    Cited by:

    1. David Conde & Miguel A. Fernández & Cristina Rueda & Bonifacio Salvador, 2021. "Isotonic boosting classification rules," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(2), pages 289-313, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Usta, Ilhan & Kantar, Yeliz Mert, 2011. "On the performance of the flexible maximum entropy distributions within partially adaptive estimation," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2172-2182, June.
    2. Ha, Tran Vinh & Asada, Takumi & Arimura, Mikiharu, 2019. "Determination of the influence factors on household vehicle ownership patterns in Phnom Penh using statistical and machine learning methods," Journal of Transport Geography, Elsevier, vol. 78(C), pages 70-86.
    3. Conde, David & Salvador, Bonifacio & Rueda, Cristina & Fernández, Miguel A., 2013. "Performance and estimation of the true error rate of classification rules built with additional information. An application to a cancer trial," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(5), pages 583-602, October.
    4. Mark G E White & Neil E Bezodis & Jonathon Neville & Huw Summers & Paul Rees, 2022. "Determining jumping performance from a single body-worn accelerometer using machine learning," PLOS ONE, Public Library of Science, vol. 17(2), pages 1-25, February.
    5. Airola, Antti & Pahikkala, Tapio & Waegeman, Willem & De Baets, Bernard & Salakoski, Tapio, 2011. "An experimental comparison of cross-validation techniques for estimating the area under the ROC curve," Computational Statistics & Data Analysis, Elsevier, vol. 55(4), pages 1828-1844, April.
    6. Abbasabadi, Narjes & Ashayeri, Mehdi & Azari, Rahman & Stephens, Brent & Heidarinejad, Mohammad, 2019. "An integrated data-driven framework for urban energy use modeling (UEUM)," Applied Energy, Elsevier, vol. 253(C), pages 1-1.
    7. Bergmeir, Christoph & Costantini, Mauro & Benítez, José M., 2014. "On the usefulness of cross-validation for directional forecast evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 76(C), pages 132-143.
    8. Matthias Schmid & Thomas Hielscher & Thomas Augustin & Olaf Gefeller, 2011. "A Robust Alternative to the Schemper–Henderson Estimator of Prediction Error," Biometrics, The International Biometric Society, vol. 67(2), pages 524-535, June.
    9. Arthur Pewsey & Eduardo García-Portugués, 2021. "Recent advances in directional statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 1-58, March.
    10. Luts, Jan & Ormerod, John T., 2014. "Mean field variational Bayesian inference for support vector machine classification," Computational Statistics & Data Analysis, Elsevier, vol. 73(C), pages 163-176.
    11. David Rios Insua & Roi Naveiro & Victor Gallego, 2020. "Perspectives on Adversarial Classification," Mathematics, MDPI, vol. 8(11), pages 1-21, November.
    12. Matthijs J. Warrens & Bunga C. Pratiwi, 2016. "Kappa Coefficients for Circular Classifications," Journal of Classification, Springer;The Classification Society, vol. 33(3), pages 507-522, October.
    13. John J Nay & Yevgeniy Vorobeychik, 2016. "Predicting Human Cooperation," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-19, May.
    14. Melissa Adelman & Francisco Haimovich & Andres Ham & Emmanuel Vazquez, 2018. "Predicting school dropout with administrative data: new evidence from Guatemala and Honduras," Education Economics, Taylor & Francis Journals, vol. 26(4), pages 356-372, July.
    15. Matthew Tuson & Berwin Turlach & Kevin Murray & Mei Ruu Kok & Alistair Vickery & David Whyatt, 2021. "Predicting Future Geographic Hotspots of Potentially Preventable Hospitalisations Using All Subset Model Selection and Repeated K-Fold Cross-Validation," IJERPH, MDPI, vol. 18(19), pages 1-21, September.
    16. Lauri Nevasalmi, 2022. "Recession forecasting with high‐dimensional data," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(4), pages 752-764, July.
    17. Hosseini, Fatemeh & Eidsvik, Jo & Mohammadzadeh, Mohsen, 2011. "Approximate Bayesian inference in spatial GLMM with skew normal latent variables," Computational Statistics & Data Analysis, Elsevier, vol. 55(4), pages 1791-1806, April.
    18. Nader Salari & Shamarina Shohaimi & Farid Najafi & Meenakshii Nallappan & Isthrinayagy Karishnarajah, 2014. "A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-50, November.
    19. Gonzalo Perez-de-la-Cruz & Guillermina Eslava-Gomez, 2019. "Discriminant analysis for discrete variables derived from a tree-structured graphical model," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(4), pages 855-876, December.
    20. Keunhyun Park & Sadegh Sabouri & Torrey Lyons & Guang Tian & Reid Ewing, 2020. "Intrazonal or interzonal? Improving intrazonal travel forecast in a four-step travel demand model," Transportation, Springer, vol. 47(5), pages 2087-2108, October.
