Principles of Data Mining


Author Info

  • David J. Hand

    (Imperial College)

  • Heikki Mannila

    (Helsinki University of Technology)

  • Padhraic Smyth

    (University of California, Irvine)


    The growing interest in data mining is motivated by a common problem across disciplines: how does one store, access, model, and ultimately describe and understand very large data sets? Historically, different aspects of data mining have been addressed independently by different disciplines. This is the first truly interdisciplinary text on data mining, blending the contributions of information science, computer science, and statistics. The book consists of three sections. The first, foundations, provides a tutorial overview of the principles underlying data mining algorithms and their application. The presentation emphasizes intuition rather than rigor. The second section, data mining algorithms, shows how algorithms are constructed to solve specific problems in a principled manner. The algorithms covered include trees and rules for classification and regression, association rules, belief networks, classical statistical models, nonlinear models such as neural networks, and local "memory-based" models. The third section shows how all of the preceding analysis fits together when applied to real-world data mining problems. Topics include the role of metadata, how to handle missing data, and data preprocessing.

    Bibliographic Info

    This book is provided by The MIT Press in its series MIT Press Books with number 026208290x and published in 2001.

    Volume: 1
    Edition: 1
    ISBN: 0-262-08290-X
    Handle: RePEc:mtp:titles:026208290x

    Related research

    Keywords: data mining; algorithms; statistical models;

    Cited by:
    1. Malliaris, A.G. & Malliaris, Mary, 2011. "Are foreign currency markets interdependent? evidence from data mining technologies," MPRA Paper 35261, University Library of Munich, Germany.
    2. M. Almiñana & L. Escudero & A. Pérez-Martín & A. Rabasa & L. Santamaría, 2014. "A classification rule reduction algorithm based on significance domains," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer, vol. 22(1), pages 397-418, April.
    3. Christmann, Andreas & Steinwart, Ingo & Hubert, Mia, 2007. "Robust learning from bites for data mining," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 347-361, September.
    4. Robert Till & David Hand, 2003. "Behavioural models of credit card usage," Journal of Applied Statistics, Taylor & Francis Journals, vol. 30(10), pages 1201-1220.
    5. Doumpos, Michael & Zopounidis, Constantin, 2011. "Preference disaggregation and statistical learning for multicriteria decision support: A review," European Journal of Operational Research, Elsevier, vol. 209(3), pages 203-214, March.
    6. Wang, Wenjun & Liu, Dong & Liu, Xiao & Pan, Lin, 2013. "Fuzzy overlapping community detection based on local random walk and multidimensional scaling," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(24), pages 6578-6586.
    7. Adrien Jamain & David Hand, 2008. "Mining Supervised Classification Performance Studies: A Meta-Analytic Investigation," Journal of Classification, Springer, vol. 25(1), pages 87-112, June.
    8. Carrizosa, Emilio & Martín-Barragán, Belén & Morales, Dolores Romero, 2011. "Detecting relevant variables and interactions in supervised classification," European Journal of Operational Research, Elsevier, vol. 213(1), pages 260-269, August.
    9. Ladias, Christos & Hasanagas, Nikolaos & Papadopoulou, Eleni, 2011. "Conceptualising ‘macro-regions’: Viewpoints and tools beyond NUTS classification," Studies in Agricultural Economics, Research Institute for Agricultural Economics, vol. 113(2), October.
    10. Adrian Costea, 2011. "Assessing The Performance Of Non-Banking Financial Institutions – A Knowledge Discovery Approach," Annals of University of Craiova - Economic Sciences Series, University of Craiova, Faculty of Economics and Business Administration, vol. 3(39), pages 174-185.
    11. Hand, David J., 2009. "Mining the past to determine the future: Problems and possibilities," International Journal of Forecasting, Elsevier, vol. 25(3), pages 441-451, July.
    12. Fang, Jiali & Jacobsen, Ben & Qin, Yafeng, 2014. "Predictability of the simple technical trading rules: An out-of-sample test," Review of Financial Economics, Elsevier, vol. 23(1), pages 30-45.
    13. Loebbecke, Claudia & Huyskens, Claudio, 2009. "Development of a model-based netsourcing decision support system using a five-stage methodology," European Journal of Operational Research, Elsevier, vol. 195(3), pages 653-661, June.
    14. Bőgel, György, 2011. "Az adatrobbanás mint közgazdasági jelenség [The data explosion as an economic phenomenon]," Közgazdasági Szemle (Economic Review - monthly of the Hungarian Academy of Sciences), Közgazdasági Szemle Alapítvány (Economic Review Foundation), vol. 0(10), pages 877-889.
    15. Fernandez del Pozo, J. A. & Bielza, C. & Gomez, M., 2005. "A list-based compact representation for large decision tables management," European Journal of Operational Research, Elsevier, vol. 160(3), pages 638-662, February.

