Assessing Data Quality for Information Products: Impact of Selection, Projection, and Cartesian Product

Author

Listed:
  • Amir Parssian

    (College of Business and Management, University of Illinois at Springfield, Springfield, Illinois 62703)

  • Sumit Sarkar

    (School of Management, University of Texas at Dallas, Richardson, Texas 75080)

  • Varghese S. Jacob

    (School of Management, University of Texas at Dallas, Richardson, Texas 75080)

Abstract

The cost associated with making decisions based on poor-quality data is quite high. Consequently, the management of data quality and the quality of associated data management processes has become critical for organizations. An important first step in managing data quality is the ability to measure the quality of information products (derived data) based on the quality of the source data and associated processes used to produce the information outputs. We present a methodology to determine two data quality characteristics, accuracy and completeness, that are of critical importance to decision makers. We examine how the quality metrics of source data affect the quality for information outputs produced using the relational algebra operations selection, projection, and Cartesian product. Our methodology is general, and can be used to determine how quality characteristics associated with diverse data sources affect the quality of the derived data.
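
As a rough illustration of how a source-level quality metric can carry over to derived data, the sketch below uses a deliberately simplified model that is not taken from the paper. It assumes each source tuple carries a hypothetical true/false accuracy flag, and that a tuple in a Cartesian product counts as accurate only when both of its component tuples are accurate. The names `accuracy`, `r_flags`, and `s_flags` are invented for this example.

```python
from fractions import Fraction
from itertools import product


def accuracy(flags):
    """Return the fraction of tuples flagged as accurate."""
    flags = list(flags)
    return Fraction(sum(flags), len(flags))


# Hypothetical per-tuple flags for two source relations (True = accurate).
r_flags = [True, True, True, False]        # 3 of 4 tuples accurate
s_flags = [True, False, True, True, True]  # 4 of 5 tuples accurate

# Assumption: a product tuple is accurate only if both components are accurate.
product_flags = [a and b for a, b in product(r_flags, s_flags)]

print(accuracy(r_flags), accuracy(s_flags), accuracy(product_flags))
```

Under this assumption the accuracy of the product equals the product of the source accuracies, so combining two imperfect sources yields derived data that is less accurate than either source. The paper's own metrics and derivations are given in the full text.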

Suggested Citation

  • Amir Parssian & Sumit Sarkar & Varghese S. Jacob, 2004. "Assessing Data Quality for Information Products: Impact of Selection, Projection, and Cartesian Product," Management Science, INFORMS, vol. 50(7), pages 967-982, July.
  • Handle: RePEc:inm:ormnsc:v:50:y:2004:i:7:p:967-982
    DOI: 10.1287/mnsc.1040.0237

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/mnsc.1040.0237
    Download Restriction: no

    References listed on IDEAS

    1. Kon, Henry B. & Madnick, Stuart E. & Siegel, Michael D., 1995. "Good answers from bad data : a data management strategy," Working papers 3868-95., Massachusetts Institute of Technology (MIT), Sloan School of Management.
    2. Donald Ballou & Richard Wang & Harold Pazer & Giri Kumar Tayi, 1998. "Modeling Information Manufacturing Systems to Determine Information Product Quality," Management Science, INFORMS, vol. 44(4), pages 462-484, April.

    Citations

    Cited by:

    1. Amir Parssian & Sumit Sarkar & Varghese S. Jacob, 2009. "Impact of the Union and Difference Operations on the Quality of Information Products," Information Systems Research, INFORMS, vol. 20(1), pages 99-120, March.
    2. Qi Liu & Gengzhong Feng & Nengmin Wang & Giri Kumar Tayi, 2018. "A multi-objective model for discovering high-quality knowledge based on data quality and prior knowledge," Information Systems Frontiers, Springer, vol. 20(2), pages 401-416, April.
    3. Hazen, Benjamin T. & Boone, Christopher A. & Ezell, Jeremy D. & Jones-Farmer, L. Allison, 2014. "Data quality for data science, predictive analytics, and big data in supply chain management: An introduction to the problem and suggestions for research and applications," International Journal of Production Economics, Elsevier, vol. 154(C), pages 72-80.
    4. Dominikus Kleindienst, 2017. "The data quality improvement plan: deciding on choice and sequence of data quality improvements," Electronic Markets, Springer;IIM University of St. Gallen, vol. 27(4), pages 387-398, November.
    5. Qi Liu & Gengzhong Feng & Giri Kumar Tayi & Jun Tian, 2021. "Managing Data Quality of the Data Warehouse: A Chance-Constrained Programming Approach," Information Systems Frontiers, Springer, vol. 23(2), pages 375-389, April.
    6. Debabrata Dey & Subodha Kumar, 2013. "Data Quality of Query Results with Generalized Selection Conditions," Operations Research, INFORMS, vol. 61(1), pages 17-31, February.
    7. Debabrata Dey & Subodha Kumar, 2010. "Reassessing Data Quality for Information Products," Management Science, INFORMS, vol. 56(12), pages 2316-2322, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Amir Parssian & Sumit Sarkar & Varghese S. Jacob, 2009. "Impact of the Union and Difference Operations on the Quality of Information Products," Information Systems Research, INFORMS, vol. 20(1), pages 99-120, March.
    2. Debabrata Dey & Subodha Kumar, 2013. "Data Quality of Query Results with Generalized Selection Conditions," Operations Research, INFORMS, vol. 61(1), pages 17-31, February.
    3. Xitong Li & Hongwei Zhu & Luo Zuo, 2021. "Reporting Technologies and Textual Readability: Evidence from the XBRL Mandate," Information Systems Research, INFORMS, vol. 32(3), pages 1025-1042, September.
    4. Juha-Miikka Nurmilaakso, 2014. "Coordination costs and ICT investments: an economic analysis," Netnomics, Springer, vol. 15(2), pages 57-67, September.
    5. Xiao, Yu & Lu, Louis Y.Y. & Liu, John S. & Zhou, Zhili, 2014. "Knowledge diffusion path analysis of data quality literature: A main path analysis," Journal of Informetrics, Elsevier, vol. 8(3), pages 594-605.
    6. Davidson, Ian & Tayi, Giri, 2009. "Data preparation using data quality matrices for classification mining," European Journal of Operational Research, Elsevier, vol. 197(2), pages 764-772, September.
    7. Even, Adir & Shankaranarayanan, G. & Berger, Paul D., 2010. "Managing the Quality of Marketing Data: Cost/benefit Tradeoffs and Optimal Configuration," Journal of Interactive Marketing, Elsevier, vol. 24(3), pages 209-221.
    8. Paul Glowalla & Ali Sunyaev, 2013. "Process-Driven Data Quality Management Through Integration of Data Quality into Existing Process Models," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 5(6), pages 433-448, December.
    9. Klein, B. D. & Rossin, D. F., 1999. "Data quality in neural network models: effect of error rate and magnitude of error on predictive accuracy," Omega, Elsevier, vol. 27(5), pages 569-582, October.
    10. Bonney, Maurice & Jaber, Mohamad Y., 2013. "Developing an input–output activity matrix (IOAM) for environmental and economic analysis of manufacturing systems and logistics chains," International Journal of Production Economics, Elsevier, vol. 143(2), pages 589-597.
    11. Rajiv D. Banker & Robert J. Kauffman, 2004. "50th Anniversary Article: The Evolution of Research on Information Systems: A Fiftieth-Year Survey of the Literature in Management Science," Management Science, INFORMS, vol. 50(3), pages 281-298, March.
    12. Debabrata Dey & Subodha Kumar, 2010. "Reassessing Data Quality for Information Products," Management Science, INFORMS, vol. 56(12), pages 2316-2322, December.
    13. Maria Grazia Fugini & Barbara Pernici & Filippo Ramoni, 2009. "Quality analysis of composed services through fault injection," Information Systems Frontiers, Springer, vol. 11(3), pages 227-239, July.
    14. André Marie Mbakop & Joseph Voufo & Florent Biyeme & Louise Angèle Ngozag & Lucien Meva’a, 2021. "Analysis of Information Flow Characteristics in Shop Floor: State-of-the-Art and Future Research Directions for Developing Countries," Global Journal of Flexible Systems Management, Springer;Global Institute of Flexible Systems Management, vol. 22(1), pages 43-53, March.
    15. Park, JungKun & Chung, HoEun & Yoo, Weon Sang, 2009. "Is the Internet a primary source for consumer information search?: Group comparison for channel choices," Journal of Retailing and Consumer Services, Elsevier, vol. 16(2), pages 92-99.
    16. Hazen, Benjamin T. & Weigel, Fred K. & Ezell, Jeremy D. & Boehmke, Bradley C. & Bradley, Randy V., 2017. "Toward understanding outcomes associated with data quality improvement," International Journal of Production Economics, Elsevier, vol. 193(C), pages 737-747.
    17. Hazen, Benjamin T. & Boone, Christopher A. & Ezell, Jeremy D. & Jones-Farmer, L. Allison, 2014. "Data quality for data science, predictive analytics, and big data in supply chain management: An introduction to the problem and suggestions for research and applications," International Journal of Production Economics, Elsevier, vol. 154(C), pages 72-80.
    18. Xue Bai, 2012. "A Mathematical Framework for Data Quality Management in Enterprise Systems," INFORMS Journal on Computing, INFORMS, vol. 24(4), pages 648-664, November.
    19. Melchor Medina José & Lavín Verástegui Jesús & Pedraza Melo Norma Angélica, 2012. "Seguridad en la administración y calidad de los datos de un sistema de información contable en el desempeño organizacional," Contaduría y Administración, Accounting and Management, vol. 57(4), pages 11-34, octubre-d.
    20. Dominikus Kleindienst, 2017. "The data quality improvement plan: deciding on choice and sequence of data quality improvements," Electronic Markets, Springer;IIM University of St. Gallen, vol. 27(4), pages 387-398, November.
