Active Feature-Value Acquisition

Authors

  • Maytal Saar-Tsechansky (McCombs School of Business, University of Texas at Austin, Austin, Texas 78712)
  • Prem Melville (IBM T.J. Watson Research Center, Yorktown Heights, New York 10598)
  • Foster Provost (Stern School of Business, New York University, New York, New York 10012)

Abstract

Most induction algorithms for building predictive models take as input training data in the form of feature vectors. Acquiring the values of features may be costly, and simply acquiring all values may be wasteful or prohibitively expensive. Active feature-value acquisition (AFA) selects features incrementally in an attempt to improve the predictive model most cost-effectively. This paper presents a framework for AFA based on estimating information value. Although straightforward in principle, estimations and approximations must be made to apply the framework in practice. We present an acquisition policy, sampled expected utility (SEU), that employs particular estimations to enable effective ranking of potential acquisitions in settings where relatively little information is available about the underlying domain. We then present experimental results showing that, compared with the policy of using representative sampling for feature acquisition, SEU reduces the cost of producing a model of a desired accuracy and exhibits consistent performance across domains. We also extend the framework to a more general modeling setting in which feature values as well as class labels are missing and are costly to acquire.
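
The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the SEU formulas, so the code assumes that a candidate acquisition is scored by its probability-weighted model improvement divided by its cost, and that only a random subsample of candidates is scored. All names (`expected_utility`, `rank_sampled_candidates`, the `value_probs`, `gain`, and `cost` callables) are hypothetical and stand in for whatever estimators a real system would supply.

```python
import random
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

# Hypothetical representation of one candidate: (instance index, feature name).
Query = Tuple[int, str]


def expected_utility(
    value_probs: Dict[Hashable, float],
    gain_if_value: Callable[[Hashable], float],
    cost: float,
) -> float:
    """Probability-weighted model improvement per unit cost (assumed scoring rule)."""
    if cost <= 0:
        raise ValueError("acquisition cost must be positive")
    expected_gain = sum(p * gain_if_value(v) for v, p in value_probs.items())
    return expected_gain / cost


def rank_sampled_candidates(
    candidates: Sequence[Query],
    value_probs: Callable[[Query], Dict[Hashable, float]],
    gain: Callable[[Query, Hashable], float],
    cost: Callable[[Query], float],
    sample_size: int,
    rng: random.Random,
) -> List[Tuple[Query, float]]:
    """Score a random subsample of candidates and return them best-first."""
    k = min(sample_size, len(candidates))
    sampled = rng.sample(list(candidates), k)

    scored: List[Tuple[Query, float]] = []
    for query in sampled:
        # Bind the current query so the per-value gain takes a single argument.
        def gain_for_value(value: Hashable, query: Query = query) -> float:
            return gain(query, value)

        score = expected_utility(value_probs(query), gain_for_value, cost(query))
        scored.append((query, score))

    # Highest expected utility per unit cost first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch the subsample size trades ranking quality against computation: scoring every missing value requires one estimate per candidate and per possible value, so sampling keeps each acquisition round tractable when little is known about the domain.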

Suggested Citation

  • Maytal Saar-Tsechansky & Prem Melville & Foster Provost, 2009. "Active Feature-Value Acquisition," Management Science, INFORMS, vol. 55(4), pages 664-684, April.
  • Handle: RePEc:inm:ormnsc:v:55:y:2009:i:4:p:664-684
    DOI: 10.1287/mnsc.1080.0952

    References listed on IDEAS

    1. Kevin F. McCardle, 1985. "Information Acquisition and the Adoption of New Technology," Management Science, INFORMS, vol. 31(11), pages 1372-1389, November.
    2. Michael V. Mannino & Vijay S. Mookerjee, 1999. "Optimizing Expert Systems: Heuristics for Efficiently Generating Low-Cost Information Acquisition Strategies," INFORMS Journal on Computing, INFORMS, vol. 11(3), pages 278-291, August.
    3. Zhiqiang Zheng & Balaji Padmanabhan, 2006. "Selectively Acquiring Customer Information: A New Data Acquisition Problem and an Active Learning-Based Solution," Management Science, INFORMS, vol. 52(5), pages 697-712, May.

    Citations

    Cited by:

    1. Meghana Deodhar & Joydeep Ghosh & Maytal Saar-Tsechansky & Vineet Keshari, 2017. "Active Learning with Multiple Localized Regression Models," INFORMS Journal on Computing, INFORMS, vol. 29(3), pages 503-522, August.
    2. Zhepeng Li & Xiao Fang & Xue Bai & Olivia R. Liu Sheng, 2017. "Utility-Based Link Recommendation for Online Social Networks," Management Science, INFORMS, vol. 63(6), pages 1938-1952, June.
    3. Shantanu Gupta & Zachary C. Lipton & David Childers, 2021. "Efficient Online Estimation of Causal Effects by Deciding What to Observe," Papers 2108.09265, arXiv.org, revised Oct 2021.
    4. Xinyi Zhang & Chenshuo Sun & Renyu Zhang & Khim-Yong Goh, 2024. "The Value of AI-Generated Metadata for UGC Platforms: Evidence from a Large-scale Field Experiment," Papers 2412.18337, arXiv.org.
    5. Kaiquan Xu & Stephen Shaoyi Liao & Raymond Y. K. Lau & J. Leon Zhao, 2014. "Effective Active Learning Strategies for the Use of Large-Margin Classifiers in Semantic Annotation: An Optimal Parameter Discovery Perspective," INFORMS Journal on Computing, INFORMS, vol. 26(3), pages 461-483, August.
    6. Xiaoping Liu & Xiao-Bai Li & Sumit Sarkar, 2023. "Cost-Restricted Feature Selection for Data Acquisition," Management Science, INFORMS, vol. 69(7), pages 3976-3992, July.
    7. Xuan Bi & Mochen Yang & Gediminas Adomavicius, 2024. "Consumer Acquisition for Recommender Systems: A Theoretical Framework and Empirical Evaluations," Information Systems Research, INFORMS, vol. 35(1), pages 339-362, March.
    8. repec:wbk:wbrwps:10255 is not listed on IDEAS
    9. Hung-Pin Kao & Kwei Tang, 2014. "Cost-Sensitive Decision Tree Induction with Label-Dependent Late Constraints," INFORMS Journal on Computing, INFORMS, vol. 26(2), pages 238-252, May.
    10. Jing Wang & Panagiotis G. Ipeirotis & Foster Provost, 2017. "Cost-Effective Quality Assurance in Crowd Labeling," Information Systems Research, INFORMS, vol. 28(1), pages 137-158, March.
    11. Fan Zhou & Kunpeng Zhang & Bangying Wu & Yi Yang & Harry Jiannan Wang, 2021. "Unifying Online and Offline Preference for Social Link Prediction," INFORMS Journal on Computing, INFORMS, vol. 33(4), pages 1400-1418, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiaoping Liu & Xiao-Bai Li, 2024. "Cost-Effective Acquisition of First-Party Data for Business Analytics," INFORMS Journal on Computing, INFORMS, vol. 36(5), pages 1242-1260, September.
    2. Kathy A. Paulson Gjerde & Susan A. Slotnick & Matthew J. Sobel, 2002. "New Product Innovation with Multiple Features and Technology Constraints," Management Science, INFORMS, vol. 48(10), pages 1268-1284, October.
    3. Zhiyuan Wang & Zhiqiang (Eric) Zheng & Wei Jiang & Shaojie Tang, 2021. "Blockchain‐Enabled Data Sharing in Supply Chains: Model, Operationalization, and Tutorial," Production and Operations Management, Production and Operations Management Society, vol. 30(7), pages 1965-1985, July.
    4. Churlzu Lim & J. Neil Bearden & J. Cole Smith, 2006. "Sequential Search with Multiattribute Options," Decision Analysis, INFORMS, vol. 3(1), pages 3-15, March.
    5. H. Dharma Kwon & Wenxin Xu & Anupam Agrawal & Suresh Muthulingam, 2016. "Impact of Bayesian Learning and Externalities on Strategic Investment," Management Science, INFORMS, vol. 62(2), pages 550-570, February.
    6. Tetsuya Kasahara, 2015. "Strategic Technology Adoption Under Dispersed Information and Information Learning," International Journal of Innovation and Technology Management (IJITM), World Scientific Publishing Co. Pte. Ltd., vol. 12(06), pages 1-18, December.
    7. Saša Zorc & Ilia Tsetlin, 2020. "Deadlines, Offer Timing, and the Search for Alternatives," Operations Research, INFORMS, vol. 68(3), pages 927-948, May.
    8. J. G. Smythe, 2002. "Reputation, public information, and physician adoption of an innovation," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 3(2), pages 103-110, June.
    9. N. Bora Keskin & John R. Birge, 2019. "Dynamic Selling Mechanisms for Product Differentiation and Learning," Operations Research, INFORMS, vol. 67(4), pages 1069-1089, July.
    10. Khanh T.P. Nguyen & Thomas Yeung & Bruno Castanier, 2017. "Acquisition of new technology information for maintenance and replacement policies," International Journal of Production Research, Taylor & Francis Journals, vol. 55(8), pages 2212-2231, April.
    11. Hagspiel, Verena & Huisman, Kuno J.M. & Kort, Peter M. & Lavrutich, Maria N. & Nunes, Cláudia & Pimentel, Rita, 2020. "Technology adoption in a declining market," European Journal of Operational Research, Elsevier, vol. 285(1), pages 380-392.
    12. Elie Ofek & Muhamet Yildiz & Ernan Haruvy, 2007. "The Impact of Prior Decisions on Subsequent Valuations in a Costly Contemplation Model," Management Science, INFORMS, vol. 53(8), pages 1217-1233, August.
    13. Klemen Knez, 2023. "Technology diffusion and uneven development," Journal of Evolutionary Economics, Springer, vol. 33(4), pages 1171-1195, September.
    14. Jafarizadeh, Babak, 2012. "Information acquisition as an American option," Energy Economics, Elsevier, vol. 34(3), pages 807-816.
    15. Ana María Sánchez Pérez & Jorge Tarifa Fernández & Salvador Cruz Rambaud, 2020. "Assessing Blockchain Investments through the Learning Option: An Application to the Automotive and Aerospace Industry," Mathematics, MDPI, vol. 8(12), pages 1-13, December.
    16. Hagspiel, V., 2011. "Flexibility in technology choice : A real options approach," Other publications TiSEM 4150e2d4-6ca2-4367-a8b9-2, Tilburg University, School of Economics and Management.
    17. Mariotti, Thomas & Décamps, Jean-Paul & Gensbittel, Fabien, 2021. "Investment Timing and Technological Breakthrough," CEPR Discussion Papers 16246, C.E.P.R. Discussion Papers.
    18. Yingfei Wang & Inbal Yahav & Balaji Padmanabhan, 2024. "Smart Testing with Vaccination: A Bandit Algorithm for Active Sampling for Managing COVID-19," Information Systems Research, INFORMS, vol. 35(1), pages 120-144, March.
    19. Shivam Gupta & Anupam Agrawal & Jennifer K. Ryan, 2023. "Agile contracting: Managing incentives under uncertain needs," Production and Operations Management, Production and Operations Management Society, vol. 32(3), pages 972-988, March.
    20. Nur Sunar & John R. Birge & Sinit Vitavasiri, 2019. "Optimal Dynamic Product Development and Launch for a Network of Customers," Operations Research, INFORMS, vol. 67(3), pages 770-790, May.
