
Features Selection as a Nash-Bargaining Solution: Applications in Online Advertising and Information Systems

Author

Listed:
  • Kimia Keshanian

    (Information and Technology Management Department, University of Tampa, Tampa, Florida 33606)

  • Daniel Zantedeschi

    (School of Information Systems and Management, University of South Florida, Tampa, Florida 33620)

  • Kaushik Dutta

    (School of Information Systems and Management, University of South Florida, Tampa, Florida 33620)

Abstract

Feature selection is a fundamental problem in online advertising because features usually must be purchased from third parties and are therefore costly. Although many feature selection techniques can be applied in online advertising and the broader information systems (IS) domain, their performance is often context specific, and the IS literature lacks adequate, generic methods. In this study, we address this gap by proposing a novel approach that employs ideas from cooperative game theory. We derive a (continuous) second-order cone program, solvable by any convex programming solver, that determines the best subset of features. We demonstrate the efficacy of the proposed method on a real-life online advertising case study, where it outperforms the best of the other approaches in accuracy, precision, recall, and F1 score while using far fewer features. To show that the method's benefits are not limited to online advertising, we also perform an extensive set of simulations and consider a well-established real-life data set from the UCI Machine Learning Repository at the University of California, Irvine.

Summary of Contribution: Selecting the best subset of features is an important problem in online advertising and, more broadly, in information systems, because firms usually need to buy costly data to model and forecast economic outcomes. In this study, we propose a novel methodology for addressing this problem. The proposed method employs the concept of the Nash bargaining solution from cooperative game theory to balance maximizing fit against minimizing noise when selecting the best subset of features. Applied to a real-life online advertising case study, the method provides superior performance in predicting and interpreting the features. Moreover, the proposed method applies to a broader range of feature selection problems: in a comprehensive computational study on simulated regression data sets and other widely available real-life classification data sets, it proves robust in prediction accuracy, outperforming several state-of-the-art techniques.
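The Nash bargaining idea in the abstract can be illustrated with a toy sketch: treat model fit and data-cost savings as two "players," and pick the feature subset that maximizes the product of their gains over a disagreement point (buying nothing). The payoff numbers, feature names, and budget below are hypothetical, and the paper itself solves a continuous second-order cone program rather than enumerating subsets as this sketch does.

```python
from itertools import combinations

# Hypothetical per-feature payoffs: (contribution to fit, purchase cost).
features = {"age": (0.30, 2.0), "clicks": (0.45, 3.0),
            "geo": (0.15, 4.0), "device": (0.10, 1.0)}
budget = 6.0                 # total amount we could spend on data
d_fit, d_save = 0.0, 0.0     # disagreement point: no features bought

def nash_product(subset):
    """Nash product of the two players' gains for a feature subset."""
    fit = sum(features[f][0] for f in subset)   # player 1: model fit
    cost = sum(features[f][1] for f in subset)
    save = budget - cost                        # player 2: cost savings
    if save < 0:                                # over budget: infeasible
        return float("-inf")
    return (fit - d_fit) * (save - d_save)

# Enumerate all nonempty subsets and keep the Nash-optimal one.
best = max((s for r in range(1, len(features) + 1)
            for s in combinations(features, r)),
           key=nash_product)
```

With these toy payoffs the product criterion favors a small, cheap, informative subset rather than either extreme (maximal fit at full cost, or maximal savings with no features), which is the balance the abstract describes.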

Suggested Citation

  • Kimia Keshanian & Daniel Zantedeschi & Kaushik Dutta, 2022. "Features Selection as a Nash-Bargaining Solution: Applications in Online Advertising and Information Systems," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2485-2501, September.
  • Handle: RePEc:inm:orijoc:v:34:y:2022:i:5:p:2485-2501
    DOI: 10.1287/ijoc.2022.1190

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/ijoc.2022.1190
    Download Restriction: no

    File URL: https://libkey.io/10.1287/ijoc.2022.1190?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    2. Yanwu Yang & Daniel Zeng & Yinghui Yang & Jie Zhang, 2015. "Optimal Budget Allocation Across Search Advertising Markets," INFORMS Journal on Computing, INFORMS, vol. 27(2), pages 285-300, May.
    3. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    4. Hamsa Bastani & Mohsen Bayati, 2020. "Online Decision Making with High-Dimensional Covariates," Operations Research, INFORMS, vol. 68(1), pages 276-294, January.
    5. Shapley, L. S. & Shubik, Martin, 1954. "A Method for Evaluating the Distribution of Power in a Committee System," American Political Science Review, Cambridge University Press, vol. 48(3), pages 787-792, September.
    6. Miyashiro, Ryuhei & Takano, Yuichi, 2015. "Mixed integer second-order cone programming formulations for variable selection in linear regression," European Journal of Operational Research, Elsevier, vol. 247(3), pages 721-731.
    7. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    8. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    9. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    10. Loann D. Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," AMSE Working Papers 1852, Aix-Marseille School of Economics, France.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aneiros, Germán & Novo, Silvia & Vieu, Philippe, 2022. "Variable selection in functional regression models: A review," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    2. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    3. Fakhri J. Hasanov & Muhammad Javid & Frederick L. Joutz, 2022. "Saudi Non-Oil Exports before and after COVID-19: Historical Impacts of Determinants and Scenario Analysis," Sustainability, MDPI, vol. 14(4), pages 1-38, February.
    4. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    5. Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
    6. Zeng, Yaohui & Yang, Tianbao & Breheny, Patrick, 2021. "Hybrid safe–strong rules for efficient optimization in lasso-type problems," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
    7. Zakariya Yahya Algamal & Muhammad Hisyam Lee, 2019. "A two-stage sparse logistic regression for optimal gene selection in high-dimensional microarray data classification," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(3), pages 753-771, September.
    8. Soyeon Kim & Veerabhadran Baladandayuthapani & J. Jack Lee, 2017. "Prediction-Oriented Marker Selection (PROMISE): With Application to High-Dimensional Regression," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 217-245, June.
    9. Dai, Linlin & Chen, Kani & Sun, Zhihua & Liu, Zhenqiu & Li, Gang, 2018. "Broken adaptive ridge regression and its asymptotic properties," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 334-351.
    10. Armin Rauschenberger & Iuliana Ciocănea-Teodorescu & Marianne A. Jonker & Renée X. Menezes & Mark A. Wiel, 2020. "Sparse classification with paired covariates," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(3), pages 571-588, September.
    11. Bai, Ray & Ghosh, Malay, 2018. "High-dimensional multivariate posterior consistency under global–local shrinkage priors," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 157-170.
    12. Paweł Teisseyre & Robert A. Kłopotek & Jan Mielniczuk, 2016. "Random Subspace Method for high-dimensional regression with the R package regRSM," Computational Statistics, Springer, vol. 31(3), pages 943-972, September.
    13. Junyang Qian & Yosuke Tanigawa & Wenfei Du & Matthew Aguirre & Chris Chang & Robert Tibshirani & Manuel A Rivas & Trevor Hastie, 2020. "A fast and scalable framework for large-scale and ultrahigh-dimensional sparse regression with application to the UK Biobank," PLOS Genetics, Public Library of Science, vol. 16(10), pages 1-30, October.
    14. She, Yiyuan, 2012. "An iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2976-2990.
    15. Fan, Jianqing & Ke, Yuan & Wang, Kaizheng, 2020. "Factor-adjusted regularized model selection," Journal of Econometrics, Elsevier, vol. 216(1), pages 71-85.
    16. Zhihua Sun & Yi Liu & Kani Chen & Gang Li, 2022. "Broken adaptive ridge regression for right-censored survival data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(1), pages 69-91, February.
    17. Gao Wang & Abhishek Sarkar & Peter Carbonetto & Matthew Stephens, 2020. "A simple new approach to variable selection in regression, with application to genetic fine mapping," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(5), pages 1273-1300, December.
    18. Jian Huang & Yuling Jiao & Lican Kang & Jin Liu & Yanyan Liu & Xiliang Lu, 2022. "GSDAR: a fast Newton algorithm for $$\ell _0$$ ℓ 0 regularized generalized linear models with statistical guarantee," Computational Statistics, Springer, vol. 37(1), pages 507-533, March.
    19. Abhijeet R Patil & Sangjin Kim, 2020. "Combination of Ensembles of Regularized Regression Models with Resampling-Based Lasso Feature Selection in High Dimensional Data," Mathematics, MDPI, vol. 8(1), pages 1-23, January.
    20. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:orijoc:v:34:y:2022:i:5:p:2485-2501. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.