Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor

Author

Listed:
  • Ryuta Tamura

    (Tokyo University of Agriculture and Technology
    October Sky Co., Ltd.)

  • Ken Kobayashi

    (Fujitsu Laboratories Ltd.)

  • Yuichi Takano

    (Senshu University
    University of Tsukuba)

  • Ryuhei Miyashiro

    (Tokyo University of Agriculture and Technology)

  • Kazuhide Nakata

    (Tokyo Institute of Technology)

  • Tomomi Matsui

    (Tokyo Institute of Technology)

Abstract

Multicollinearity exists when some explanatory variables of a multiple linear regression model are highly correlated. High correlation among explanatory variables reduces the reliability of the analysis. To eliminate multicollinearity from a linear regression model, we consider how to select a subset of significant variables by means of the variance inflation factor (VIF), which is the most common indicator used in detecting multicollinearity. In particular, we adopt the mixed integer optimization (MIO) approach to subset selection. The MIO approach was proposed in the 1970s, and recently it has received renewed attention due to advances in algorithms and hardware. However, none of the existing studies have developed a computationally tractable MIO formulation for eliminating multicollinearity on the basis of VIF. In this paper, we propose mixed integer quadratic optimization (MIQO) formulations for selecting the best subset of explanatory variables subject to the upper bounds on the VIFs of selected variables. Our two MIQO formulations are based on the two equivalent definitions of VIF. Computational results illustrate the effectiveness of our MIQO formulations by comparison with conventional local search algorithms and MIO-based cutting plane algorithms.
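The two equivalent definitions of VIF mentioned in the abstract are commonly stated as VIF_j = 1 / (1 - R_j^2), where R_j^2 is the coefficient of determination from regressing the j-th explanatory variable on the others, and as the j-th diagonal entry of the inverse correlation matrix. The sketch below is an illustration only and is not the authors' MIQO formulation. It computes VIF both ways in plain Python, then picks the best-fitting subset under a VIF upper bound by brute-force enumeration, which is practical only for a handful of variables. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]  # ZeroDivisionError if singular
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    z = []
    for col in columns:
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        norm = sqrt(sum(v * v for v in centered))
        z.append([v / norm for v in centered])
    return [[sum(p * q for p, q in zip(zi, zj)) for zj in z] for zi in z]


def vif_from_inverse(corr):
    """Definition 1: VIF_j is the j-th diagonal entry of corr^{-1}."""
    k = len(corr)
    unit = lambda j: [1.0 if i == j else 0.0 for i in range(k)]
    return [solve(corr, unit(j))[j] for j in range(k)]


def vif_from_r2(corr):
    """Definition 2: VIF_j = 1 / (1 - R_j^2), x_j regressed on the rest."""
    k = len(corr)
    vifs = []
    for j in range(k):
        others = [i for i in range(k) if i != j]
        sub = [[corr[p][q] for q in others] for p in others]
        rhs = [corr[p][j] for p in others]
        r2 = sum(b * r for b, r in zip(solve(sub, rhs), rhs))
        vifs.append(1.0 / (1.0 - r2))
    return vifs


def best_subset(x_columns, y, max_vif):
    """Enumerate subsets; keep the highest R^2 whose VIFs all pass the bound."""
    p = len(x_columns)
    corr = correlation_matrix(x_columns + [y])  # last row/column is y
    best = (0.0, ())
    for size in range(1, p + 1):
        for subset in combinations(range(p), size):
            sub = [[corr[a][b] for b in subset] for a in subset]
            try:
                vifs = vif_from_inverse(sub)
            except ZeroDivisionError:
                continue  # exactly collinear subset
            # Negative values signal numerical breakdown near collinearity.
            if any(v > max_vif or v < 0 for v in vifs):
                continue
            rhs = [corr[a][p] for a in subset]
            r2 = sum(b * r for b, r in zip(solve(sub, rhs), rhs))
            if r2 > best[0]:
                best = (r2, subset)
    return best


# Hypothetical toy data: x2 is nearly a copy of x1, x3 is weakly related.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
x3 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
y = [1.0, 2.2, 2.9, 4.1, 5.2, 5.9]

print(vif_from_inverse(correlation_matrix([x1, x2, x3])))
print(best_subset([x1, x2, x3], y, max_vif=10.0))
```

Because maximizing R^2 on standardized data is the same as minimizing the residual sum of squares, the enumeration targets the same kind of constrained problem the abstract describes, although it examines all 2^p subsets rather than using an optimization solver. A bound of 10 is a widely used rule of thumb for VIF and is an assumption here, not a value taken from the paper.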

Suggested Citation

  • Ryuta Tamura & Ken Kobayashi & Yuichi Takano & Ryuhei Miyashiro & Kazuhide Nakata & Tomomi Matsui, 2019. "Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor," Journal of Global Optimization, Springer, vol. 73(2), pages 431-446, February.
  • Handle: RePEc:spr:jglopt:v:73:y:2019:i:2:d:10.1007_s10898-018-0713-3
    DOI: 10.1007/s10898-018-0713-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10898-018-0713-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Toshiki Sato & Yuichi Takano & Ryuhei Miyashiro & Akiko Yoshise, 2016. "Feature subset selection for logistic regression via mixed integer optimization," Computational Optimization and Applications, Springer, vol. 64(3), pages 865-880, July.
    2. Ian T. Jolliffe, 1982. "A Note on the Use of Principal Components in Regression," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 31(3), pages 300-303, November.
    3. Dimitris Bertsimas & Angela King, 2016. "OR Forum—An Algorithmic Approach to Linear Regression," Operations Research, INFORMS, vol. 64(1), pages 2-16, February.
    4. Miyashiro, Ryuhei & Takano, Yuichi, 2015. "Mixed integer second-order cone programming formulations for variable selection in linear regression," European Journal of Operational Research, Elsevier, vol. 247(3), pages 721-731.

    Citations

    Cited by:

    1. Chengjie Yang & Ruren Li & Zongyao Sha, 2020. "Exploring the Dynamics of Urban Greenness Space and Their Driving Factors Using Geographically Weighted Regression: A Case Study in Wuhan Metropolis, China," Land, MDPI, vol. 9(12), pages 1-21, December.
    2. Yuichi Takano & Ryuhei Miyashiro, 2020. "Best subset selection via cross-validation criterion," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(2), pages 475-488, July.
    3. Iuliia Iliashenko & Fragkoulis Papagiannis & Patrizia Gazzola & Nataliia Cherkas & Daniele Grechi, 2023. "Entrepreneurial Behaviour and Organisational Propensity to Innovate in a Public-Sector Context," Journal of Entrepreneurship and Innovation in Emerging Economies, Entrepreneurship Development Institute of India, vol. 32(1), pages 111-156, March.
    4. Ken Kobayashi & Yuichi Takano & Kazuhide Nakata, 2021. "Bilevel cutting-plane algorithm for cardinality-constrained mean-CVaR portfolio optimization," Journal of Global Optimization, Springer, vol. 81(2), pages 493-528, October.
    5. Pankaj Tiwari, 2023. "Influence of Millennials’ eco-literacy and biospheric values on green purchases: the mediating effect of attitude," Public Organization Review, Springer, vol. 23(3), pages 1195-1212, September.
    6. Tomokaze Shiratori & Ken Kobayashi & Yuichi Takano, 2020. "Prediction of hierarchical time series using structured regularization and its application to artificial neural networks," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-23, November.
    7. Wu, Hong, 2023. "Evaluating the role of renewable energy investment resources and green finance on the economic performance: Evidence from OECD economies," Resources Policy, Elsevier, vol. 80(C).
    8. Jireh Yi-Le Chan & Steven Mun Hong Leow & Khean Thye Bea & Wai Khuen Cheng & Seuk Wai Phoong & Zeng-Wei Hong & Yen-Lin Chen, 2022. "Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review," Mathematics, MDPI, vol. 10(8), pages 1-17, April.
    9. Gambella, Claudio & Ghaddar, Bissan & Naoum-Sawaya, Joe, 2021. "Optimization problems for machine learning: A survey," European Journal of Operational Research, Elsevier, vol. 290(3), pages 807-828.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Leonardo Di Gangi & M. Lapucci & F. Schoen & A. Sortino, 2019. "An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series," Computational Optimization and Applications, Springer, vol. 74(3), pages 919-948, December.
    2. Young Woong Park & Diego Klabjan, 2020. "Subset selection for multiple linear regression via optimization," Journal of Global Optimization, Springer, vol. 77(3), pages 543-574, July.
    3. Matteo Lapucci & Tommaso Levato & Marco Sciandrone, 2021. "Convergent Inexact Penalty Decomposition Methods for Cardinality-Constrained Problems," Journal of Optimization Theory and Applications, Springer, vol. 188(2), pages 473-496, February.
    4. Tao Xu & He Meng & Jie Zhu & Wei Wei & He Zhao & Han Yang & Zijin Li & Yuhan Wu, 2021. "Optimal Capacity Allocation of Energy Storage in Distribution Networks Considering Active/Reactive Coordination," Energies, MDPI, vol. 14(6), pages 1-24, March.
    5. Minjung Kyung & Ju-Hyun Park & Ji Yeh Choi, 2022. "Bayesian Mixture Model of Extended Redundancy Analysis," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 946-966, September.
    6. Hugh L. Christensen, 2015. "Algorithmic arbitrage of open-end funds using variational Bayes," International Journal of Financial Engineering (IJFE), World Scientific Publishing Co. Pte. Ltd., vol. 2(04), pages 1-38, December.
    7. Jiaju Miao & Pawel Polak, 2023. "Online Ensemble of Models for Optimal Predictive Performance with Applications to Sector Rotation Strategy," Papers 2304.09947, arXiv.org.
    8. Mirza Pasic & Halima Hadziahmetovic & Ismira Ahmovic & Mugdim Pasic, 2023. "Principal Component Regression Modeling and Analysis of PM 10 and Meteorological Parameters in Sarajevo with and without Temperature Inversion," Sustainability, MDPI, vol. 15(14), pages 1-22, July.
    9. Elkin Castaño & Santiago Gallón, 2017. "A solution for multicollinearity in stochastic frontier production function models," Lecturas de Economía, Universidad de Antioquia, Departamento de Economía, issue 86, pages 9-23, Enero - J.
    10. Ranjith Vijayakumar & Ji Yeh Choi & Eun Hwa Jung, 2022. "A Unified Neural Network Framework for Extended Redundancy Analysis," Psychometrika, Springer;The Psychometric Society, vol. 87(4), pages 1503-1528, December.
    11. Matsypura, Dmytro & Thompson, Ryan & Vasnev, Andrey L., 2018. "Optimal selection of expert forecasts with integer programming," Omega, Elsevier, vol. 78(C), pages 165-175.
    12. Anish Agarwal & Keegan Harris & Justin Whitehouse & Zhiwei Steven Wu, 2023. "Adaptive Principal Component Regression with Applications to Panel Data," Papers 2307.01357, arXiv.org, revised Oct 2023.
    13. Santiago Velásquez & Juho Kanniainen & Saku Mäkinen & Jaakko Valli, 2018. "Layoff announcements and intra-day market reactions," Review of Managerial Science, Springer, vol. 12(1), pages 203-228, January.
    14. Sandip Garai & Ranjit Kumar Paul & Debopam Rakshit & Md Yeasin & Walid Emam & Yusra Tashkandy & Christophe Chesneau, 2023. "Wavelets in Combination with Stochastic and Machine Learning Models to Predict Agricultural Prices," Mathematics, MDPI, vol. 11(13), pages 1-18, June.
    15. Ben-Ameur, Walid & Neto, José, 2022. "New bounds for subset selection from conic relaxations," European Journal of Operational Research, Elsevier, vol. 298(2), pages 425-438.
    16. Luis A. Barboza & Julien Emile-Geay & Bo Li & Wan He, 2019. "Efficient Reconstructions of Common Era Climate via Integrated Nested Laplace Approximations," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 24(3), pages 535-554, September.
    17. Manlio Gaudioso & Giovanni Giallombardo & Giovanna Miglionico, 2023. "Sparse optimization via vector k-norm and DC programming with an application to feature selection for support vector machines," Computational Optimization and Applications, Springer, vol. 86(2), pages 745-766, November.
    18. Benítez-Peña, Sandra & Bogetoft, Peter & Romero Morales, Dolores, 2020. "Feature Selection in Data Envelopment Analysis: A Mathematical Optimization approach," Omega, Elsevier, vol. 96(C).
    19. Israel R. Orimoloye & Adeyemi O. Olusola & Johanes A. Belle & Chaitanya B. Pande & Olusola O. Ololade, 2022. "Drought disaster monitoring and land use dynamics: identification of drought drivers using regression-based algorithms," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 112(2), pages 1085-1106, June.
    20. Tanin Sirimongkolkasem & Reza Drikvandi, 2019. "On Regularisation Methods for Analysis of High Dimensional Data," Annals of Data Science, Springer, vol. 6(4), pages 737-763, December.
