
Regularized Regression Incorporating Network Information: Simultaneous Estimation of Covariate Coefficients and Connection Signs

Authors

Listed:
  • Matthias Weber

    (University of Amsterdam, the Netherlands)

  • Martin Schumacher

    (University Medical Center, Freiburg)

  • Harald Binder

    (University Medical Center, Mainz, Germany)

Abstract

We develop an algorithm that incorporates network information into regression settings. It simultaneously estimates the covariate coefficients and the signs of the network connections (i.e., whether each connection is of an activating or a repressing type). For the coefficient estimation steps, an additional penalty is placed on top of the lasso penalty, similar to Li and Li (2008). We develop a fast implementation of the new method based on coordinate descent. Furthermore, we show how the new method can be applied to time-to-event data. The new method yields good results in simulation studies concerning sensitivity and specificity of non-zero covariate coefficients, estimation of network connection signs, and prediction performance. We also apply the new method to two microarray time-to-event data sets from patients with ovarian cancer and diffuse large B-cell lymphoma. The new method performs very well in both cases. The main application of this new method is biomedical in nature, but it may also be useful in other fields where network data are available.
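The alternating scheme described in the abstract can be illustrated with a minimal sketch for a linear model. This is not the authors' implementation: the objective, the degree scaling (borrowed from the Li and Li style of network penalty), the sign-update rule (sign of the product of the two current coefficients) and the function name `fit_network_lasso` are all assumptions made for illustration. The assumed objective is (1/2n)·||y − Xb||² + lam1·Σ|b_j| + (lam2/2)·Σ over edges (j,k) of (b_j/√d_j − s_jk·b_k/√d_k)², where d_j is the number of neighbours of covariate j and s_jk ∈ {+1, −1} is the connection sign.

```python
import math


def soft_threshold(z, t):
    """Soft-thresholding operator used in lasso coordinate updates."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_network_lasso(X, y, edges, lam1, lam2, n_sweeps=200):
    """Hypothetical sketch: coordinate descent for a lasso plus network
    penalty, alternating with a heuristic update of the edge signs.
    Assumes every column of X is non-zero and edges join distinct nodes."""
    n, p = len(X), len(X[0])
    nbrs = [[] for _ in range(p)]
    for j, k in edges:
        nbrs[j].append(k)
        nbrs[k].append(j)
    deg = [len(v) for v in nbrs]
    # All connections start as "activating" (+1); this start is an assumption.
    signs = {frozenset(e): 1.0 for e in edges}
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_sweeps):
        # Coefficient step: one full coordinate descent sweep.
        for j in range(p):
            # Correlation of column j with the partial residual.
            c = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            # Pull towards the (signed, degree-scaled) neighbouring coefficients.
            g = sum(
                signs[frozenset((j, k))] * beta[k] / math.sqrt(deg[j] * deg[k])
                for k in nbrs[j]
            )
            denom = col_sq[j] + (lam2 if deg[j] > 0 else 0.0)
            new = soft_threshold(c + lam2 * g, lam1) / denom
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
        # Sign step (assumed rule): match the sign of the coefficient product;
        # keep the previous sign when either coefficient is zero.
        for j, k in edges:
            prod = beta[j] * beta[k]
            if prod != 0.0:
                signs[frozenset((j, k))] = 1.0 if prod > 0 else -1.0
    return beta, signs


# Toy data: covariates 0 and 1 are linked and act in opposite directions;
# covariate 2 is irrelevant and not part of the network.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2 * row[0] - 2 * row[1] for row in X]
beta, signs = fit_network_lasso(X, y, edges=[(0, 1)], lam1=0.1, lam2=0.5)
```

In this toy run the edge between covariates 0 and 1 should end up with a negative (repressing) sign, after which the network penalty no longer pulls the two coefficients apart and only the lasso shrinkage remains. For time-to-event data, the squared-error term would have to be replaced by a partial-likelihood term, which this sketch does not attempt.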

Suggested Citation

  • Matthias Weber & Martin Schumacher & Harald Binder, 2014. "Regularized Regression Incorporating Network Information: Simultaneous Estimation of Covariate Coefficients and Connection Signs," Tinbergen Institute Discussion Papers 14-089/I, Tinbergen Institute.
  • Handle: RePEc:tin:wpaper:20140089

    Download full text from publisher

    File URL: https://papers.tinbergen.nl/14089.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    2. Daniela M. Witten & Robert Tibshirani, 2009. "Covariance-regularized regression and classification for high dimensional problems," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 615-636, June.
    3. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    4. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    5. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    6. Thomas A. Gerds & Martin Schumacher, 2007. "Efron-Type Measures of Prediction Error for Survival Analysis," Biometrics, The International Biometric Society, vol. 63(4), pages 1283-1287, December.
    7. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.

    More about this item

    Keywords

    high-dimensional data; gene expression data; pathway information; penalized regression

    JEL classification:

    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • C41 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Duration Analysis; Optimal Timing Strategies
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis

