
On quantile quantile plots for generalized linear models

Author

Listed:
  • Augustin, Nicole H.
  • Sauleau, Erik-André
  • Wood, Simon N.

Abstract

The distributional assumption for a generalized linear model is often checked by plotting the ordered deviance residuals against the quantiles of a standard normal distribution. Such plots can be difficult to interpret because, even when the model is correct, the plot often deviates substantially from a straight line. To rectify this problem, Ben and Yohai (2004) proposed plotting the deviance residuals against their theoretical quantiles, under the assumption that the model is correct. Such plots are closer to a straight line when the model is correct, making them much more useful for model checking. However, the quantile computation proposed in Ben and Yohai is, in general, relatively complicated to implement and computationally expensive, so that general purpose software for these plots is only available for the Poisson and binary cases in the R package robust. As an alternative, the theoretical quantiles can be estimated simply and efficiently by repeatedly simulating new response data from the fitted model and computing the corresponding residuals. This method also provides reference bands for judging the significance of departures of QQ-plots from the ideal straight-line form. A second alternative is to estimate the quantiles using quantiles of the response variable distribution according to the estimated model. This latter alternative generally has lower computational cost than the first, but does not yield QQ-plot reference bands. In simulations, the quantiles produced by the new methods give results indistinguishable from the original Ben and Yohai quantile computations, but the scaling of computational cost with sample size is much improved, so that a 500-fold reduction in computation time was observed at sample size 50,000. Application of the methods to generalized linear models fitted to prostate cancer incidence data suggests that they are particularly useful in large dataset cases that might otherwise be incorrectly viewed as zero-inflated. The new approaches are simple enough to implement for any exponential family distribution and for several alternative types of residual, and this has been done for all the families available for use with generalized linear models in the basic distribution of R.
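
The simulation-based approach described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the authors' implementation (which the abstract says was written for R). It assumes a Poisson model whose fitted means are already available, simulates new responses from those means without refitting, estimates each theoretical quantile as the mean of the corresponding order statistic across replicates, and takes pointwise bands from the empirical quantiles of those order statistics. The function names, the choice of estimator, and the small data set are all hypothetical.

```python
import math
import random


def poisson_deviance_residual(y, mu):
    """Signed square root of one observation's Poisson deviance contribution."""
    # y * log(y / mu) is taken as 0 when y == 0 (its limit as y -> 0).
    term = y * math.log(y / mu) if y > 0 else 0.0
    dev = 2.0 * (term - (y - mu))
    return math.copysign(math.sqrt(max(dev, 0.0)), y - mu)


def draw_poisson(mu, rng):
    """Draw one Poisson(mu) variate by Knuth's multiplication method.

    Adequate for the small means used here; slow for large mu.
    """
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulated_qq(y, mu, n_rep=200, level=0.9, seed=0):
    """Estimate theoretical residual quantiles and pointwise reference bands.

    y  : observed counts
    mu : fitted means, treated as given (assumption: no refitting per replicate)
    Returns (observed, theoretical, lower, upper), each sorted and of length n.
    """
    rng = random.Random(seed)
    n = len(y)
    observed = sorted(poisson_deviance_residual(yi, mi) for yi, mi in zip(y, mu))

    # Each entry of sims is the sorted residual vector of one simulated data set.
    sims = []
    for _ in range(n_rep):
        sims.append(
            sorted(poisson_deviance_residual(draw_poisson(m, rng), m) for m in mu)
        )

    # Indices of the lower and upper band within the n_rep sorted replicates.
    lo_idx = int(math.floor((1.0 - level) / 2.0 * (n_rep - 1)))
    hi_idx = n_rep - 1 - lo_idx

    theoretical, lower, upper = [], [], []
    for i in range(n):
        column = sorted(s[i] for s in sims)  # i-th order statistic, all replicates
        theoretical.append(sum(column) / n_rep)
        lower.append(column[lo_idx])
        upper.append(column[hi_idx])
    return observed, theoretical, lower, upper


# Hypothetical fitted means and observed counts, for illustration only.
mu = [0.5, 1.0, 2.0, 4.0, 8.0]
y = [0, 2, 1, 5, 7]
observed, theoretical, lower, upper = simulated_qq(y, mu)
for q, r, lo, hi in zip(theoretical, observed, lower, upper):
    print(f"theoretical={q:6.3f}  observed={r:6.3f}  band=[{lo:6.3f}, {hi:6.3f}]")
```

Plotting `observed` against `theoretical`, with `lower` and `upper` drawn as a band, gives a QQ-plot of the kind the abstract describes. An observed residual falling outside its band would suggest a departure from the assumed distribution at that quantile.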

Suggested Citation

  • Augustin, Nicole H. & Sauleau, Erik-André & Wood, Simon N., 2012. "On quantile quantile plots for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 56(8), pages 2404-2409.
  • Handle: RePEc:eee:csdana:v:56:y:2012:i:8:p:2404-2409
    DOI: 10.1016/j.csda.2012.01.026

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947312000692
    Download Restriction: Full text for ScienceDirect subscribers only.

    As the access to this document is restricted, you may want to search for a different version of it.

