The performance of diagnostic-robust generalized potentials for the identification of multiple high leverage points in linear regression

Author

Listed:
  • M. Habshah
  • M. R. Norazan
  • A.H.M. Rahmatullah Imon

Abstract

Leverage values are used in regression diagnostics as measures of influential observations in the $X$-space. Detecting high leverage values is crucial because they can lead to misleading conclusions about the fit of a regression model, cause multicollinearity problems, and mask or swamp outliers. Much work has been done on the identification of single high leverage points, and it is generally believed that the problem of detecting a single high leverage point has been largely resolved. There is, however, no general agreement among statisticians about the detection of multiple high leverage points. When a group of high leverage points is present in a data set, the commonly used diagnostic methods fail to identify them correctly, mainly because of masking and/or swamping effects. Robust alternative methods, on the other hand, can identify the high leverage points correctly, but they tend to flag too many low leverage points as high leverage points, which is also undesirable. We attempt a compromise between these two approaches. We propose an adaptive method in which the suspected high leverage points are identified by robust methods and the low leverage points (if any) are then put back into the estimation data set after diagnostic checking. The usefulness of the proposed method for detecting multiple high leverage points is studied using some well-known data sets and Monte Carlo simulations.
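
The two-stage idea in the abstract (flag suspects with a robust rule, then return the ones that diagnostic checking clears) can be sketched for the simplest case of one predictor plus an intercept. The abstract gives no formulas, so everything below is an assumption made for illustration: the median/MAD distance stands in for a multivariate robust distance, the generalized potentials use one common definition (w for deleted points, w / (1 - w) for retained ones, where w is the leverage relative to the retained set), the cutoff is median + 3 MAD, and the function names and the deliberately liberal `screen_cutoff` are hypothetical. It is not the authors' implementation.

```python
from statistics import mean, median


def mad(values):
    """Normalized median absolute deviation (assumed scale estimate)."""
    m = median(values)
    return median([abs(v - m) for v in values]) / 0.6745


def classical_leverage(x):
    """Hat values h_ii for a simple regression with intercept (p = 2)."""
    m = mean(x)
    sxx = sum((v - m) ** 2 for v in x)
    return [1 / len(x) + (v - m) ** 2 / sxx for v in x]


def generalized_potentials(x, deleted):
    """Potentials of every point relative to a fit that excludes `deleted`.

    Assumed definition: w for deleted points, w / (1 - w) for retained ones.
    """
    kept = [v for i, v in enumerate(x) if i not in deleted]
    m = mean(kept)
    sxx = sum((v - m) ** 2 for v in kept)
    potentials = []
    for i, v in enumerate(x):
        w = 1 / len(kept) + (v - m) ** 2 / sxx
        potentials.append(w if i in deleted else w / (1 - w))
    return potentials


def adaptive_high_leverage(x, screen_cutoff=1.0, c=3.0):
    """Two-stage sketch: robust screening, then put cleared points back."""
    # Stage 1: liberal robust screening (hypothetical cutoff), which may
    # also suspect ordinary points.
    med, scale = median(x), mad(x)
    suspects = {i for i, v in enumerate(x)
                if abs(v - med) / scale > screen_cutoff}
    # Stage 2: return suspects one at a time, least extreme first, until
    # the least extreme remaining suspect exceeds the cutoff.
    while suspects:
        pot = generalized_potentials(x, suspects)
        cutoff = median(pot) + c * mad(pot)
        least = min(suspects, key=lambda i: pot[i])
        if pot[least] > cutoff:
            break
        suspects.remove(least)
    return sorted(suspects)


# Toy data: twelve ordinary points and a cluster of five far-away points.
x = [float(v) for v in range(1, 13)] + [30.0, 31.0, 32.0, 33.0, 34.0]
n, p = len(x), 2
classical_flags = [i for i, h in enumerate(classical_leverage(x))
                   if h > 2 * p / n]
adaptive_flags = adaptive_high_leverage(x)
print("classical 2p/n rule flags:", classical_flags)
print("adaptive sketch flags:", adaptive_flags)
```

On this toy data the classical 2p/n rule flags nothing, because the five clustered points at 30 to 34 mask one another. The sketch first suspects index 0 together with the cluster, then returns index 0 to the estimation set, leaving indices 12 to 16 flagged.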

Suggested Citation

  • M. Habshah & M. R. Norazan & A.H.M. Rahmatullah Imon, 2009. "The performance of diagnostic-robust generalized potentials for the identification of multiple high leverage points in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 36(5), pages 507-520.
  • Handle: RePEc:taf:japsta:v:36:y:2009:i:5:p:507-520
    DOI: 10.1080/02664760802553463

    Download full text from publisher

    File URL: http://www.tandfonline.com/doi/abs/10.1080/02664760802553463
    Download Restriction: Access to full text is restricted to subscribers.

    References listed on IDEAS

    1. Sung-Soo Kim & Sung Park & W. J. Krzanowski, 2008. "Simultaneous variable selection and outlier identification in linear regression using the mean-shift outlier model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 35(3), pages 283-291.
    2. Sung-Soo Kim & W. Krzanowski, 2007. "Detecting multiple outliers in linear regression using a cluster method combined with graphical visualization," Computational Statistics, Springer, vol. 22(1), pages 109-119, April.
    3. Billor, Nedret & Hadi, Ali S. & Velleman, Paul F., 2000. "BACON: blocked adaptive computationally efficient outlier nominators," Computational Statistics & Data Analysis, Elsevier, vol. 34(3), pages 279-298, September.
    4. Hadi, Ali S., 1992. "A new measure of overall potential influence in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 14(1), pages 1-27, June.
    5. Sebert, David M. & Montgomery, Douglas C. & Rollier, Dwayne A., 1998. "A clustering algorithm for identifying multiple outliers in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 27(4), pages 461-484, June.
    6. A. H. M. Rahmatullah Imon, 2005. "Identifying multiple influential observations in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 32(9), pages 929-946.

    Citations

    Cited by:

    1. A.A.M. Nurunnabi & M. Nasser & A.H.M.R. Imon, 2016. "Identification and classification of multiple outliers, high leverage points and influential observations in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 43(3), pages 509-525, March.
