
Partial identification in the statistical matching problem

Author

Listed:
  • Ahfock, Daniel
  • Pyne, Saumyadipta
  • Lee, Sharon X.
  • McLachlan, Geoffrey J.

Abstract

The statistical matching problem involves the integration of multiple datasets where some variables are not observed jointly. This missing data pattern leaves most statistical models unidentifiable. Statistical inference is still possible when operating under the framework of partially identified models, where the goal is to bound the parameters rather than to estimate them precisely. In many matching problems, developing feasible bounds on the parameters is equivalent to finding the set of positive-definite completions of a partially specified covariance matrix. Existing methods for characterising the set of possible completions do not extend to high-dimensional problems. A Gibbs sampler to draw from the set of possible completions is proposed. The variation in the observed samples gives an estimate of the feasible region of the parameters. The Gibbs sampler extends easily to high-dimensional statistical matching problems.
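To make the covariance-completion idea concrete, here is a minimal sketch in Python. It is not the authors' algorithm. It assumes a correlation matrix with some unobserved entries, a uniform target over the set of positive-definite completions, and a user-supplied positive-definite starting completion. The function names (`det`, `feasible_interval`, `gibbs_completions`) and the numerical values are hypothetical, chosen only for illustration. The sketch relies on one fact: with every other entry held fixed, the determinant is a concave quadratic in a single off-diagonal entry, so the values that keep the matrix positive definite form an interval. In the trivariate case with X observed in both files, this interval is rho_XY * rho_XZ +/- sqrt((1 - rho_XY^2)(1 - rho_XZ^2)).

```python
import math
import random


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0.0:
            return 0.0
        if p != k:
            a[k], a[p] = a[p], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
    return d


def feasible_interval(R, i, j):
    """Values of R[i][j] that keep R positive definite, other entries fixed.

    Assumes R is currently positive definite. The determinant is then a
    concave quadratic a*t^2 + b*t + c in the entry t; the coefficients are
    recovered from three evaluations and the roots bound the interval.
    """
    M = [row[:] for row in R]
    vals = {}
    for t in (-1.0, 0.0, 1.0):
        M[i][j] = M[j][i] = t
        vals[t] = det(M)
    a = 0.5 * (vals[1.0] + vals[-1.0]) - vals[0.0]  # negative by assumption
    b = 0.5 * (vals[1.0] - vals[-1.0])
    c = vals[0.0]
    s = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    return (-b + s) / (2.0 * a), (-b - s) / (2.0 * a)


def gibbs_completions(R0, unknown, n_sweeps, seed=0):
    """Gibbs-style sampler: redraw each unknown entry uniformly from its
    conditional feasible interval (uniform target is an assumption here)."""
    rng = random.Random(seed)
    R = [row[:] for row in R0]
    draws = []
    for _ in range(n_sweeps):
        for (i, j) in unknown:
            lo, hi = feasible_interval(R, i, j)
            R[i][j] = R[j][i] = rng.uniform(lo, hi)
        draws.append([R[i][j] for (i, j) in unknown])
    return draws


# Trivariate case: X in both files, Y only in file A, Z only in file B.
# The starting value for the (Y, Z) entry is the conditional-independence one.
R3 = [[1.0, 0.6, 0.8],
      [0.6, 1.0, 0.48],
      [0.8, 0.48, 1.0]]
lo3, hi3 = feasible_interval(R3, 1, 2)  # analytic bounds are 0 and 0.96

# Four variables (X, Y, Z1, Z2) with two unobserved correlations.
R4 = [[1.0, 0.6, 0.8, 0.5],
      [0.6, 1.0, 0.48, 0.30],
      [0.8, 0.48, 1.0, 0.4],
      [0.5, 0.30, 0.4, 1.0]]
unknown = [(1, 2), (1, 3)]
draws = gibbs_completions(R4, unknown, n_sweeps=2000, seed=1)
for k, (i, j) in enumerate(unknown):
    col = [d[k] for d in draws]
    print(f"entry ({i},{j}): sampled range [{min(col):.3f}, {max(col):.3f}]")
```

The minimum and maximum of each sampled entry give a rough inner estimate of its feasible range. The choice of sampling distribution and any convergence diagnostics are left open in this sketch.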

Suggested Citation

  • Ahfock, Daniel & Pyne, Saumyadipta & Lee, Sharon X. & McLachlan, Geoffrey J., 2016. "Partial identification in the statistical matching problem," Computational Statistics & Data Analysis, Elsevier, vol. 104(C), pages 79-90.
  • Handle: RePEc:eee:csdana:v:104:y:2016:i:c:p:79-90
    DOI: 10.1016/j.csda.2016.06.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947316301426
    Download Restriction: Full text for ScienceDirect subscribers only.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ahfock, Daniel & Pyne, Saumyadipta & McLachlan, Geoffrey J., 2022. "Statistical file-matching of non-Gaussian data: A game theoretic approach," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    2. Azzalini, Adelchi, 2022. "An overview on the progeny of the skew-normal family— A personal perspective," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    3. Michael S. Rendall & Bonnie Ghosh-Dastidar & Margaret M. Weden & Zafar Nazarov, 2011. "Multiple Imputation for Combined-Survey Estimation With Incomplete Regressors In One But Not Both Surveys," Working Papers WR-887-1, RAND Corporation.
    4. Lee, Sharon X. & McLachlan, Geoffrey J., 2022. "An overview of skew distributions in model-based clustering," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    5. Yuan Liao & Anna Simoni, 2012. "Semi-parametric Bayesian Partially Identified Models based on Support Function," Papers 1212.3267, arXiv.org, revised Nov 2013.
    6. Sasaki, Yuya & Takahashi, Yuya & Xin, Yi & Hu, Yingyao, 2023. "Dynamic discrete choice models with incomplete data: Sharp identification," Journal of Econometrics, Elsevier, vol. 236(1).
    7. Yin, Chuancun & Balakrishnan, Narayanaswamy, 2024. "Stochastic representations and probabilistic characteristics of multivariate skew-elliptical distributions," Journal of Multivariate Analysis, Elsevier, vol. 199(C).
    8. Cabral, Celso Rômulo Barbosa & Lachos, Víctor Hugo & Zeller, Camila Borelli, 2014. "Multivariate measurement error models using finite mixtures of skew-Student t distributions," Journal of Multivariate Analysis, Elsevier, vol. 124(C), pages 179-198.
    9. Kim, Hyoung-Moon & Genton, Marc G., 2011. "Characteristic functions of scale mixtures of multivariate skew-normal distributions," Journal of Multivariate Analysis, Elsevier, vol. 102(7), pages 1105-1117, August.
    10. Brendan Kline & Elie Tamer, 2016. "Bayesian inference in a class of partially identified models," Quantitative Economics, Econometric Society, vol. 7(2), pages 329-366, July.
    11. McLachlan, Geoffrey J. & Lee, Sharon X., 2016. "Comment on “On nomenclature, and the relative merits of two formulations of skew distributions” by A. Azzalini, R. Browne, M. Genton, and P. McNicholas," Statistics & Probability Letters, Elsevier, vol. 116(C), pages 1-5.
    12. C. Adcock, 2010. "Asset pricing and portfolio selection based on the multivariate extended skew-Student-t distribution," Annals of Operations Research, Springer, vol. 176(1), pages 221-234, April.
    13. Arthur Lewbel, 2019. "The Identification Zoo: Meanings of Identification in Econometrics," Journal of Economic Literature, American Economic Association, vol. 57(4), pages 835-903, December.
    14. Stéphane Bonhomme & Martin Weidner, 2019. "Posterior average effects," CeMMAP working papers CWP43/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    15. Kahrari, F. & Rezaei, M. & Yousefzadeh, F. & Arellano-Valle, R.B., 2016. "On the multivariate skew-normal-Cauchy distribution," Statistics & Probability Letters, Elsevier, vol. 117(C), pages 80-88.
    16. Epstein, Larry G. & Seo, Kyoungwon, 2014. "De Finetti meets Ellsberg," Research in Economics, Elsevier, vol. 68(1), pages 11-26.
    17. Arellano-Valle, Reinaldo B. & Ferreira, Clécio S. & Genton, Marc G., 2018. "Scale and shape mixtures of multivariate skew-normal distributions," Journal of Multivariate Analysis, Elsevier, vol. 166(C), pages 98-110.
    18. Bhat, Chandra R. & Astroza, Sebastian & Hamdi, Amin S., 2017. "A spatial generalized ordered-response model with skew normal kernel error terms with an application to bicycling frequency," Transportation Research Part B: Methodological, Elsevier, vol. 95(C), pages 126-148.
    19. Brendan Kline & Elie Tamer, 2024. "Counterfactual Analysis in Empirical Games," Papers 2410.12731, arXiv.org.
    20. Kiesl, Hans & Rässler, Susanne, 2006. "How valid can data fusion be?," IAB-Discussion Paper 200615, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
