Printed from https://ideas.repec.org/a/eee/jmvana/v191y2022ics0047259x22000586.html

Density ratio model with data-adaptive basis function

Author

Listed:
  • Zhang, Archer Gong
  • Chen, Jiahua

Abstract

In many applications, we collect samples from multiple interconnected populations. These population distributions share some latent structure, so it is advantageous to analyze the samples jointly to make efficient inferences on the multiple distributions and their functionals. One effective way to connect the distributions is the density ratio model (DRM). A key ingredient of the DRM is that the log density ratios are linear combinations of prespecified functions; the vector formed by these functions is called the basis function. The benefit of the DRM depends to a large degree on correctly specifying the basis function. In applications, the user may not have enough knowledge to choose a suitable basis function, and many discussions have been devoted to this topic. In this article, we consider the still-open problem of a data-adaptive choice of the basis function that can alleviate the risk of severe model misspecification. We propose a data-adaptive approach to the choice of basis function based on functional principal component analysis. Under some conditions, we show that this approach leads to consistent basis function estimation. Our simulation results show that the proposed adaptive choice can achieve an efficiency gain. We use a real-data example from economics to demonstrate the efficiency gain and the ease of use of our approach.
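
For readers unfamiliar with the DRM, the model described in the abstract can be written out as follows. This is a sketch in commonly used notation, not notation taken from the article: the symbols g_k, alpha_k, beta_k, q and the indexing of populations from 0 to m are assumptions made here for illustration.

```latex
% Density ratio model (DRM) for m + 1 populations with densities g_0, ..., g_m.
% g_0 is treated as the baseline; q(x) is the prespecified basis function.
% All symbols are illustrative assumptions, not the article's own notation.
\log \frac{g_k(x)}{g_0(x)} = \alpha_k + \boldsymbol{\beta}_k^{\top} \mathbf{q}(x),
\qquad k = 1, \dots, m .

% Example: if g_k is the N(\mu_k, \sigma^2) density with a common variance, then
\log \frac{g_k(x)}{g_0(x)}
  = -\frac{\mu_k^2 - \mu_0^2}{2\sigma^2} + \frac{\mu_k - \mu_0}{\sigma^2}\, x ,
% so the DRM holds with the one-dimensional basis function q(x) = x.
```

If the variances also differ across populations, the same calculation gives a log density ratio that is quadratic in x, so q(x) = (x, x^2) would be needed. This is the sense in which a poorly chosen basis function misspecifies the model.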

Suggested Citation

  • Zhang, Archer Gong & Chen, Jiahua, 2022. "Density ratio model with data-adaptive basis function," Journal of Multivariate Analysis, Elsevier, vol. 191(C).
  • Handle: RePEc:eee:jmvana:v:191:y:2022:i:c:s0047259x22000586
    DOI: 10.1016/j.jmva.2022.105043

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X22000586
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Zhang, Biao, 2006. "Prospective and retrospective analyses under logistic regression models," Journal of Multivariate Analysis, Elsevier, vol. 97(1), pages 211-230, January.
    2. Jiahua Chen & Pengfei Li & Yukun Liu & James V. Zidek, 2021. "Composite empirical likelihood for multisample clustered data," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 33(1), pages 60-81, January.
    3. Konstantinos Fokianos & Irene Kaimi, 2006. "On the Effect of Misspecifying the Density Ratio Model," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 58(3), pages 475-497, September.
    4. Li, Gang & Qin, Jing, 2006. "Analysis of two-sample truncated data using generalized logistic models," Journal of Multivariate Analysis, Elsevier, vol. 97(3), pages 675-697, March.
    5. Dominik Wied & Rafael Weißbach, 2012. "Consistency of the kernel density estimator: a survey," Statistical Papers, Springer, vol. 53(1), pages 1-21, February.
    6. Xuze Zhang & Saumyadipta Pyne & Benjamin Kedem, 2020. "Estimation of residential radon concentration in Pennsylvania counties by data fusion," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 36(6), pages 1094-1110, November.
    7. Wang, Chunlin & Marriott, Paul & Li, Pengfei, 2018. "Semiparametric inference on the means of multiple nonnegative distributions with excess zero observations," Journal of Multivariate Analysis, Elsevier, vol. 166(C), pages 182-197.
    8. Wu, Jingjing & Karunamuni, Rohana & Zhang, Biao, 2010. "Minimum Hellinger distance estimation in a two-sample semiparametric model," Journal of Multivariate Analysis, Elsevier, vol. 101(5), pages 1102-1122, May.
    9. Marchese, Scott & Diao, Guoqing, 2017. "Density ratio model for multivariate outcomes," Journal of Multivariate Analysis, Elsevier, vol. 154(C), pages 249-261.
    10. Zhang, Biao, 2002. "Assessing Goodness-of-Fit of Generalized Logit Models Based on Case-Control Data," Journal of Multivariate Analysis, Elsevier, vol. 82(1), pages 17-38, July.
    11. Konstantinos Fokianos, 2004. "Merging information for semiparametric density estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(4), pages 941-958, November.
    12. Miguel de Carvalho & Anthony C. Davison, 2014. "Spectral Density Ratio Models for Multivariate Extremes," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 764-776, June.

    Citations



    Cited by:

    1. Archer Gong Zhang & Jiahua Chen, 2023. "Optimal Estimation under a Semiparametric Density Ratio Model," Papers 2309.09103, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ori Davidov & Konstantinos Fokianos & George Iliopoulos, 2014. "Semiparametric Inference for the Two-way Layout Under Order Restrictions," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 41(3), pages 622-638, September.
    2. Wang, Chunlin & Marriott, Paul & Li, Pengfei, 2017. "Testing homogeneity for multiple nonnegative distributions with excess zero observations," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 146-157.
    3. Jiang, Shan & Tu, Dongsheng, 2012. "Inference on the probability P(T1," Computational Statistics & Data Analysis, Elsevier, vol. 56(5), pages 1069-1078.
    4. Ori Davidov & Konstantinos Fokianos & George Iliopoulos, 2010. "Order-Restricted Semiparametric Inference for the Power Bias Model," Biometrics, The International Biometric Society, vol. 66(2), pages 549-557, June.
    5. Chuan Hong & Yang Ning & Peng Wei & Ying Cao & Yong Chen, 2017. "A semiparametric model for vQTL mapping," Biometrics, The International Biometric Society, vol. 73(2), pages 571-581, June.
    6. Giovanni Paolo Crespi & Elisa Mastrogiacomo, 2020. "Qualitative robustness of set-valued value-at-risk," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 91(1), pages 25-54, February.
    7. R. Zamini & V. Fakoor & M. Sarmad, 2015. "On estimation of a density function in multiplicative censoring," Statistical Papers, Springer, vol. 56(3), pages 661-676, August.
    8. Daniela Castro Camilo & Miguel de Carvalho & Jennifer Wadsworth, 2017. "Time-Varying Extreme Value Dependence with Application to Leading European Stock Markets," Papers 1709.01198, arXiv.org.
    9. Ouafae Benrabah & Elias Ould Saïd & Abdelkader Tatachak, 2015. "A kernel mode estimate under random left truncation and time series model: asymptotic normality," Statistical Papers, Springer, vol. 56(3), pages 887-910, August.
    10. Meng Yuan & Chunlin Wang & Boxi Lin & Pengfei Li, 2022. "Semiparametric inference on general functionals of two semicontinuous populations," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(3), pages 451-472, June.
    11. David Atienza & Pedro Larrañaga & Concha Bielza, 2022. "Rejoinder on: Hybrid semiparametric Bayesian networks," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(2), pages 344-347, June.
    12. Jingjing Wu & Rohana J. Karunamuni, 2018. "Efficient and robust tests for semiparametric models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(4), pages 761-788, August.
    13. Mhalla, Linda & Chavez-Demoulin, Valérie & Naveau, Philippe, 2017. "Non-linear models for extremal dependence," Journal of Multivariate Analysis, Elsevier, vol. 159(C), pages 49-66.
    14. Rafael Weißbach & Wladislaw Poniatowski & Walter Krämer, 2013. "Nearest neighbor hazard estimation with left-truncated duration data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 97(1), pages 33-47, January.
    15. Marchese, Scott & Diao, Guoqing, 2018. "Joint regression analysis of mixed-type outcome data via efficient scores," Computational Statistics & Data Analysis, Elsevier, vol. 125(C), pages 156-170.
    16. Qingguo Tang & R. J. Karunamuni, 2018. "Robust variable selection for finite mixture regression models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(3), pages 489-521, June.
    17. Mukherjee, Bhramar & Liu, Ivy, 2009. "A note on bias due to fitting prospective multivariate generalized linear models to categorical outcomes ignoring retrospective sampling schemes," Journal of Multivariate Analysis, Elsevier, vol. 100(3), pages 459-472, March.
    18. Yufan Wang & Xingzhong Xu, 2023. "Homogeneity Test for Multiple Semicontinuous Data with the Density Ratio Model," Mathematics, MDPI, vol. 11(17), pages 1-28, September.
    19. Tang, Qingguo & Karunamuni, Rohana J., 2013. "Minimum distance estimation in a finite mixture regression model," Journal of Multivariate Analysis, Elsevier, vol. 120(C), pages 185-204.
    20. Nassira Menni & Abdelkader Tatachak, 2018. "A note on estimating the conditional expectation under censoring and association: strong uniform consistency," Statistical Papers, Springer, vol. 59(3), pages 1009-1030, September.
