Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i23p4710-d1284382.html

Statistical Study Design for Analyzing Multiple Gene Loci Correlation in DNA Sequences

Author

Listed:
  • Pianpool Kamoljitprapa

    (Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand)

  • Fazil M. Baksh

    (Department of Mathematics and Statistics, University of Reading, Reading RG6 6AH, UK)

  • Andrea De Gaetano

    (Consiglio Nazionale delle Ricerche, CNR-IASI Rome and CNR-IRIB Palermo, 90146 Palermo, Italy;
    Distinguished Professor Excellence Program, Department of Biomatics, Óbuda University, 1034 Budapest, Hungary)

  • Orathai Polsen

    (Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand)

  • Piyachat Leelasilapasart

    (Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand)

Abstract

This study presents a novel statistical and computational approach based on nonparametric regression, which capitalizes on correlation structure to deal with the high-dimensional data often found in pharmacogenomics, for instance, in Crohn’s inflammatory bowel disease. The empirical correlation between the test statistics, investigated via simulation, can be used as an estimate of noise. The theoretical distribution of −log10(p-value) is used to support the estimation of the optimal bandwidth for the model, which adequately controls type I error rates while maintaining reasonable power. Two proposed approaches, involving normal and Laplace-LD kernels, were evaluated in a case-control study using real data from a genome-wide association study on Crohn’s disease. The study successfully identified single nucleotide polymorphisms on the NOD2 gene associated with the disease. The proposed method reduces the computational burden by approximately 33% while retaining reasonable power, allowing for a more efficient and accurate analysis of genetic variants influencing drug responses. The study contributes to the advancement of statistical methodology for analyzing complex genetic data and is of practical advantage for the development of personalized medicine.
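As a rough illustration of the ideas named in the abstract, the sketch below smooths −log10(p-value) scores along a sequence of loci with a generic kernel regression (Nadaraya-Watson) estimator, once with a normal kernel and once with a plain Laplace kernel. It is not the authors' implementation: the paper's Laplace-LD kernel and its bandwidth selection rule are not described on this page, so the kernels, the fixed bandwidth, the evenly spaced positions, and all function names here are assumptions made for illustration only. The one fact relied on is standard: under the null hypothesis a p-value is uniform on (0, 1), so −log10(p-value) follows an exponential distribution with rate ln(10) and mean 1/ln(10), which is about 0.434.

```python
import math
import random


def normal_kernel(u):
    """Standard normal density, used as a smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def laplace_kernel(u):
    """Plain Laplace (double exponential) density; NOT the paper's Laplace-LD kernel."""
    return 0.5 * math.exp(-abs(u))


def kernel_smooth(positions, scores, bandwidth, kernel):
    """Nadaraya-Watson estimate of the score curve at every position."""
    smoothed = []
    for x0 in positions:
        weights = [kernel((x - x0) / bandwidth) for x in positions]
        total = sum(weights)  # always > 0: the point x0 itself has weight kernel(0)
        smoothed.append(sum(w * s for w, s in zip(weights, scores)) / total)
    return smoothed


def simulate_null_scores(n, seed=0):
    """Draw -log10(p) for n independent null tests (p uniform on (0, 1])."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which avoids taking log10 of zero.
    return [-math.log10(1.0 - rng.random()) for _ in range(n)]


# Theoretical mean of -log10(p) under the null: exponential with rate ln(10).
NULL_MEAN = 1.0 / math.log(10.0)

if __name__ == "__main__":
    # Hypothetical data: 200 evenly spaced loci with purely null scores.
    positions = [float(i) for i in range(200)]
    scores = simulate_null_scores(len(positions), seed=1)
    for name, kernel in (("normal", normal_kernel), ("laplace", laplace_kernel)):
        fit = kernel_smooth(positions, scores, bandwidth=5.0, kernel=kernel)
        print(name, "largest smoothed score:", round(max(fit), 3))
    print("theoretical null mean:", round(NULL_MEAN, 4))
```

In this sketch the bandwidth is fixed by hand. A larger bandwidth averages over more neighbouring loci and pulls the smoothed curve toward the null mean, which is the trade-off between type I error control and power that the abstract attributes to bandwidth choice.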

Suggested Citation

  • Pianpool Kamoljitprapa & Fazil M. Baksh & Andrea De Gaetano & Orathai Polsen & Piyachat Leelasilapasart, 2023. "Statistical Study Design for Analyzing Multiple Gene Loci Correlation in DNA Sequences," Mathematics, MDPI, vol. 11(23), pages 1-14, November.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:23:p:4710-:d:1284382

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/23/4710/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/23/4710/
    Download Restriction: no

