Novel Methods for Multivariate Ordinal Data applied to Genetic Diplotypes, Genomic Pathways, Risk Profiles, and Pattern Similarity

Author

Listed:

Wittkowski, Knut M.

Registered:

Knut M. Wittkowski

Abstract

Introduction: Conventional statistical methods for multivariate data (e.g., discriminant analysis or regression) are based on the (generalized) linear model, i.e., the data are interpreted as points in a Euclidean space of independent dimensions. The dimensionality of the data is then reduced by assuming the components to be related by a specific function of known type (linear, exponential, etc.), which allows the distance of each point from a hyperspace to be determined. While mathematically elegant, these approaches may have shortcomings in real-world applications, where the relative importance of the variables, their functional relationship, and the correlation among them tend to be unknown. Still, in many applications, each variable can be assumed to have at least an “orientation”, i.e., it can reasonably be assumed that, if all other conditions are held constant, an increase in this variable is either “good” or “bad”. The direction of this orientation can be known or unknown. In genetics, for instance, having more “abnormal” alleles may increase the risk (or magnitude) of a disease phenotype. In genomics, the expression of several related genes may indicate disease activity. When screening for security risks, more indicators of atypical behavior may raise more concern; in face or voice recognition, more similar indicators may increase the likelihood of a person being identified.

Methods: In 1998, we developed a nonparametric method for analyzing multivariate ordinal data to assess the overall risk of HIV infection based on different types of behavior or the overall protective effect of barrier methods against HIV infection. By using u-statistics, rather than the marginal likelihood, we were able to increase the computational efficiency of this approach by several orders of magnitude.

Results: We applied this approach to assessing the immunogenicity of a vaccination strategy in cancer patients. While discussing the pitfalls of the conventional methods for linking quantitative traits to haplotypes, we realized that this approach could easily be modified into a statistically valid alternative to previously proposed approaches. We have now begun to use the same methodology to correlate the activity of anti-inflammatory drugs along genomic pathways with disease severity of psoriasis based on several clinical and histological characteristics.

Conclusion: Multivariate ordinal data are frequently observed when assessing semiquantitative characteristics, such as risk profiles (genetic, genomic, or security) or similarity of patterns (faces, voices, behaviors). The conventional methods require empirical validation, because the functions and weights chosen cannot be justified on theoretical grounds. The proposed statistical method for analyzing profiles of ordinal variables is intrinsically valid. Since no additional assumptions need to be made, the often time-consuming empirical validation can be skipped.
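The abstract does not spell out how scores based on u-statistics are computed. As a rough illustration only, the Python sketch below assumes a component-wise partial ordering: one profile counts as "higher" than another if it is at least as high on every variable and strictly higher on at least one, and each subject's score is the number of profiles it dominates minus the number that dominate it. The function names `dominates` and `u_scores`, the assumption that all variables are oriented so that larger means "worse", and the sample profiles are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if profile a is >= b on every variable and > b on at least one.

    Assumption: every variable is oriented the same way (larger = higher risk).
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def u_scores(profiles: List[Sequence[float]]) -> List[int]:
    """Score each profile as (# profiles it dominates) - (# profiles dominating it).

    Pairs that cannot be ordered (higher on one variable, lower on another)
    contribute nothing, so no weights or functional form have to be chosen.
    """
    scores = []
    for p in profiles:
        below = sum(dominates(p, q) for q in profiles)  # profiles ranked lower than p
        above = sum(dominates(q, p) for q in profiles)  # profiles ranked higher than p
        scores.append(below - above)
    return scores


# Hypothetical two-variable ordinal profiles (e.g., counts of two risk indicators).
profiles = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
print(u_scores(profiles))  # [-4, -1, -1, 2, 4]
```

In this toy example the profiles (1, 0) and (0, 1) cannot be ordered against each other, so they receive the same score. The scores always sum to zero, because every ordered pair adds one to one profile and subtracts one from the other.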

Suggested Citation

Wittkowski, Knut M., 2003. "Novel Methods for Multivariate Ordinal Data applied to Genetic Diplotypes, Genomic Pathways, Risk Profiles, and Pattern Similarity," MPRA Paper 4570, University Library of Munich, Germany.

Handle: RePEc:pra:mprapa:4570

References listed on IDEAS

Quinn McNemar, 1947. "Note on the sampling error of the difference between correlated proportions or percentages," Psychometrika, Springer;The Psychometric Society, vol. 12(2), pages 153-157, June.
Wittkowski, K.M. & Susser, E. & Dietz, K., 1998. "Erratum: The protective effect of condoms and nonoxynol-9 against HIV infection (American Journal of Public Health (1998) 88 (590-596))," American Journal of Public Health, American Public Health Association, vol. 88(6), pages 972-972.
Wittkowski, K.M. & Susser, E. & Dietz, K., 1998. "The protective effect of condoms and nonoxynol-9 against HIV infection," American Journal of Public Health, American Public Health Association, vol. 88(4), pages 590-596.
Li K-C. & Aragon Y. & Shedden K. & Thomas Agnan C., 2003. "Dimension Reduction for Multivariate Response Data," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 99-109, January.
Susser, E. & Desvarieux, M. & Wittkowski, K.M., 1998. "Reporting sexual risk behavior for HIV: A practical risk index and a method for improving risk indices," American Journal of Public Health, American Public Health Association, vol. 88(4), pages 671-674.
Dianne M. Finkelstein & William B. Goggins & David A. Schoenfeld, 2002. "Analysis of Failure Time Data with Dependent Interval Censoring," Biometrics, The International Biometric Society, vol. 58(2), pages 298-304, June.

Citations

Cited by:

Jan van Ours & Frederic Vermeulen, 2007. "Ranking Dutch Economists," De Economist, Springer, vol. 155(4), pages 469-487, December.
- van Ours, J.C. & Vermeulen, F.M.P., 2007. "Ranking Dutch Economists," Other publications TiSEM 9866ce91-c4e0-44e2-918b-3, Tilburg University, School of Economics and Management.
- van Ours, J.C. & Vermeulen, F.M.P., 2007. "Ranking Dutch Economists," Discussion Paper 2007-72, Tilburg University, Center for Economic Research.
- van Ours, J.C. & Vermeulen, F.M.P., 2007. "Ranking Dutch economists," Other publications TiSEM 22ef61f4-2610-4223-a75b-7, Tilburg University, School of Economics and Management.
Wittkowski Knut M. & Song Tingting & Anderson Kent & Daniels John E., 2008. "U-Scores for Multivariate Data in Sports," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 4(3), pages 1-24, July.
Wittkowski, Knut M., 2005. "Towards Novel Nonparametric Statistical Methods and Bioinformatics Tools for Clinical and Translational Sciences," MPRA Paper 5902, University Library of Munich, Germany.
Scott Beaulier & Robert Elder, 2011. "Using “Dominetrics” to Impose Greater Discipline on Performance Rankings," Journal of Sports Economics, vol. 12(1), pages 55-80, February.
Morales José F. & Song Tingting & Auerbach Arleen D. & Wittkowski Knut M., 2008. "Phenotyping Genetic Diseases Using an Extension of µ-Scores for Multivariate Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-20, June.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Uttam Bandyopadhyay & Atanu Biswas & Shirsendu Mukherjee, 2009. "Adaptive two-treatment two-period crossover design for binary treatment responses incorporating carry-over effects," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 18(1), pages 13-33, March.
Preety Srivastava & Xueyan Zhao, 2010. "What Do the Bingers Drink? Micro‐Unit Evidence on Negative Externalities and Drinker Characteristics of Alcohol Consumption by Beverage Types," Economic Papers, The Economic Society of Australia, vol. 29(2), pages 229-250, June.
- Preety Srivastava, 2010. "What Do the Bingers Drink? Micro-unit Evidence on Negative Externalities and Drinker Characteristics of Alcohol Consumption by Beverage Types," Wine Economics Research Centre Working Papers 2010-07, University of Adelaide, Wine Economics Research Centre.
Noorbaloochi, Siamak & Nelson, David, 2008. "Conditionally specified models and dimension reduction in the exponential families," Journal of Multivariate Analysis, Elsevier, vol. 99(8), pages 1574-1589, September.
Holger Schwender & Margaret A. Taub & Terri H. Beaty & Mary L. Marazita & Ingo Ruczinski, 2012. "Rapid Testing of SNPs and Gene–Environment Interactions in Case–Parent Trio Data Based on Exact Analytic Parameter Estimation," Biometrics, The International Biometric Society, vol. 68(3), pages 766-773, September.
Pantazis, Nikos & Kenward, Michael G. & Touloumi, Giota, 2013. "Performance of parametric survival models under non-random interval censoring: A simulation study," Computational Statistics & Data Analysis, Elsevier, vol. 63(C), pages 16-30.
Matysková, Ludmila & Rogers, Brian & Steiner, Jakub & Sun, Keh-Kuan, 2020. "Habits as adaptations: An experimental study," Games and Economic Behavior, Elsevier, vol. 122(C), pages 391-406.
- Steiner, Jakub & Matyskova, Ludmila & Rogers, Brian & Sun, Keh-Kuan, 2018. "Habits as Adaptations: An Experimental Study," CEPR Discussion Papers 13300, C.E.P.R. Discussion Papers.
- Ludmila Matyskova & Brian Rogers & Jakub Steiner & Keh-Kuan Sun, 2019. "Habits as Adaptations: An Experimental Study," CRC TR 224 Discussion Paper Series crctr224_2019_113, University of Bonn and University of Mannheim, Germany.
André, Kévin, 2013. "Applying the Capability Approach to the French Education System: An Assessment of the "Pourquoi pas moi ?"," ESSEC Working Papers WP1316, ESSEC Research Center, ESSEC Business School.
Irina Zrnić Novaković & Dean Ajduković & Helena Bakić & Camila Borges & Margarida Figueiredo-Braga & Annett Lotzin & Xenia Anastassiou-Hadjicharalambous & Chrysanthi Lioupi & Jana Darejan Javakhishvil, 2023. "Shaped by the COVID-19 pandemic: Psychological responses from a subjective perspective–A longitudinal mixed-methods study across five European countries," PLOS ONE, Public Library of Science, vol. 18(4), pages 1-32, April.
Melo, Grace & Palma, Marco & Chomali, Laura & Ribera, Luis, 2025. "Are experts overoptimistic about the success of market labeling information?," 2025 AAEA & WAEA Joint Annual Meeting, July 27-29, 2025, Denver, CO 360812, Agricultural and Applied Economics Association.
Ruiz-Frau, A. & Krause, T. & Marbà, N., 2018. "The use of sociocultural valuation in sustainable environmental management," Ecosystem Services, Elsevier, vol. 29(PA), pages 158-167.
Morales José F. & Song Tingting & Auerbach Arleen D. & Wittkowski Knut M., 2008. "Phenotyping Genetic Diseases Using an Extension of µ-Scores for Multivariate Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-20, June.
Yexin Tian & Shuo Xu & Yuchen Cao & Zhongyan Wang & Zijing Wei, 2025. "An Empirical Comparison of Machine Learning and Deep Learning Models for Automated Fake News Detection," Mathematics, MDPI, vol. 13(13), pages 1-24, June.
AlMalki, Hameeda A. & Durugbo, Christopher M., 2023. "Evaluating critical institutional factors of Industry 4.0 for education reform," Technological Forecasting and Social Change, Elsevier, vol. 188(C).
Guevara, C. Angelo & Fukushi, Mitsuyoshi, 2016. "Modeling the decoy effect with context-RUM Models: Diagrammatic analysis and empirical evidence from route choice SP and mode choice RP case studies," Transportation Research Part B: Methodological, Elsevier, vol. 93(PA), pages 318-337.
Yoo, Jae Keun, 2008. "Sufficient dimension reduction for the conditional mean with a categorical predictor in multivariate regression," Journal of Multivariate Analysis, Elsevier, vol. 99(8), pages 1825-1839, September.
Alexandra I. Khalyasmaa & Pavel V. Matrenin & Stanislav A. Eroshenko & Vadim Z. Manusov & Andrey M. Bramm & Alexey M. Romanov, 2022. "Data Mining Applied to Decision Support Systems for Power Transformers’ Health Diagnostics," Mathematics, MDPI, vol. 10(14), pages 1-25, July.
Arnaldo Rabello de Aguiar Vallim Filho & Daniel Farina Moraes & Marco Vinicius Bhering de Aguiar Vallim & Leilton Santos da Silva & Leandro Augusto da Silva, 2022. "A Machine Learning Modeling Framework for Predictive Maintenance Based on Equipment Load Cycle: An Application in a Real World Case," Energies, MDPI, vol. 15(10), pages 1-41, May.
Alireza Taheri Dehkordi & Mohammad Javad Valadan Zoej & Hani Ghasemi & Ebrahim Ghaderpour & Quazi K. Hassan, 2022. "A New Clustering Method to Generate Training Samples for Supervised Monitoring of Long-Term Water Surface Dynamics Using Landsat Data through Google Earth Engine," Sustainability, MDPI, vol. 14(13), pages 1-24, June.
Lahtinen, Tuomas J. & Hämäläinen, Raimo P., 2016. "Path dependence and biases in the even swaps decision analysis method," European Journal of Operational Research, Elsevier, vol. 249(3), pages 890-898.
Melo, Grace & Palma, Marco A. & Ribera, Luis A., 2024. "Are experts overoptimistic about the success of food market labeling information?," 2024 Annual Meeting, July 28-30, New Orleans, LA 343870, Agricultural and Applied Economics Association.

More about this item

JEL classification:

C35 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions
C44 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Operations Research; Statistical Decision Theory
C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General