Adapting a classification rule to local and global shift when only unlabelled data are available

Author

Hofer, Vera

Abstract

In evolving populations, the training data and the test data need not follow the same distribution, so the performance of a prediction model will deteriorate over time and the model must eventually be re-estimated. However, in many applications, e.g. credit scoring, new labelled data are not available for re-estimation because of verification latency, i.e. label delay. Methods that enable a prediction model to adapt to distributional changes using only unlabelled data are therefore highly desirable. A shift adaptation method for binary classification is presented here. The model is based on mixture distributions. The conditional feature distributions are determined at the time when labelled data are available, and the unconditional feature distribution is determined at the time when new unlabelled data become accessible. These mixture distributions provide information on the old and the new positions of subpopulations. A transition model then describes how the subpopulations of each class have drifted to form the new unconditional feature distribution. Assuming that the conditional distributions are reorganised using a minimum of energy, a two-step estimation procedure results. First, for a given class prior distribution, the transfer of probability mass is estimated such that the energy required to obtain the new unconditional distribution by a local transfer of the old conditional distributions is minimal. Since the optimal solution of the resulting transportation problem measures the distance between the old and the new distributions, the change in the class prior distribution is found in a second step by solving the transportation problem for varying class prior distributions and selecting the value for which the objective function is smallest. Using the solution of the transportation problem and the component parameters of the unconditional feature distribution, the new conditional feature distributions can be determined, which allows for a shift adaptation of the classification rule. The performance of the proposed model is investigated using a large real-world dataset on default rates in Danish companies. The results show that the shift adaptation improves classification results.
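
The two-step procedure described in the abstract can be illustrated with a small sketch. This is not the author's implementation. It assumes one-dimensional features, represents each mixture component only by its mean and weight, and takes squared distance between component means as the "energy"; the function names `transport_1d` and `adapt_to_shift`, the prior grid and all numbers are hypothetical. In one dimension with a convex cost, matching sorted sources to sorted sinks solves the transportation problem exactly, so no LP solver is needed here; a multivariate version would need a general transportation solver.

```python
def transport_1d(sources, sinks, eps=1e-12):
    """Step 1: minimum-energy transfer of probability mass in one dimension.

    sources: list of (position, mass, class_label) for the old components.
    sinks:   list of (position, mass) for the new unconditional components.
    Assumption: cost of moving mass is the squared distance between positions.
    Returns (total_cost, flows), where flows maps (source_idx, sink_idx) to mass.
    """
    src = sorted(range(len(sources)), key=lambda i: sources[i][0])
    snk = sorted(range(len(sinks)), key=lambda j: sinks[j][0])
    supply = [s[1] for s in sources]
    demand = [t[1] for t in sinks]
    flows, cost = {}, 0.0
    a = b = 0
    while a < len(src) and b < len(snk):
        i, j = src[a], snk[b]
        moved = min(supply[i], demand[j])
        if moved > eps:
            flows[(i, j)] = moved
            cost += moved * (sources[i][0] - sinks[j][0]) ** 2
        supply[i] -= moved
        demand[j] -= moved
        # Advance past any source or sink whose mass is used up.
        if supply[i] <= eps:
            a += 1
        if demand[j] <= eps:
            b += 1
    return cost, flows


def adapt_to_shift(old_class0, old_class1, new_components, prior_grid):
    """Step 2: pick the class prior whose transportation cost is smallest.

    old_class0, old_class1: lists of (mean, weight), weights summing to 1 per class.
    new_components: list of (mean, weight) for the new unconditional mixture.
    prior_grid: candidate priors for class 1, strictly between 0 and 1.
    Returns (prior of class 1, cost, new conditional component weights per class).
    """
    best = None
    for prior1 in prior_grid:
        sources = ([(m, (1.0 - prior1) * w, 0) for m, w in old_class0]
                   + [(m, prior1 * w, 1) for m, w in old_class1])
        cost, flows = transport_1d(sources, new_components)
        if best is None or cost < best[1]:
            best = (prior1, cost, flows, sources)
    prior1, cost, flows, sources = best

    # Mass that each class sends to each new component.
    class_mass = {0: [0.0] * len(new_components), 1: [0.0] * len(new_components)}
    for (i, j), moved in flows.items():
        class_mass[sources[i][2]][j] += moved

    # Normalise by the class prior to get the new conditional mixture weights.
    class_prior = {0: 1.0 - prior1, 1: prior1}
    new_weights = {c: [m / class_prior[c] for m in class_mass[c]] for c in (0, 1)}
    return prior1, cost, new_weights


# Hypothetical toy data: one old component per class, two new components.
old_class0 = [(0.0, 1.0)]
old_class1 = [(4.0, 1.0)]
new_components = [(1.0, 0.7), (5.0, 0.3)]
prior_grid = [k / 20 for k in range(1, 20)]

prior1, cost, new_weights = adapt_to_shift(
    old_class0, old_class1, new_components, prior_grid)
print(prior1, cost, new_weights)
```

For these toy numbers the grid search selects a class-1 prior of 0.3: both old components then shift by one unit, and any other prior on the grid forces some mass to travel further. A finer grid or a one-dimensional minimiser could replace the grid search.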

Suggested Citation

Hofer, Vera, 2015. "Adapting a classification rule to local and global shift when only unlabelled data are available," European Journal of Operational Research, Elsevier, vol. 243(1), pages 177-189.

Handle: RePEc:eee:ejores:v:243:y:2015:i:1:p:177-189
DOI: 10.1016/j.ejor.2014.11.022

Citations

Cited by:

Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
Dias, Sónia & Brito, Paula, 2017. "Off the beaten track: A new linear model for interval data," European Journal of Operational Research, Elsevier, vol. 258(3), pages 1118-1130.

Keywords

Dataset shift; Concept drift; Local drift; Global drift; Verification latency
