Authors listed:
- Yuanzi He
(College of Computer Science, Guangdong University of Science and Technology, Dongguan 523083, China)
- Jiali He
(Key Laboratory of Complex System Optimization and Big Data Processing, Department of Guangxi Education, Yulin Normal University, Yulin 537000, China)
- Haotian Liu
(Center for Applied Mathematics of Guangxi, Yulin Normal University, Yulin 537000, China)
- Zhaowen Li
(College of Computer Science, Guangdong University of Science and Technology, Dongguan 523083, China)
Abstract
In machine learning, semi-supervised learning algorithms are used when only a portion of the data is labeled. A dataset with missing attribute values or labels is referred to as an incomplete information system. Handling incomplete information within a system poses a significant challenge, which can be tackled effectively through rough set theory (R-theory). However, R-theory has its limits: it fails to consider the frequency of an attribute value and therefore cannot reflect the distribution of attribute values appropriately. If we consider partially labeled data and replace each missing attribute value with the multiset of all possible attribute values under the same attribute, the result is partially labeled multiset-valued data. In a semi-supervised learning algorithm, a large number of redundant features need to be deleted in order to save time and costs. This study proposes semi-supervised attribute selection algorithms for partially labeled multiset-valued data. First, a partially labeled multiset-valued decision information system (p-MSVDIS) is partitioned into two distinct systems: a labeled multiset-valued decision information system (l-MSVDIS) and an unlabeled multiset-valued decision information system (u-MSVDIS). Then, using the indistinguishable relation, the distinguishable relation, and the dependence function, two types of attribute subset importance in a p-MSVDIS are defined as weighted sums over the l-MSVDIS and the u-MSVDIS, with weights determined by the missing rate of labels; these can be considered uncertainty measurements (UMs) of a p-MSVDIS. Next, two adaptive semi-supervised attribute selection algorithms for a p-MSVDIS are introduced. They leverage these degrees of importance and thus adapt automatically to different missing rates. Finally, experiments and statistical analyses are conducted on 11 datasets. The results indicate that the proposed algorithms have advantages over certain existing algorithms.
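The data preparation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration only: missing entries are marked with None, a known value is wrapped as a single-element multiset, and the weighted sum places the weight (1 - rate) on the labeled part and rate on the unlabeled part. All function names are hypothetical, and the two importance values passed to weighted_importance are placeholders, since the abstract does not give the formulas for the relations or the dependence function.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value or label


def to_multiset_valued(rows):
    """Replace each missing attribute value with the multiset of all
    observed values in the same column (assumption: known values are
    wrapped as single-element multisets)."""
    n_attrs = len(rows[0])
    observed = [
        Counter(r[j] for r in rows if r[j] is not MISSING)
        for j in range(n_attrs)
    ]
    return [
        [Counter(observed[j]) if r[j] is MISSING else Counter([r[j]])
         for j in range(n_attrs)]
        for r in rows
    ]


def split_by_label(rows, labels):
    """Partition the objects into a labeled part and an unlabeled part."""
    labeled = [(r, y) for r, y in zip(rows, labels) if y is not MISSING]
    unlabeled = [r for r, y in zip(rows, labels) if y is MISSING]
    return labeled, unlabeled


def label_missing_rate(labels):
    """Fraction of objects whose label is missing."""
    return sum(y is MISSING for y in labels) / len(labels)


def weighted_importance(imp_labeled, imp_unlabeled, rate):
    """Weighted sum of the two importance values (assumed weighting:
    the unlabeled part counts more as the label missing rate grows)."""
    return (1 - rate) * imp_labeled + rate * imp_unlabeled


# Toy data: two attributes, four objects, two labels missing.
rows = [["a", "x"], [None, "y"], ["a", None], ["b", "x"]]
labels = [1, None, 0, None]

msv = to_multiset_valued(rows)
labeled, unlabeled = split_by_label(msv, labels)
rate = label_missing_rate(labels)
```

In this toy table, the missing value in the first column becomes the multiset with "a" twice and "b" once, so the frequency of each attribute value is retained rather than only the set of possible values.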
Suggested Citation
Yuanzi He & Jiali He & Haotian Liu & Zhaowen Li, 2025.
"Semi-Supervised Attribute Selection Algorithms for Partially Labeled Multiset-Valued Data,"
Mathematics, MDPI, vol. 13(8), pages 1-33, April.
Handle:
RePEc:gam:jmathe:v:13:y:2025:i:8:p:1318-:d:1636992