Authors
- Mahdi Molaei
- Mohammad-Reza Feizi-Derakhshi
- Mohammad-Ali Balafar
- Jafar Tanha
Abstract
In this paper, we present DSN-STC (Deep Siamese Network for Short Text Clustering), a novel deep Siamese network with a multi-scale hybrid feature extraction architecture that significantly improves the clustering of short texts. A key innovation of our approach is a specialized transformation mechanism that maps pre-trained word embeddings into cluster-aware text representations. In this new latent space, the proposed model minimizes the overlap between clusters while improving the cohesion within each cluster, which yields considerable improvements in clustering performance. Because short texts contain both sequential context and localized patterns within a limited span, we adopt a hybrid approach that combines recurrent layers with multi-scale convolutional neural networks to maximize the features that can be extracted from that limited context. The recurrent layers capture sequential features and the convolutional layers capture local dependencies, which yields a richer and more accurate representation of each short text. Because our main focus is on clustering Persian short text, we conduct several experiments in which DSN-STC outperforms other approaches in clustering accuracy (ACC) and normalized mutual information (NMI). To further test the architecture's generalizability and adaptability to other languages, we also evaluate DSN-STC on two English benchmark datasets, where it consistently outperforms previous approaches on both metrics. These results highlight the model's ability to learn robust, cluster-aware feature representations that are highly useful for effective short text clustering.
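The abstract reports results in terms of ACC and NMI. As a rough illustration of how these two metrics are commonly computed, here is a minimal pure-Python sketch. It is not taken from the paper: the function names `clustering_accuracy` and `nmi` are hypothetical, the brute-force search over label mappings is only practical for a handful of clusters (the Hungarian algorithm is the usual choice at scale), and the geometric-mean normalization of NMI is an assumption, since the abstract does not say which variant the authors use.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def clustering_accuracy(y_true, y_pred):
    """Fraction of items correctly labelled under the best one-to-one
    mapping from predicted cluster ids to true labels (brute force)."""
    true_ids = list(set(y_true))
    pred_ids = list(set(y_pred))
    # Pad with dummy labels when there are more clusters than classes.
    pad = [object() for _ in range(max(0, len(pred_ids) - len(true_ids)))]
    candidates = true_ids + pad
    counts = Counter(zip(y_pred, y_true))
    best = max(
        sum(counts[(p, t)] for p, t in zip(pred_ids, perm))
        for perm in permutations(candidates, len(pred_ids))
    )
    return best / len(y_true)


def nmi(y_true, y_pred):
    """Mutual information divided by the geometric mean of the two
    entropies (assumed normalization; other variants exist)."""
    n = len(y_true)
    ct, cp = Counter(y_true), Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    h_true = -sum(c / n * log(c / n) for c in ct.values())
    h_pred = -sum(c / n * log(c / n) for c in cp.values())
    if h_true == 0 or h_pred == 0:
        # Degenerate case: at least one side has a single cluster.
        return 1.0 if h_true == h_pred else 0.0
    return mi / sqrt(h_true * h_pred)


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2]      # toy ground-truth classes
    clusters = [1, 1, 0, 0, 2, 2]    # same partition, permuted ids
    print(clustering_accuracy(labels, clusters), nmi(labels, clusters))
```

Both metrics ignore the arbitrary numbering of clusters, which is why a permuted but otherwise identical partition scores as a perfect match.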
Suggested Citation
Mahdi Molaei & Mohammad-Reza Feizi-Derakhshi & Mohammad-Ali Balafar & Jafar Tanha, 2026.
"DSN-STC: Leveraging Siamese networks for optimized short text clustering,"
PLOS ONE, Public Library of Science, vol. 21(1), pages 1-29, January.
Handle:
RePEc:plo:pone00:0335709
DOI: 10.1371/journal.pone.0335709