Author
Chen Liu
Abstract
In this study, we used unidirectional and bidirectional long short-term memory (LSTM) deep learning networks for Chinese news classification and characterized the effects of contextual information on text classification, achieving a high level of accuracy. A Chinese glossary was created using jieba (a word segmentation tool), stop-word removal, and word frequency analysis. Next, word2vec was used to map the processed words into word vectors, creating a convenient lookup table of word vectors that could be used as feature inputs for the LSTM model. A bidirectional LSTM (BiLSTM) network was used to extract features from the word vectors, allowing information to flow to the hidden layer in both the forward and backward directions. Subsequently, an LSTM network was used to integrate all the outputs of the BiLSTM network, with the final output of the LSTM treated as the feature-vector representation of the text. The output feature vectors were then connected to a fully connected layer, which acted as a classifier over the integrated features and produced the final news category. The parameters of the model were optimized based on the loss between the true and predicted values using the adaptive moment estimation (Adam) optimizer. Additionally, multiple dropout layers were added to the model to reduce overfitting. As text classification models for Chinese news articles, the BiLSTM and unidirectional LSTM models obtained F1-scores of 94.15% and 93.16%, respectively, with the former outperforming the latter in terms of feature extraction.
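The abstract describes a full pipeline: jieba segmentation with stop-word removal, word2vec embeddings used as a lookup table, a BiLSTM feature extractor followed by an LSTM whose final output feeds a fully connected classifier, trained with Adam and regularized with dropout. The following is a minimal sketch of such a pipeline using jieba, gensim, and Keras; the stop-word list, sample documents, layer sizes, sequence length, number of classes, and dropout rates are illustrative assumptions, not the settings reported in the paper.

import jieba
import numpy as np
from gensim.models import Word2Vec
from tensorflow.keras import initializers, layers, models, optimizers

# --- 1. Segment Chinese text with jieba and drop stop words (placeholder data) ---
stop_words = {"的", "了", "在"}                                # placeholder stop-word list
raw_documents = ["央行发布新的货币政策", "球队在决赛中获胜"]      # tiny placeholder corpus
labels = np.array([0, 1])                                      # placeholder category labels

def tokenize(text):
    return [w for w in jieba.lcut(text) if w.strip() and w not in stop_words]

corpus = [tokenize(doc) for doc in raw_documents]

# --- 2. Train word2vec and build an embedding lookup table ---
EMB_DIM, MAX_LEN, NUM_CLASSES = 128, 200, 10                   # assumed hyperparameters
w2v = Word2Vec(corpus, vector_size=EMB_DIM, window=5, min_count=1)
vocab = {w: i + 1 for i, w in enumerate(w2v.wv.key_to_index)}  # index 0 reserved for padding
emb_matrix = np.zeros((len(vocab) + 1, EMB_DIM))
for w, i in vocab.items():
    emb_matrix[i] = w2v.wv[w]

def encode(tokens):
    ids = [vocab.get(w, 0) for w in tokens][:MAX_LEN]          # unknown words fall back to padding
    return ids + [0] * (MAX_LEN - len(ids))

# --- 3. BiLSTM feature extraction -> LSTM feature integration -> dense classifier ---
model = models.Sequential([
    layers.Embedding(len(vocab) + 1, EMB_DIM, mask_zero=True,
                     embeddings_initializer=initializers.Constant(emb_matrix)),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),  # context in both directions
    layers.Dropout(0.5),                                            # dropout to reduce overfitting
    layers.LSTM(64),                    # integrates the BiLSTM outputs; final output is the text vector
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),                # fully connected classifier
])
model.compile(optimizer=optimizers.Adam(learning_rate=1e-3),        # Adam minimizes the classification loss
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

X = np.array([encode(doc) for doc in corpus])
model.fit(X, labels, epochs=3, batch_size=2)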
Suggested Citation
Chen Liu, 2024. "Long short-term memory (LSTM)-based news classification model," PLOS ONE, Public Library of Science, vol. 19(5), pages 1-23, May.
Handle: RePEc:plo:pone00:0301835
DOI: 10.1371/journal.pone.0301835