
An intelligent model based on integrated inverse document frequency and multinomial Naive Bayes for current affairs news categorisation

Author

Listed:
  • Sachin Kumar

    (University of University)

  • Aditya Sharma

    (University of University)

  • B Kartheek Reddy

    (University of University)

  • Shreyas Sachan

    (University of University)

  • Vaibhav Jain

    (University of University)

  • Jagvinder Singh

    (Delhi Technological University)

Abstract

Digital technologies and their products and services have empowered the masses to generate information at a faster pace. Information-sharing platforms built on these technologies, such as news websites and social media platforms (Facebook, Twitter, Instagram, WhatsApp, etc.), have flooded the information space because information can be generated easily and disseminated to the masses instantly. Information classification has long been an important task, especially in newspapers and media organisations. In other areas too, information or text classification plays an important role, so that vital information can be sorted into predefined categories. In journalism, editors and resource persons were assigned the task of recognising and classifying news stories so that they could be placed in predefined categories such as economy and business news, political news, social news, the editorial section, education and career, and sports. Nowadays, classifying and segregating textual information has become challenging because of the flow of vast and diverse information. The pace of information and its updates, access, and competition among media houses have made it more challenging still. Hence, automated and intelligent tools that can classify information and text accurately and efficiently are needed to reduce human effort and time and to increase productivity. This paper presents an intelligent, efficient and robust machine learning model based on Multinomial Naive Bayes (MNB) to classify current affairs news stories. The proposed Inverse Document Frequency (IDF) integrated MNB model achieves a classification accuracy of 87.22 per cent. The experimental results are also compared with other machine learning models such as Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbours (KNN) and Random Forest (RF). The results demonstrate that the presented model is better in terms of accuracy and may be deployed in the real-world information classification and media domain to improve the productivity and efficiency of the current affairs news classification process.
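
The general idea of combining IDF weighting with Multinomial Naive Bayes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the IDF formula, the smoothing constant, the tokenisation, or the dataset, so the smoothed IDF, `alpha = 1.0`, whitespace tokenisation, the class name `IdfMultinomialNB`, and the six toy headlines below are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenisation; the paper's
    # preprocessing is not described in the abstract.
    return text.lower().split()


class IdfMultinomialNB:
    """Toy multinomial Naive Bayes over IDF-weighted term counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant (assumed value)

    def fit(self, docs, labels):
        n_docs = len(docs)
        tokenized = [tokenize(d) for d in docs]

        # Document frequency: number of documents containing each term.
        df = Counter(term for toks in tokenized for term in set(toks))
        # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1.
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vocab = sorted(self.idf)

        class_counts = Counter(labels)
        self.classes = sorted(class_counts)
        self.log_prior = {c: math.log(class_counts[c] / n_docs) for c in self.classes}

        # Accumulate IDF-weighted term counts per class.
        weight = {c: Counter() for c in self.classes}
        for toks, label in zip(tokenized, labels):
            for term, tf in Counter(toks).items():
                weight[label][term] += tf * self.idf[term]

        # Smoothed per-class log-likelihood of each vocabulary term.
        self.log_likelihood = {}
        for c in self.classes:
            total = sum(weight[c].values()) + self.alpha * len(self.vocab)
            self.log_likelihood[c] = {
                t: math.log((weight[c].get(t, 0.0) + self.alpha) / total)
                for t in self.vocab
            }
        return self

    def predict(self, doc):
        # Terms unseen during training are ignored.
        tf = Counter(t for t in tokenize(doc) if t in self.idf)
        scores = {
            c: self.log_prior[c]
            + sum(n * self.idf[t] * self.log_likelihood[c][t] for t, n in tf.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Hypothetical toy headlines, not data from the paper.
train_docs = [
    "government passes new budget bill in parliament",
    "minister announces election dates",
    "team wins cricket match by six wickets",
    "striker scores twice in football final",
    "stock market rallies as bank shares rise",
    "central bank raises interest rates",
]
train_labels = ["politics", "politics", "sports", "sports", "business", "business"]

model = IdfMultinomialNB().fit(train_docs, train_labels)
print(model.predict("parliament debates election bill"))
print(model.predict("interest rates rise"))
```

In this sketch the IDF weight scales each term's count before the Naive Bayes likelihoods are estimated, so terms that occur in many documents contribute less to the class scores than rarer, more category-specific terms.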

Suggested Citation

  • Sachin Kumar & Aditya Sharma & B Kartheek Reddy & Shreyas Sachan & Vaibhav Jain & Jagvinder Singh, 2022. "An intelligent model based on integrated inverse document frequency and multinomial Naive Bayes for current affairs news categorisation," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(3), pages 1341-1355, June.
  • Handle: RePEc:spr:ijsaem:v:13:y:2022:i:3:d:10.1007_s13198-021-01471-7
    DOI: 10.1007/s13198-021-01471-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13198-021-01471-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations



    Cited by:

    1. Sanjiban Sekhar Roy & Ali Ismail Awad & Lamesgen Adugnaw Amare & Mabrie Tesfaye Erkihun & Mohd Anas, 2022. "Multimodel Phishing URL Detection Using LSTM, Bidirectional LSTM, and GRU Models," Future Internet, MDPI, vol. 14(11), pages 1-15, November.
    2. Sachin Kumar & Zairu Nisha & Jagvinder Singh & Anuj Kumar Sharma, 2022. "Sensor network driven novel hybrid model based on feature selection and SVR to predict indoor temperature for energy consumption optimisation in smart buildings," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(6), pages 3048-3061, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sachin Kumar & Shivam Panwar & Jagvinder Singh & Anuj Kumar Sharma & Zairu Nisha, 2022. "iCACD: an intelligent deep learning model to categorise current affairs news article for efficient journalistic process," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(5), pages 2572-2582, October.
    2. Luca Zanni, 2006. "An Improved Gradient Projection-based Decomposition Technique for Support Vector Machines," Computational Management Science, Springer, vol. 3(2), pages 131-145, April.
    3. Muhammad Ateeq ur REHMAN & Furman ALI & Shang XIE, 2022. "Impact of Foreign Investment News on the Return, Cost of Equity and Cash Flow Activities," Journal for Economic Forecasting, Institute for Economic Forecasting, vol. 0(4), pages 112-127, December.
    4. Peng Han & Xinyue Yang & Yifei Zhao & Xiangmin Guan & Shengjie Wang, 2022. "Quantitative Ground Risk Assessment for Urban Logistical Unmanned Aerial Vehicle (UAV) Based on Bayesian Network," Sustainability, MDPI, vol. 14(9), pages 1-13, May.
    5. Sachin Kumar & Zairu Nisha & Jagvinder Singh & Anuj Kumar Sharma, 2022. "Sensor network driven novel hybrid model based on feature selection and SVR to predict indoor temperature for energy consumption optimisation in smart buildings," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(6), pages 3048-3061, December.
    6. Andrej Čopar & Blaž Zupan & Marinka Zitnik, 2019. "Fast optimization of non-negative matrix tri-factorization," PLOS ONE, Public Library of Science, vol. 14(6), pages 1-15, June.
    7. Andreé Vela & Joanna Alvarado-Uribe & Hector G. Ceballos, 2021. "Indoor Environment Dataset to Estimate Room Occupancy," Data, MDPI, vol. 6(12), pages 1-12, December.
    8. Hoi-Ming Chi & Okan K. Ersoy & Herbert Moskowitz & Kemal Altinkemer, 2007. "Toward Automated Intelligent Manufacturing Systems (AIMS)," INFORMS Journal on Computing, INFORMS, vol. 19(2), pages 302-312, May.
    9. Andrea Manno & Laura Palagi & Simone Sagratella, 2018. "Parallel decomposition methods for linearly constrained problems subject to simple bound with application to the SVMs training," Computational Optimization and Applications, Springer, vol. 71(1), pages 115-145, September.
    10. Ruchika Malhotra & Megha Khanna, 2023. "On the applicability of search-based algorithms for software change prediction," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 14(1), pages 55-73, February.
    11. Mustak, Mekhail & Salminen, Joni & Plé, Loïc & Wirtz, Jochen, 2021. "Artificial intelligence in marketing: Topic modeling, scientometric analysis, and research agenda," Journal of Business Research, Elsevier, vol. 124(C), pages 389-404.
    12. Tianrui Yin & Wei Chen & Bo Liu & Changzhen Li & Luyao Du, 2023. "Light “You Only Look Once”: An Improved Lightweight Vehicle-Detection Model for Intelligent Vehicles under Dark Conditions," Mathematics, MDPI, vol. 12(1), pages 1-19, December.
    13. Prabowo, Rudy & Thelwall, Mike, 2009. "Sentiment analysis: A combined approach," Journal of Informetrics, Elsevier, vol. 3(2), pages 143-157.
    14. Luminita STATE & Catalina COCIANU & Cristian USCATU & Marinela MIRCEA, 2013. "Extensions of the SVM Method to the Non-Linearly Separable Data," Informatica Economica, Academy of Economic Studies - Bucharest, Romania, vol. 17(2), pages 173-182.
    15. C. J. Lin & S. Lucidi & L. Palagi & A. Risi & M. Sciandrone, 2009. "Decomposition Algorithm Model for Singly Linearly-Constrained Problems Subject to Lower and Upper Bounds," Journal of Optimization Theory and Applications, Springer, vol. 141(1), pages 107-126, April.
    16. Guliyev, Hasraddin & Mustafayev, Eldayag, 2022. "Predicting the changes in the WTI crude oil price dynamics using machine learning models," Resources Policy, Elsevier, vol. 77(C).
    17. Andrea Manno & Laura Palagi & Simone Sagratella, 2014. "A Class of Convergent Parallel Algorithms for SVMs Training," DIAG Technical Reports 2014-17, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
    18. Jakub Horak & Tomas Krulicky & Zuzana Rowland & Veronika Machova, 2020. "Creating a Comprehensive Method for the Evaluation of a Company," Sustainability, MDPI, vol. 12(21), pages 1-23, November.
    19. Giampaolo Liuzzi & Laura Palagi & Mauro Piacentini, 2010. "On the convergence of a Jacobi-type algorithm for Singly Linearly-Constrained Problems Subject to simple Bounds," DIS Technical Reports 2010-01, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
    20. Salminen, Joni & Kandpal, Chandrashekhar & Kamel, Ahmed Mohamed & Jung, Soon-gyo & Jansen, Bernard J., 2022. "Creating and detecting fake reviews of online products," Journal of Retailing and Consumer Services, Elsevier, vol. 64(C).

