IDEAS home Printed from https://ideas.repec.org/a/spr/aodasc/v10y2023i5d10.1007_s40745-021-00359-4.html

Part of Speech Tagging Using Part of Speech Sequence Graph

Author

Listed:
  • Pejman Gholami-Dastgerdi

    (University of Tabriz)

  • Mohammad-Reza Feizi-Derakhshi

    (University of Tabriz)

Abstract

Part-of-speech (POS) tagging, the assignment of the most appropriate grammatical category to each word in a text, is one of the most fundamental requirements of intelligent text processing. The main goal of this article is to provide a high-accuracy tagger for the Persian language. Numerous POS tagging methods have already been presented, each applied in taggers to achieve high performance and accuracy. Statistical methods are known as a primary technique, and one of the most important issues in POS tagging systems is identifying unknown words. This paper proposes a graph-based method that examines and corrects all the tags that the Maximum Likelihood Estimation (MLE) method assigns to the words in a text, both known and unknown. To do so, a graph is built from the part-of-speech sequences of the sentences in the training corpus. Sentences tagged with MLE are then corrected by traversing the graph. Several methods for tagging with graphs were proposed, implemented, and evaluated. After weighing their pros and cons, a method is proposed that tags unknown words with an accuracy of 86.84% and known words with an accuracy of 97.54%. The overall accuracy of the method is 96.78%, an improvement over the MLE method; the graph method therefore shows acceptable performance in part-of-speech tagging and is more reliable.
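
The pipeline described in the abstract (an MLE baseline followed by correction against a graph of tag sequences) can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or traversed, so everything below is an assumption made for illustration: the graph is a directed tag-to-tag graph with edge counts, and a tag is replaced only when its incoming edge never occurs in the training data. The English toy corpus, the tag names, and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start node, not from the paper


def train_mle(tagged_sentences):
    """MLE baseline: map each word to its most frequent training tag."""
    per_word = defaultdict(Counter)
    overall = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            per_word[word][tag] += 1
            overall[tag] += 1
    lexicon = {w: c.most_common(1)[0][0] for w, c in per_word.items()}
    default_tag = overall.most_common(1)[0][0]  # used for unknown words
    return lexicon, default_tag


def build_tag_graph(tagged_sentences):
    """Assumed graph form: graph[a][b] counts how often tag b follows tag a."""
    graph = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = START
        for _, tag in sentence:
            graph[prev][tag] += 1
            prev = tag
    return graph


def tag_mle(words, lexicon, default_tag):
    """Tag known words from the lexicon and unknown words with the default."""
    return [lexicon.get(w, default_tag) for w in words]


def correct_with_graph(tags, graph):
    """Walk the graph left to right. If the edge prev -> tag was never seen
    in training, replace tag with the most frequent successor of prev.
    This rule is an illustrative assumption, not the paper's procedure."""
    corrected = []
    prev = START
    for tag in tags:
        successors = graph.get(prev)
        if successors and tag not in successors:
            tag = successors.most_common(1)[0][0]
        corrected.append(tag)
        prev = tag
    return corrected


toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sees", "VERB"),
     ("a", "DET"), ("dog", "NOUN")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]

lexicon, default_tag = train_mle(toy_corpus)
graph = build_tag_graph(toy_corpus)

words = ["the", "cat", "purrs"]          # "purrs" is an unknown word
baseline = tag_mle(words, lexicon, default_tag)
print(baseline)                           # ['DET', 'NOUN', 'NOUN']
print(correct_with_graph(baseline, graph))  # ['DET', 'NOUN', 'VERB']
```

In this toy run the unknown word "purrs" first receives the default tag NOUN; because a NOUN-to-NOUN edge never occurs in the toy corpus, the traversal replaces it with VERB, the only observed successor of NOUN. A real tagger would need a less aggressive rule for known words and some handling of rare but valid transitions.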

Suggested Citation

  • Pejman Gholami-Dastgerdi & Mohammad-Reza Feizi-Derakhshi, 2023. "Part of Speech Tagging Using Part of Speech Sequence Graph," Annals of Data Science, Springer, vol. 10(5), pages 1301-1328, October.
  • Handle: RePEc:spr:aodasc:v:10:y:2023:i:5:d:10.1007_s40745-021-00359-4
    DOI: 10.1007/s40745-021-00359-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40745-021-00359-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xueyan Xu & Fusheng Yu & Runjun Wan, 2023. "A Determining Degree-Based Method for Classification Problems with Interval-Valued Attributes," Annals of Data Science, Springer, vol. 10(2), pages 393-413, April.
    2. Anda Tang & Pei Quan & Lingfeng Niu & Yong Shi, 2022. "A Survey for Sparse Regularization Based Compression Methods," Annals of Data Science, Springer, vol. 9(4), pages 695-722, August.
    3. Xingsen Li & Junlin Zeng & Haitao Liu & Peizhuang Wang, 2022. "Intelligent Problem Solving Model and its Cross Research Directions Based on Factor Space and Extenics," Annals of Data Science, Springer, vol. 9(3), pages 469-484, June.
    4. Hui Sun & Fanhui Zeng & Yang Yang, 2022. "Covert Factor’s Exploiting and Factor Planning," Annals of Data Science, Springer, vol. 9(3), pages 449-467, June.
    5. Xiangfu Meng & Jing Wen & Jiasheng Shi & Zihan Li & Jinxia Zhu & Peizhuang Wang, 2022. "Factor Query Language (FQL): A Fundamental Language for the Next Generation of Intelligent Database," Annals of Data Science, Springer, vol. 9(3), pages 539-554, June.
    6. Binxiang Jiang, 2022. "Research on Factor Space Engineering and Application of Evidence Factor Mining in Evidence-based Reconstruction," Annals of Data Science, Springer, vol. 9(3), pages 503-537, June.
    7. Elton G. Aráujo & Julio C. S. Vasconcelos & Denize P. Santos & Edwin M. M. Ortega & Dalton Souza & João P. F. Zanetoni, 2023. "The Zero-Inflated Negative Binomial Semiparametric Regression Model: Application to Number of Failing Grades Data," Annals of Data Science, Springer, vol. 10(4), pages 991-1006, August.
    8. Yundong Gu & Dongfen Ma & Jiawei Cui & Zhenhua Li & Yaqi Chen, 2022. "Variable-Weighted Ensemble Forecasting of Short-Term Power Load Based on Factor Space Theory," Annals of Data Science, Springer, vol. 9(3), pages 485-501, June.
    9. Heba Soltan Mohamed & M. Masoom Ali & Haitham M. Yousof, 2023. "The Lindley Gompertz Model for Estimating the Survival Rates: Properties and Applications in Insurance," Annals of Data Science, Springer, vol. 10(5), pages 1199-1216, October.
    10. Roberto Moro-Visconti & Salvador Cruz Rambaud & Joaquín López Pascual, 2023. "Artificial intelligence-driven scalability and its impact on the sustainability and valuation of traditional firms," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-14, December.
    11. M. Sridharan, 2023. "Generalized Regression Neural Network Model Based Estimation of Global Solar Energy Using Meteorological Parameters," Annals of Data Science, Springer, vol. 10(4), pages 1107-1125, August.
    12. Qinghua Zheng & Chutong Yang & Haijun Yang & Jianhe Zhou, 2020. "A Fast Exact Algorithm for Deployment of Sensor Nodes for Internet of Things," Information Systems Frontiers, Springer, vol. 22(4), pages 829-842, August.
    13. Prashant Singh & Prashant Verma & Nikhil Singh, 2022. "Offline Signature Verification: An Application of GLCM Features in Machine Learning," Annals of Data Science, Springer, vol. 9(6), pages 1309-1321, December.
    14. Shah Hussain & Muhammad Qasim Khan, 2023. "Student-Performulator: Predicting Students’ Academic Performance at Secondary and Intermediate Level Using Machine Learning," Annals of Data Science, Springer, vol. 10(3), pages 637-655, June.
    15. A. R. Sherwani & Q. M. Ali, 2023. "Parametric Classification using Fuzzy Approach for Handling the Problem of Mixed Pixels in Ground Truth Data for a Satellite Image," Annals of Data Science, Springer, vol. 10(6), pages 1459-1472, December.
    16. Hui Zheng & Peng LI & Jing HE, 2022. "A Novel Association Rule Mining Method for Streaming Temporal Data," Annals of Data Science, Springer, vol. 9(4), pages 863-883, August.
    17. Rakhal Das & Anjan Mukherjee & Binod Chandra Tripathy, 2022. "Application of Neutrosophic Similarity Measures in Covid-19," Annals of Data Science, Springer, vol. 9(1), pages 55-70, February.
    18. Muhammed Navas Thorakkattle & Shazia Farhin & Athar Ali khan, 2022. "Forecasting the Trends of Covid-19 and Causal Impact of Vaccines Using Bayesian Structural time Series and ARIMA," Annals of Data Science, Springer, vol. 9(5), pages 1025-1047, October.
    19. Siying Guo & Jianxuan Liu & Qiu Wang, 2022. "Effective Learning During COVID-19: Multilevel Covariates Matching and Propensity Score Matching," Annals of Data Science, Springer, vol. 9(5), pages 967-982, October.
    20. Tousifur Rahman & Partha Jyoti Hazarika & M. Masoom Ali & Manash Pratim Barman, 2022. "Three-Inflated Poisson Distribution and its Application in Suicide Cases of India During Covid-19 Pandemic," Annals of Data Science, Springer, vol. 9(5), pages 1103-1127, October.
