
TS Corpus Project: An online Turkish Dictionary and TS DIY Corpus

Author

Listed:
  • Taner Sezer

    (Mersin University)

Abstract

TS Corpus is a free and independent project that aims to build Turkish corpora, NLP tools and linguistic datasets. Since 2011, 10 corpora, various NLP tools, a large dataset and an online dictionary have been released. This paper focuses on the "online dictionary" and the "TS do-it-yourself corpus" released by the project. The dictionary data is based on the TDK (Turkish Language Society) Contemporary Dictionary, but the published dictionary offers many enhanced functions at the user-interface level. The main contribution of the study lies in the results presented to users in response to their queries: the collocations and tri-grams of the searched keyword. The collocations are harvested from a large Turkish corpus of more than 760 million tokens, and the tri-grams are generated from Turkish Wikipedia pages. The do-it-yourself corpus (TS DIY Corpus) allows users to build their own corpora, modify or delete the uploaded texts, and run queries. Queries may be run in different modes, such as "as is", "starting/ending with" or "including"; in addition, an advanced query option allows users to search by part-of-speech tags and lemmas. The results are given in KWIC (keyword in context) format. Various text classification options, such as publication date, author, domain and genre, can be selected during corpus creation. As the number of available Turkish corpora is limited, TS DIY Corpus is a candidate to become a useful, well-known and widely used tool for scholars and researchers who want to use a Turkish corpus or study Turkish texts of their own.
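
The query modes, KWIC output and tri-gram listing described in the abstract can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names (`matches`, `kwic`, `trigrams_with`), the mode labels, the whitespace tokenization and the sample sentence are all assumptions made for illustration, and Turkish-aware case folding, part-of-speech tags and lemmas are left out.

```python
from collections import Counter

# Hypothetical mode labels standing in for the "as is",
# "starting/ending with" and "including" modes named in the abstract.
def matches(token, query, mode="as_is"):
    """Return True if token satisfies query under the given mode."""
    if mode == "as_is":
        return token == query
    if mode == "starts_with":
        return token.startswith(query)
    if mode == "ends_with":
        return token.endswith(query)
    if mode == "including":
        return query in token
    raise ValueError(f"unknown mode: {mode}")

def kwic(tokens, query, mode="as_is", window=3):
    """Return (left context, keyword, right context) for every hit."""
    hits = []
    for i, tok in enumerate(tokens):
        if matches(tok, query, mode):
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

def trigrams_with(tokens, keyword):
    """Count the tri-grams (as tuples) that contain keyword as a token."""
    grams = zip(tokens, tokens[1:], tokens[2:])
    return Counter(g for g in grams if keyword in g)

# Toy input; a real corpus would need proper tokenization
# and Turkish-aware case handling (dotted and dotless i).
tokens = "kitap okumak güzel bir alışkanlık ve kitaplar iyi bir dost".split()

for left, key, right in kwic(tokens, "kitap", mode="starts_with", window=2):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>20} [{key}] {right}")
```

Right-aligning the left context keeps the keyword in a fixed column, which is the usual reason for presenting concordance lines in KWIC form.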

Suggested Citation

  • Taner Sezer, 2017. "TS Corpus Project: An online Turkish Dictionary and TS DIY Corpus," European Journal of Language and Literature Studies Articles, Revistia Research and Publishing, vol. 3, September.
  • Handle: RePEc:eur:ejlsjr:130
    DOI: 10.26417/ejls.v9i1.p18-24

    Download full text from publisher

    File URL: https://revistia.org/index.php/ejls/article/view/5848
    Download Restriction: no

    File URL: https://revistia.org/files/articles/ejls_v3_i3_17/Taner.pdf
    Download Restriction: no

