Taking Advantage of Turkish Characteristic Features to Achieve Authorship Attribution Problems for Turkish


Saygili N. S., Amghar T., Levrat B., Acarman T.

25th Signal Processing and Communications Applications Conference (SIU), Antalya, Türkiye, 15-18 May 2017

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/siu.2017.7960438
  • City of Publication: Antalya
  • Country of Publication: Türkiye
  • Affiliated with Galatasaray Üniversitesi: Yes

Abstract

The rapid growth in electronic and online texts, such as e-mails, online newspapers and magazines, blog posts, and online forum messages, has accelerated research on authorship attribution. Although studies are not as abundant as for English, a considerable body of work on author identification in Turkish has appeared over the last fifteen years. This study has two parts. The first is a brief review of Turkish authorship attribution studies, focused on the stylometric features that allow authors to be distinguished from one another. In the second part, we analyze the main characteristics of the Turkish language and describe our first experiments on Turkish corpora. These experiments exploit characteristic features of Turkish by using gerund frequencies as features, with Support Vector Machines as the learning algorithm.
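
To illustrate what a gerund-frequency feature might look like, here is a minimal Python sketch. It is not the authors' implementation. The suffix list `GERUND_SUFFIXES`, the helper `turkish_lower`, and the function `gerund_frequencies` are all hypothetical names chosen for this example, and the suffix list is a small, non-exhaustive assumption. Matching on word endings without a morphological analyzer will also miscount some words (for example, "erken" ends in "ken" but is not a gerund).

```python
import re

# Assumption: a small, non-exhaustive set of Turkish gerund (converb) endings.
# The paper's actual feature set is not given in the abstract.
GERUND_SUFFIXES = sorted(
    ["arak", "erek", "ınca", "ince", "unca", "ünce",
     "madan", "meden", "ken", "ıp", "ip", "up", "üp"],
    key=len,
    reverse=True,  # try longer endings first
)


def turkish_lower(text):
    """Lowercase text using the Turkish dotted/dotless i convention."""
    return text.translate(str.maketrans("Iİ", "ıi")).lower()


def gerund_frequencies(text):
    """Return the relative frequency of each suffix among the word tokens."""
    tokens = re.findall(r"[^\W\d_]+", turkish_lower(text))
    counts = {suffix: 0 for suffix in GERUND_SUFFIXES}
    for token in tokens:
        for suffix in GERUND_SUFFIXES:
            # Require at least two characters of stem before the ending.
            if token.endswith(suffix) and len(token) >= len(suffix) + 2:
                counts[suffix] += 1
                break
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {suffix: n / total for suffix, n in counts.items()}


sample = "Koşarak geldi ve gülerek konuştu. Eve gelince kitabı alıp okudu."
features = gerund_frequencies(sample)
```

Each text then yields one fixed-length vector of relative frequencies. Vectors of this kind, labeled by author, could be passed to a Support Vector Machine implementation such as scikit-learn's `SVC`. How the study itself extracts and weights its features is not stated in the abstract.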