Measuring the effects of emojis on Turkish context in sentiment analysis


Yurtoz C. U., PARLAK İ. B.

7th International Symposium on Digital Forensics and Security (ISDFS), Barcelona, Spain, 10 - 12 June 2019

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/isdfs.2019.8757554
  • City of Publication: Barcelona
  • Country of Publication: Spain
  • Keywords: sentiment detection, emoji classification, natural language processing, machine learning, social networks
  • Affiliated with Galatasaray Üniversitesi: Yes

Abstract

Automatic sentiment detection is considered a complex problem in social applications. In information security, emojis are used in several interfaces for user authentication, anthropomorphic secure access and remote communication. The use of emojis in multimodal information raises new challenges in complex networks and mobile security applications. The fast growth of social media, microblogs and floods broadens the scope of sentiment analysis, making the extraction of emotions from user posts a cutting-edge task. Opinion mining therefore becomes a crucial step in analysing the social behaviour of individuals or groups and in detecting trends. In current applications, the language of emojis is considered a common medium, or an interlingua, for expressing ideas or intensifying feelings. However, few studies have examined its effects in the Turkish context for overlapping and separate senses. In this study, emojis have been classified as a parameter of textual descriptions of emotions in the Turkish language. The emotion analysis has been performed with Support Vector Machines (SVM) and multinomial Naive Bayes (NB), using training and test sets derived from a Twitter corpus. The corpus has been prepared and preprocessed by generating the classifiers: groups and emotions. A neutral emotion state has also been added to compare accuracy levels in classification. The use of a corpus in a generic domain presents a promising field in which different emotion states have been measured. The evaluation scores indicate that SVM would perform better and that neutral emotional emojis might decrease total accuracy in the Turkish language.