5th International Conference on Smart City Applications, SCA 2020, Karabük, Türkiye, 7-9 October 2020, vol. 183, pp. 1503-1517 (Full-Text Paper)
Over the last decade, the development and popularization of communication tools, together with the evolution of smart media applications on mobile devices, have transformed language. Emojis, the lingua franca of social media, have prompted new models for analyzing language pragmatics. Although short modalities for conveying sentiment existed before, the uniformity of emojis across all languages has created new challenges, especially in the sentiment spectrum of exchanged utterances. The discrete nature of emojis also offers a new way to label the state of a sentence, to cluster the lexical features assigned to emojis, and to compare them across multilingual corpora. Smart text processing therefore needs to treat emojis as a multimodal bias that reveals ideas and feelings in unimodal or multimodal approaches. In this study, we created a bilingual Twitter corpus for Arabic and Turkish. Few studies in the literature compare the effects of emojis in both languages. Although the two languages belong to different language families, similar lexical features become critical for sentiment analysis in textual context. We therefore aimed to analyze sentiment classification on Arabic and Turkish tweets. Emojis were classified as priors of text features. Sentiment analysis was carried out using multinomial Naive Bayes (NB) and Support Vector Machines (SVM). Our corpus was built from generic expressions, without restriction to a specific domain. The evaluation metrics indicate better classification performance for Turkish with SVM, whereas NB is more accurate for Arabic.
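As a rough illustration of the multinomial NB side of such a pipeline, the sketch below trains a Laplace-smoothed multinomial Naive Bayes classifier in which emojis are kept as ordinary tokens alongside words. It is not the authors' implementation: the whitespace tokenizer, the function names (`tokenize`, `train_multinomial_nb`, `predict`), the smoothing value, and the four toy tweets are all assumptions made for the example, and the SVM counterpart is not shown.

```python
import math
from collections import Counter, defaultdict

# Invented toy data for illustration only; NOT taken from the paper's corpus.
TRAIN = [
    ("great day 😀", "pos"),
    ("love this 😍", "pos"),
    ("terrible service 😡", "neg"),
    ("so sad 😢", "neg"),
]

def tokenize(text):
    # Assumption: whitespace tokenization, with each emoji kept as its own token.
    return text.lower().split()

def train_multinomial_nb(samples, alpha=1.0):
    """Fit class priors and per-class token counts from (text, label) pairs."""
    class_docs = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for tok in tokenize(text):
            token_counts[label][tok] += 1
            vocab.add(tok)
    total_docs = sum(class_docs.values())
    return {
        "alpha": alpha,
        "vocab": vocab,
        "token_counts": token_counts,
        "log_prior": {c: math.log(n / total_docs) for c, n in class_docs.items()},
        "class_totals": {c: sum(token_counts[c].values()) for c in class_docs},
    }

def predict(model, text):
    """Return the label with the highest log posterior for the given text."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        denom = model["class_totals"][label] + alpha * vocab_size
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # tokens never seen in training are ignored
            count = model["token_counts"][label][tok]
            score += math.log((count + alpha) / denom)  # Laplace smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_multinomial_nb(TRAIN)
print(predict(model, "what a day 😀"))
print(predict(model, "service 😡"))
```

In this toy setting an emoji seen only with one class pulls the prediction toward that class, which is one simple way emoji tokens can contribute to sentiment classification. A real experiment would need language-aware tokenization for Arabic and Turkish and a held-out evaluation set.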