Machine Learning of Social Media Data on a Spatio-Temporal Basis


Yeşilbaş B., PARLAK İ. B., ACARMAN T.

8th International Conference on Inventive Communication and Computational Technologies, ICICCT 2024, Coimbatore, India, 14 - 15 June 2024, vol.23 LNNS, pp.419-429, (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • Volume number: 23 LNNS
  • DOI: 10.1007/978-981-97-7710-5_31
  • City of Publication: Coimbatore
  • Country of Publication: India
  • Pages: pp.419-429
  • Keywords: Deep learning, Disaster management, Machine learning, Natural language processing, Text classification, Topic modeling
  • Affiliated with Galatasaray Üniversitesi: Yes

Abstract

This study uses crowd-sourced social media data to extract information about emergency situations and needs after an earthquake. Using a semi-supervised method, the data were labeled as either rescue or non-rescue in order to reach a high level of detection accuracy. Rescue situations after the earthquake are detected on a spatial and temporal basis, using the location and time information provided by tweets. Two destructive earthquakes of magnitudes Mw 7.7 and Mw 7.6 occurred on February 6, 2023, in southeastern Türkiye. 53,537 people died, 107,213 people were injured, and several buildings were damaged. A total of 2.5 million tweets were collected from February 6 to February 28, 2023. For labeling, nine BERT language models, which are based on attention and transformers, were used. Supervised learning methods, including logistic regression, support vector machines, decision trees, multinomial Naïve Bayes, and XGBoost, were applied to assess the precision of the labels and to perform classification. The dataset was also processed with deep learning methods: convolutional neural networks, deep neural networks, and long short-term memory networks. The accuracy of the methods in detecting rescue and non-rescue situations is compared, and keywords are extracted on a spatio-temporal basis to determine hazard situations and emergency needs for coordination purposes. For detecting the rescue and non-rescue classes, the deep learning and BERT models reach recall values of 0.8912 and 0.9792, respectively.
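
To make the idea of spatio-temporal keyword extraction concrete, the following is a minimal sketch, not the authors' implementation. It assumes tweets have already been labeled as rescue or non-rescue, and it groups the rescue tweets by place and day before counting their most frequent words. The record fields (`text`, `city`, `time`, `label`), the toy tweets, the stopword list, and the function name `keywords_by_place_and_day` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy records; field names and contents are illustrative assumptions,
# not data from the study.
TWEETS = [
    {"text": "need rescue team under rubble", "city": "Hatay",
     "time": "2023-02-06T05:10:00", "label": "rescue"},
    {"text": "rescue needed rubble collapsed building", "city": "Hatay",
     "time": "2023-02-06T09:30:00", "label": "rescue"},
    {"text": "tents and blankets needed", "city": "Adiyaman",
     "time": "2023-02-07T12:00:00", "label": "rescue"},
    {"text": "praying for everyone", "city": "Hatay",
     "time": "2023-02-06T10:00:00", "label": "non-rescue"},
]

# A tiny assumed stopword list; a real pipeline would use a language-specific one.
STOPWORDS = {"and", "for", "the", "under", "need", "needed"}


def keywords_by_place_and_day(tweets, label="rescue", top_n=2):
    """Return the top_n most frequent words per (city, day) for one label."""
    counts = defaultdict(Counter)
    for tweet in tweets:
        if tweet["label"] != label:
            continue  # keep only tweets of the requested class
        day = datetime.fromisoformat(tweet["time"]).date().isoformat()
        tokens = [w for w in tweet["text"].lower().split() if w not in STOPWORDS]
        counts[(tweet["city"], day)].update(tokens)
    return {key: [word for word, _ in counter.most_common(top_n)]
            for key, counter in counts.items()}


if __name__ == "__main__":
    for (city, day), words in keywords_by_place_and_day(TWEETS).items():
        print(city, day, words)
```

Grouping by a (place, day) key is only one possible granularity; finer time windows or coordinates instead of city names would follow the same pattern.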