A K-Means Algorithm Application on Big Data


Eren B., Karabulut E. C., ALPTEKİN S. E., Alptekin G. I.

World Congress on Engineering and Computer Science, San Francisco, United States, 21 - 23 October 2015, pp.814-818

  • Publication Type: Conference Paper / Full Text
  • City of Publication: San Francisco
  • Country of Publication: United States
  • Pages: pp.814-818
  • Keywords: Big data, data mining, clustering, K-means algorithm
  • Affiliated with Galatasaray University: Yes

Abstract

As advances in information and communication technologies make more and more data available, knowledge and insights gained from this data are replacing experience- and intuition-based decision making in organizations. Big data mining can be defined as the capability of extracting useful information from massive and complex datasets or data streams. In this paper, one of the most commonly used data mining algorithms, K-means, is used to extract information from a big dataset. To do so, the MapReduce framework with Hadoop is used. The dataset consists of the results of the Social Evolution experiment of the MIT Human Dynamics Lab. The aim is to derive meaningful relationships between students' eating habits and their tendency to catch colds.
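
To make the combination of K-means and MapReduce mentioned in the abstract more concrete, the following is a minimal single-machine sketch in Python of how one K-means iteration is commonly split into a map step (assign each point to its nearest centroid) and a reduce step (average the points of each cluster). It is not the authors' Hadoop implementation: the function names (`map_point`, `reduce_cluster`, `kmeans_iteration`, `kmeans`) and the toy points are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, available in Python 3.8+


def map_point(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_cluster(points):
    """Reduce step: average all points that share one centroid index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """Run one map/shuffle/reduce round and return the updated centroids."""
    groups = defaultdict(list)  # stands in for the shuffle phase
    for p in points:
        key, value = map_point(p, centroids)
        groups[key].append(value)
    # Assumption: a centroid that attracts no points keeps its old position.
    return [
        reduce_cluster(groups[i]) if groups[i] else tuple(centroids[i])
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iter=100):
    """Repeat iterations until the centroids stop changing or max_iter is hit."""
    for _ in range(max_iter):
        updated = kmeans_iteration(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    # Toy two-dimensional data, invented for this example.
    toy_points = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(toy_points, [(0, 0), (10, 10)]))
```

In an actual Hadoop job, the map and reduce functions would run in parallel across data splits, and each iteration would typically be submitted as a separate job that reads the centroids produced by the previous one.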