A K-Means Algorithm Application on Big Data

Eren B., Karabulut E. C., Alptekin S. E., Alptekin G. I.

World Congress on Engineering and Computer Science, San Francisco, USA, 21-23 October 2015, pp. 814-818 (Full Text Paper)

Publication Type: Conference Paper / Full Text
City of Publication: San Francisco
Country of Publication: USA
Pages: pp. 814-818
Keywords: Big data, data mining, clustering, K-means algorithm
Affiliated with Galatasaray University: Yes

Abstract

As more data becomes available through advances in information and communication technologies, gaining knowledge and insights from this data is replacing experience- and intuition-based decision making in organizations. Big data mining can be defined as the capability of extracting useful information from massive and complex datasets or data streams. In this paper, one of the commonly used data mining algorithms, K-means, is used to extract information from a big dataset. To do so, the MapReduce framework is used with Hadoop. The dataset consists of the results of the Social Evolution experiment of the MIT Human Dynamics Lab. The aim is to derive meaningful relationships between students' eating habits and their tendency to catch colds.
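
The abstract does not describe how K-means was mapped onto MapReduce, so the following is only a minimal in-memory sketch of one common formulation, not the authors' Hadoop implementation. It assumes Euclidean distance, points given as numeric tuples, and initial centroids supplied by the caller; the function names (`nearest`, `map_phase`, `reduce_phase`, `kmeans`) are hypothetical and chosen for illustration. The map step assigns each point to its nearest centroid, and the reduce step averages the points in each cluster to produce the new centroids.

```python
import math
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def map_phase(points, centroids):
    """Map step: emit a (cluster_index, point) pair for every input point."""
    for p in points:
        yield nearest(p, centroids), p


def reduce_phase(pairs, centroids):
    """Reduce step: average the points emitted for each cluster index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # Assumption: a cluster that received no points keeps its old centroid.
    new_centroids = list(centroids)
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Repeat map and reduce until the centroids stop changing or max_iter is hit."""
    centroids = list(centroids)
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

In a real Hadoop job the map and reduce steps would run as distributed tasks over partitions of the dataset, with the current centroids shared with every mapper and one job launched per iteration; the sketch keeps everything in a single process to show only the data flow.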