Malware classification based on API calls and behaviour analysis


Pektas A., Acarman T.

IET INFORMATION SECURITY, vol.12, no.2, pp.107-117, 2018 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 12 Issue: 2
  • Publication Date: 2018
  • Doi Number: 10.1049/iet-ifs.2017.0430
  • Journal Name: IET INFORMATION SECURITY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.107-117
  • Keywords: learning (artificial intelligence), data mining, application program interfaces, pattern classification, invasive software, malware classification accuracy, baseline classifiers, online machine learning algorithms, classification model, malicious API pattern extraction, Voting Experts algorithm, behaviour-based features, API call sequences, n-gram, mining, application programming interface call, Windows malware, runtime behaviour-based classification procedure, behaviour analysis, malware classification
  • Galatasaray University Affiliated: Yes

Abstract

This study presents a runtime behaviour-based classification procedure for Windows malware. Runtime behaviours are extracted with a particular focus on identifying malicious sequences of application programming interface (API) calls, in addition to file, network and registry activities. Mining and searching n-grams over API call sequences is introduced to discover episodes that represent the behaviour-based features of malware. The Voting Experts algorithm is used to extract malicious API patterns from the API calls. The classification model is built by applying online machine learning algorithms and is compared with baseline classifiers. The model is trained and tested on a set of 17,400 malware samples belonging to 60 distinct families and 532 benign samples. The malware classification accuracy reaches 98%.
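
To illustrate the n-gram step mentioned in the abstract, here is a minimal sketch of counting contiguous n-grams over an API call trace. It is not the authors' implementation: the function names `api_ngrams` and `ngram_counts` and the sample trace are hypothetical, and the sketch covers only plain n-gram counting, not the Voting Experts segmentation or the online classifiers.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def api_ngrams(calls: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous n-gram in one API call trace, in order."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A trace shorter than n yields an empty list.
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def ngram_counts(traces: Iterable[List[str]], n: int) -> Counter:
    """Count how often each n-gram occurs across a collection of traces."""
    counts: Counter = Counter()
    for trace in traces:
        counts.update(api_ngrams(trace, n))
    return counts


# Hypothetical trace of Windows API call names, for illustration only.
trace = [
    "CreateFileW", "WriteFile", "CloseHandle",
    "CreateFileW", "WriteFile", "RegSetValueExW",
]
counts = ngram_counts([trace], 2)
```

In a sketch like this, the resulting counts could serve as sparse features for a classifier; how the paper selects, weights or filters n-grams is not stated in the abstract.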