Deep learning for effective Android malware detection using API call graph embeddings

Pektas A., ACARMAN T.

SOFT COMPUTING, vol.24, no.2, pp.1027-1043, 2020 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 24 Issue: 2
  • Publication Date: 2020
  • DOI Number: 10.1007/s00500-019-03940-5
  • Title of Journal: SOFT COMPUTING
  • Page Numbers: pp.1027-1043
  • Keywords: Android malware, Deep learning, Graph embedding, Hyper-parameter tuning, API call graph


The high penetration of Android applications, along with their malicious variants, requires efficient and effective malware detection methods to secure the mobile platform. An API call sequence derived from the API call graph structure can be used to model application behavior accurately. Behaviors are extracted by following the API call graph, its branching, and the order of calls. However, identifying similarities between graphs and applying graph matching algorithms for classification is slow and difficult to adapt to a new domain, and the results may be inaccurate. In this study, the authors use the API call graph as a graph representation of all possible execution paths that malware can follow at runtime. The API call graphs are embedded into a low-dimensional numeric feature vector, which is fed to a deep neural network. Then, similarity detection for each binary function is trained and tested effectively. The study also focuses on maximizing the performance of the network by evaluating different embedding algorithms and tuning various network configuration parameters to find the best combination of hyper-parameters and reach the highest value of the statistical metrics. Experimental results show that the presented malware classification reaches 98.86% accuracy, 98.65% F-measure, 98.47% recall, and 98.84% precision.
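
The step from an API call graph to a fixed-length numeric vector can be illustrated with a minimal sketch. The abstract does not name the embedding algorithms that were evaluated, so the code below is not the authors' method: as a stand-in it samples random walks over a toy call graph and hashes consecutive call pairs into a small vector (feature hashing). The graph contents, the function names `random_walks`, `bucket`, and `embed_graph`, and the dimension of 16 are all assumptions made for illustration.

```python
import hashlib
import random

# Hypothetical toy API call graph: caller -> list of callees.
# The API names are illustrative only, not taken from the paper.
CALL_GRAPH = {
    "onCreate": ["getDeviceId", "openConnection"],
    "getDeviceId": ["sendTextMessage"],
    "openConnection": ["getOutputStream"],
    "getOutputStream": [],
    "sendTextMessage": [],
}


def random_walks(graph, walks_per_node=10, walk_length=4, seed=0):
    """Sample call sequences by following outgoing edges from every node."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            # Stop at the length limit or when the current node has no callees.
            while len(walk) < walk_length and graph.get(walk[-1], []):
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


def bucket(token, dim):
    """Map a string to a stable index; the built-in hash() is salted per run."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % dim


def embed_graph(graph, dim=16):
    """Hash consecutive call pairs from the walks into an L1-normalised vector."""
    vec = [0.0] * dim
    for walk in random_walks(graph):
        for caller, callee in zip(walk, walk[1:]):
            vec[bucket(caller + "->" + callee, dim)] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


features = embed_graph(CALL_GRAPH)
print(len(features), round(sum(features), 6))
```

In a pipeline like the one described, a vector of this kind would be computed per application and passed to a neural network classifier. A learned embedding would replace the hashing step in practice, and the embedding choice and network hyper-parameters are exactly what the study reports tuning.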