Detection of Malware using the DHP Algorithm and Logistic Regression Analysis

Yeongji Ju
Juhyun Shin

Keywords

Detection of Malware, Association rule mining, DHP Algorithm, Logistic Regression Analysis

Abstract

The proliferation of computer networks has helped to further develop the software industry. However,
this has been accompanied by an increase in both the amount and the variety of malware. Therefore,
research efforts have been directed towards detecting malware behavior and identifying specific executable
files based on their Application Programming Interface (API) data. The majority of contemporary
antivirus programs employ a signature detection technique; however, the number of signatures
is very limited whereas the number of malware samples is increasing rapidly, which leads to a very high
false detection rate. To address this issue, this paper proposes a malware analysis and detection
method using an association rule mining algorithm and logistic regression analysis. By using the
Direct Hashing and Pruning (DHP) algorithm, the API calls of malware and of benign code within
Portable Executable (PE) files are compiled into a hash table. Association rules are then mined to
group the patterns. The association rule patterns extracted through this research reduced false detection
rates when classification was carried out using logistic regression analysis, and the discrimination
result was shown to be greater than 0.7.
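
The hashing and pruning step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each PE file has already been reduced to a set of imported API names. The sample data, the toy hash function `bucket_of`, the bucket count, and the support threshold are all hypothetical choices made for illustration. In the first pass, every 2-item subset of a transaction is hashed into a bucket, and a candidate pair is kept only if its bucket count reaches the minimum support.

```python
from collections import Counter
from itertools import combinations


def bucket_of(pair, n_buckets):
    # Toy deterministic hash (assumption): sum of character codes modulo the table size.
    return sum(ord(ch) for name in pair for ch in name) % n_buckets


def dhp_frequent_pairs(transactions, min_support, n_buckets=7):
    """Return {pair: support count} for frequent 2-itemsets using DHP-style pruning."""
    item_counts = Counter()
    bucket_counts = [0] * n_buckets

    # Pass 1: count single items and hash every 2-item subset into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, n_buckets)] += 1

    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_support)

    # Pruning: a bucket count is an upper bound on the support of any pair hashed
    # into it, so pairs that land in a light bucket cannot be frequent.
    candidates = {
        pair
        for pair in combinations(frequent_items, 2)
        if bucket_counts[bucket_of(pair, n_buckets)] >= min_support
    }

    # Pass 2: exact support counts for the surviving candidates only.
    keep = set(frequent_items)
    pair_counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t) & keep), 2):
            if pair in candidates:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Hypothetical API-call sets, one per PE file.
samples = [
    ["CreateFileW", "WriteFile", "RegSetValueExW"],
    ["CreateFileW", "WriteFile", "VirtualAlloc"],
    ["CreateFileW", "WriteFile"],
    ["VirtualAlloc", "RegSetValueExW"],
]
frequent = dhp_frequent_pairs(samples, min_support=2)
print(frequent)  # {('CreateFileW', 'WriteFile'): 3}
```

Frequent patterns found this way could then be encoded as binary features (pattern present or absent in a file) for a logistic regression classifier. That encoding is one plausible reading of the abstract, not a detail it states.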