Thesis Details
Classification of Potentially Malicious File Clusters via Machine Learning
This thesis proposes an alternative to current file-level malware classification approaches, which are often based on detecting specific byte sequences. The experiments showed that classification at the cluster level, based on the properties shared by the files in a cluster, is possible. This was achieved by carefully selecting properties for three file types: PE, APK and .NET. Various machine learning methods were compared and the highest-scoring classifiers were selected. A web service providing a classification API was then implemented and integrated with Avast's internal clustering system. The thesis also discusses the drawbacks of the proposed approach and suggests steps for improving the classification.
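As a rough illustration of what classifying a cluster by the properties its files share could look like, the following minimal sketch turns each cluster into a vector holding the fraction of its files that exhibit each property, then labels a new cluster by its nearest class centroid. Everything here is an assumption made for illustration: the property names, the toy data, and the nearest-centroid rule are hypothetical and are not taken from the thesis, which compared several machine learning methods that are not listed in this abstract.

```python
from collections import Counter

# Hypothetical property names; the thesis selects its own properties per file type.
PROPERTIES = ["packed", "signed", "imports_network"]


def cluster_features(files, properties):
    """Return, for each property, the fraction of files in the cluster that have it."""
    counts = Counter(p for file_props in files for p in set(file_props))
    return [counts[p] / len(files) for p in properties]


def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(labelled_clusters, properties):
    """Build one centroid per label from (files, label) pairs."""
    by_label = {}
    for files, label in labelled_clusters:
        by_label.setdefault(label, []).append(cluster_features(files, properties))
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def classify(model, files, properties):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    x = cluster_features(files, properties)
    return min(
        model,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, model[label])),
    )


# Toy training data: each cluster is a list of files, each file a set of properties.
training = [
    ([{"packed", "imports_network"}, {"packed"}, {"packed", "imports_network"}], "malicious"),
    ([{"signed"}, {"signed", "imports_network"}], "clean"),
]
model = train(training, PROPERTIES)

unknown_cluster = [{"packed"}, {"packed", "signed"}]
prediction = classify(model, unknown_cluster, PROPERTIES)
```

The point of the sketch is that the decision is made once per cluster from aggregated properties, rather than once per file from byte sequences.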
machine learning, clustering, classification, antivirus, analysis, malware
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), member
Češka Milan, doc. RNDr., Ph.D. (DITS FIT BUT), member
Kreslíková Jitka, doc. RNDr., CSc. (DIFS FIT BUT), member
Szőke Igor, Ing., Ph.D. (DCGM FIT BUT), member
@bachelorsthesis{FITBT21927,
  author = "Patrik Holop",
  type = "Bachelor's thesis",
  title = "Classification of Potentially Malicious File Clusters via Machine Learning",
  school = "Brno University of Technology, Faculty of Information Technology",
  year = 2019,
  location = "Brno, CZ",
  language = "english",
  url = "https://www.fit.vut.cz/study/thesis/21927/"
}