Thesis Details

Classification of Potentially Malicious File Clusters via Machine Learning

Bachelor's Thesis
Student: Holop Patrik
Academic Year: 2018/2019
Supervisor: Bartík Vladimír, Ing., Ph.D.

Language

English

Abstract

This thesis proposes an alternative to current file-level malware classification approaches, which are often based on detecting specific byte sequences. The experiments showed that cluster-level classification based on the properties shared by the files in a cluster is possible. This was achieved by carefully selecting properties of three file types: PE, APK and .NET. Various machine learning methods were compared, the highest-scoring classifiers were selected, and a web service providing a classification API was implemented and used for integration with the internal clustering system of Avast. The thesis also discusses drawbacks of the proposed approach and suggests steps for improving the classification.

Keywords

machine learning, clustering, classification, antivirus, analysis, malware

Department

Department of Information Systems FIT BUT

Degree Programme

Information Technology

Status

defended, grade A

Date

12 June 2019

Reviewer

Zendulka Jaroslav, doc. Ing., CSc.

Committee

Hruška Tomáš, prof. Ing., CSc. (DIFS FIT BUT), chair
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), member
Češka Milan, doc. RNDr., Ph.D. (DITS FIT BUT), member
Kreslíková Jitka, doc. RNDr., CSc. (DIFS FIT BUT), member
Szőke Igor, Ing., Ph.D. (DCGM FIT BUT), member

Citation

HOLOP, Patrik. Classification of Potentially Malicious File Clusters via Machine Learning. Brno, 2019. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2019-06-12. Supervised by Bartík Vladimír. Available from: https://www.fit.vut.cz/study/thesis/21927/

BibTeX

@bachelorsthesis{FITBT21927,
    author = "Patrik Holop",
    type = "Bachelor's thesis",
    title = "Classification of Potentially Malicious File Clusters via Machine Learning",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2019,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/21927/"
}