Publication Details

Is Spam Visible in Flow-Level Statistics?

ŽÁDNÍK Martin and MICHLOVSKÝ Zbyněk. Is Spam Visible in Flow-Level Statistics? Prague: CESNET National Research and Education Network, 2009. ISBN 978-80-904173-4-2.

Czech title

Projevuje se spam ve statistikách síťových toků?

Type

technical report

Language

English

Authors

Žádník Martin, Ing., Ph.D. (DCSY FIT BUT)
Michlovský Zbyněk, Ing. (DITS FIT BUT)

Keywords

network measurement, spam, identification, characteristics

Abstract

This paper investigates the feasibility of detecting spam connections using only flow statistics collected from SMTP connections. To this end, the paper analyzes several days of SMTP communication collected at a middle-sized email server. To demonstrate that spam connections can be identified automatically at the TCP/IP layer, we use a supervised learning algorithm to construct a classifier, in our case a decision tree. The quality of the classifier is evaluated, and the results show that flow-based statistics contain a detectable fingerprint specific to spam connections. This finding may support a broader study of spam behavior, because flow statistics can be collected online on backbone links, where SMTP traffic for more than one email server is visible.

Published

2009

Pages

67-78

ISBN

978-80-904173-4-2

Publisher

CESNET National Research and Education Network

Place

Prague, CZ

BibTeX

@TECHREPORT{FITPUB9277,
   author = "Martin \v{Z}\'{a}dn\'{i}k and Zbyn\v{e}k Michlovsk\'{y}",
   title = "Is Spam Visible in Flow-Level Statistics?",
   pages = "67--78",
   year = 2009,
   location = "Prague, CZ",
   publisher = "CESNET National Research and Education Network",
   ISBN = "978-80-904173-4-2",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9277"
}