Publication Details
Multi-level Sequence Mining Based on GSP
Hlosta Martin, Ing. (DIFS FIT BUT)
Kupčík Jan, Ing. (DIFS FIT BUT)
Zendulka Jaroslav, doc. Ing., CSc. (DIFS FIT BUT)
Hruška Tomáš, prof. Ing., CSc. (DIFS FIT BUT)
Sequence pattern mining, GSP, taxonomy
Mining sequential patterns is an important problem in data mining, and many algorithms and optimization techniques have been published to address it. One of them, the GSP algorithm, can mine sequential patterns under additional constraints, such as gaps between items.
The items in sequences may be organized in a taxonomy, which can be used to mine sequential patterns containing items from several hierarchical levels. If a more general item appears in a pattern, the pattern has support at least as high as the pattern containing the corresponding more specific item. This allows more patterns to be mined with the same minimum support parameter and can reveal new, potentially useful patterns. This paper presents a method for mining multi-level sequential patterns. The method is based on the GSP algorithm and on generalization of more specific sequences using information theory.
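The support property described in the abstract can be illustrated with a minimal Python sketch. It is not the method from the paper: the taxonomy, the sequence database, and the function names are hypothetical, sequences are simplified to lists of single items, and GSP constraints such as gaps as well as the information-theoretic generalization step are left out.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy taxonomy: child -> parent (None marks a root item).
TAXONOMY: Dict[str, Optional[str]] = {
    "beverage": None,
    "food": None,
    "cola": "beverage",
    "juice": "beverage",
    "bread": "food",
}


def ancestors_or_self(item: str) -> Set[str]:
    """Return the item together with all of its ancestors in the taxonomy."""
    result: Set[str] = set()
    current: Optional[str] = item
    while current is not None:
        result.add(current)
        current = TAXONOMY.get(current)
    return result


def contains(sequence: List[str], pattern: List[str]) -> bool:
    """Check whether the pattern occurs in the sequence as a subsequence.

    A pattern item matches a sequence item if it is that item itself
    or one of its ancestors (a more general item).
    """
    pos = 0
    for item in sequence:
        if pos < len(pattern) and pattern[pos] in ancestors_or_self(item):
            pos += 1
    return pos == len(pattern)


def support(database: List[List[str]], pattern: List[str]) -> float:
    """Fraction of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database) / len(database)


# Hypothetical sequence database, assumed only for this illustration.
DATABASE = [
    ["bread", "cola"],
    ["bread", "juice"],
    ["cola", "bread"],
    ["bread", "cola"],
]

specific = support(DATABASE, ["bread", "cola"])
general = support(DATABASE, ["bread", "beverage"])
print(specific, general)
```

In this toy data the specific pattern (bread, cola) is contained in two of the four sequences, while the generalized pattern (bread, beverage) is contained in three. With a minimum support of 0.6, only the generalized pattern would be reported as frequent.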
@INPROCEEDINGS{FITPUB9647,
   author = "Michal \v{S}ebek and Martin Hlosta and Jan Kup\v{c}\'{i}k and Jaroslav Zendulka and Tom\'{a}\v{s} Hru\v{s}ka",
   title = "Multi-level Sequence Mining Based on GSP",
   pages = "185--190",
   booktitle = "Proceedings of the Eleventh International Conference on Informatics INFORMATICS'2011",
   series = "1",
   year = 2011,
   location = "Ko\v{s}ice, SK",
   publisher = "Faculty of Electrical Engineering and Informatics, University of Technology Ko\v{s}ice",
   ISBN = "978-80-89284-94-8",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9647"
}