Publication Details
Generator of Synthetic Datasets for Hierarchical Sequential Pattern Mining Evaluation
Sequence pattern mining, synthetic dataset generators, taxonomy
Evaluation is an important part of algorithm design. Algorithms are typically evaluated on both real-world and synthetic datasets. Real-world datasets are appropriate for evaluating how an algorithm behaves in practice, but it is difficult to modify such a dataset so that it has particular statistics, e.g. a given number of input items. In contrast, a generated synthetic dataset allows any one statistical property of the dataset to be changed while all other statistical properties are kept fixed. In this paper, we present a procedure for generating sequence databases with taxonomies for the evaluation of hierarchical sequential pattern mining algorithms.
@INPROCEEDINGS{FITPUB10435,
  author = "Michal \v{S}ebek and Jaroslav Zendulka",
  title = "Generator of Synthetic Datasets for Hierarchical Sequential Pattern Mining Evaluation",
  pages = "289--292",
  booktitle = "Proceedings of the Twelfth International Conference on Informatics 2013",
  year = 2013,
  location = "Ko\v{s}ice, SK",
  publisher = "The University of Technology Ko\v{s}ice",
  ISBN = "978-80-8143-127-2",
  language = "english",
  url = "https://www.fit.vut.cz/research/publication/10435"
}