Course details
Knowledge Discovery in Databases
ZZN Acad. year 2012/2013 Winter semester 5 credits
Basic concepts of knowledge discovery in data; the relation between knowledge discovery and data mining. Data sources for knowledge discovery. Principles and techniques of data preprocessing for mining. Systems for knowledge discovery in data, data mining query languages. Data mining techniques: association rules, classification and prediction, clustering. Mining unconventional data: data streams, time series and sequences, graphs, spatial and spatio-temporal data, multimedia. Text and web mining. Completing a data mining project using an available data mining tool.
Guarantor
Language of instruction
Completion
Time span
- 39 hrs lectures
- 13 hrs projects
Department
Subject specific learning outcomes and competences
Students get a broad, yet in-depth overview of the field of data mining and knowledge discovery. They are able both to use and to develop knowledge discovery tools.
Learning objectives
To familiarize students with knowledge discovery in data sources, to explain useful knowledge types and the steps of the knowledge discovery process, and to familiarize them with techniques, algorithms and tools used in the process.
Prerequisite knowledge and skills
There are no prerequisites.
Fundamental literature
- Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Third Edition. Morgan Kaufmann Publishers, 2012, 703 p., ISBN 978-0-12-381479-1.
- Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Second Edition. Elsevier Inc., 2006, 770 p., ISBN 1-55860-901-3.
Syllabus of lectures
- Introduction - motivation, fundamental concepts, data source and knowledge types.
- Data Warehouse and OLAP Technology for knowledge discovery.
- Data Preparation - methods.
- Data Preparation - characteristics of data.
- Mining frequent patterns and associations - basic concepts, efficient and scalable frequent itemset mining methods.
- Multi-level association rules, association mining and correlation analysis, constraint-based association rules.
- Classification and prediction - basic concepts, decision tree, Bayesian classification, rule-based classification.
- Classification by means of neural networks, SVM classifier, other classification methods, prediction.
- Cluster analysis - basic concepts, types of data in cluster analysis, partitioning and hierarchical methods. Other clustering methods.
- Introduction to mining data streams, time series and sequence data.
- Introduction to mining in graphs, spatio-temporal data and moving object data.
- Mining in biological data.
- Text mining, mining the Web.
Progress assessment
Duty credit requires completing the project and obtaining at least 24 points for activities during the semester.
Controlled instruction
A mid-term test, formulation of a data mining task, presentation of the project. At least 20 points must be obtained from the final exam; otherwise, the student is assigned no points.
Course inclusion in study plans