Course details

Data Analysis and Visualization in Python

IZV Acad. year 2024/2025 Winter semester 4 credits

The aim of the course is to acquaint students with data acquisition, processing, analysis and visualization using the cross-platform scripting language Python. Python has a mature ecosystem offering a rich range of extension libraries, either as native Python code or, where performance matters, as efficient extensions implemented in C/C++.
During the lectures, students will learn Python constructs, methods of data acquisition, storage and manipulation, advanced numerical and symbolic computation, and visualization of the acquired data. Students will also gain an overview of techniques for advanced analysis of data dependencies, their properties, and their application to various kinds of data. Finally, the course covers extending Python with custom extensions and techniques that effectively overcome the disadvantages of an interpreted language in performance-oriented applications. In the practical part (project), students will go through all stages of processing a large data set: from design, through processing, to subsequent analysis and visualization.

Exam prerequisites
Minimum of 50 points earned. At least 2 points from each part of the project.

Guarantor

Course coordinator

Language of instruction

Czech

Completion

Classified Credit

Time span

  • 26 hrs lectures
  • 13 hrs projects

Assessment points

  • 100 pts projects

Department

Lecturer

Instructor

Learning objectives

The aim of the course is to acquaint students with data acquisition, processing and analysis in Python. Python will be introduced as a tool for efficient data manipulation.
Students will gain a general overview of basic and advanced methods of data analysis and of basic and advanced aspects of Python, which they will learn to use with modern mathematical libraries and libraries for advanced data analysis and modeling. They will understand in general terms how the techniques implemented in these libraries work and learn which technique is appropriate for which data.
In addition to general knowledge of basic data processing techniques, students will gain an overview of efficient execution of performance-critical parts of a program, of extending the language with custom modules written in C/C++, and of installing libraries in isolated environments or containers.
At the end of the course, students should understand how to efficiently obtain, analyze and visualize data sets of various sizes. This knowledge can then be used to solve non-trivial engineering and scientific tasks or to evaluate data for management and decision-making purposes.

Recommended prerequisites

Prerequisite knowledge and skills

Basic knowledge of imperative programming and algorithm design, and of basic concepts and operations from linear algebra (vectors, matrix operations, linear operations, etc.) and statistics.

Study literature

  • Mark Pilgrim: Ponořme se do Pythonu 3 (ISBN: 978-80-904248-2-1, available online)
  • Jake VanderPlas: Python Data Science Handbook (ISBN: 978-1-491-91205-8, available in the library's online resources)
  • Samir Madhavan: Mastering Python for Data Science (ISBN: 978-17-843901-5-0)
  • Robert Johansson: Numerical Python (2019, ISBN: 978-1-4842-4245-2)

Syllabus of lectures

  1. Introduction to Language I
  2. Introduction to Language II
  3. Data acquisition and data persistence
  4. Efficient implementation of operations on n-dimensional arrays
  5. Tools for advanced data manipulation
  6. Basic approaches to data visualization
  7. Basic methods of data and data dependency analysis
  8. Advanced approaches to data visualization
  9. Advanced methods of data and data dependency analysis
  10. Working with image data and options for data presentation
  11. Advanced operations on time series
  12. Symbolic computation
  13. Code acceleration options for HPC needs

Syllabus - others, projects and individual work of students

The aim of the project is to create a script that obtains data from publicly available sources, analyzes it and presents it in the form of a report. Project evaluation will take into account the quality of the code, the resulting analysis and the generated reports.
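
The three stages named above (obtaining data, analyzing it, presenting a report) can be sketched as a small pipeline. This is only a minimal illustration, not the assignment itself: the function names `acquire`, `analyze` and `report`, the column names, and the inline sample data are all made up for this sketch, and an in-memory CSV string stands in for a file downloaded from a public source.

```python
import csv
import io
import statistics

# Made-up sample standing in for a file downloaded from a public source.
SAMPLE_CSV = """region,value
north,10
south,14
north,12
south,18
"""


def acquire(text):
    """Parse CSV text into a list of row dictionaries (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text)))


def analyze(rows):
    """Compute the mean of `value` for each `region` (hypothetical helper)."""
    groups = {}
    for row in rows:
        groups.setdefault(row["region"], []).append(float(row["value"]))
    return {region: statistics.mean(values) for region, values in groups.items()}


def report(summary):
    """Render the summary as a small plain-text report (hypothetical helper)."""
    lines = ["Mean value per region"]
    for region, mean in sorted(summary.items()):
        lines.append(f"{region}: {mean:.1f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(analyze(acquire(SAMPLE_CSV))))
```

Keeping each stage in its own function means the acquisition step could later be swapped for a real download without touching the analysis or the report code.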

Progress assessment

The evaluation is based on an individually completed project consisting of three parts (data acquisition, data preprocessing and analysis, report generation). Each part is evaluated separately. Students will receive feedback on their work, which they will incorporate into the final solution. The first two parts are submitted during the semester and can earn up to 20 points each. Up to 60 points can be earned for the final solution.
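
The point arithmetic can be checked with a few lines of Python. The sketch below combines the limits stated here (20 + 20 + 60 points) with the minimums listed under "Exam prerequisites" (at least 50 points in total and at least 2 points from each part). The function name `meets_requirements` is hypothetical, and treating the final solution as the third scored part is an assumption of this sketch.

```python
# Point limits as stated in the course description: two interim parts worth
# up to 20 points each and a final solution worth up to 60 points.
# Assumption: the final solution counts as the third scored part.
MAX_POINTS = (20, 20, 60)
MIN_TOTAL = 50      # minimum total number of points
MIN_PER_PART = 2    # minimum number of points from each part


def meets_requirements(points):
    """Return True if the three part scores satisfy both stated minimums.

    `points` is a sequence of three numbers in the order
    (first part, second part, final solution).
    This helper is hypothetical and not part of any course tooling.
    """
    if len(points) != len(MAX_POINTS):
        raise ValueError("expected exactly three part scores")
    for earned, limit in zip(points, MAX_POINTS):
        if not 0 <= earned <= limit:
            raise ValueError(f"score {earned} is outside the range 0..{limit}")
    return sum(points) >= MIN_TOTAL and min(points) >= MIN_PER_PART


if __name__ == "__main__":
    print(meets_requirements((15, 12, 30)))  # 57 points, every part >= 2: True
    print(meets_requirements((20, 1, 40)))   # 61 points, but one part < 2: False
```

Note that a high total alone is not sufficient: the second example exceeds 50 points but fails the per-part minimum.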

Schedule

Day  Type     Weeks                                     Room       Start  End    Capacity  Lect.grp  Groups  Info
Tue  lecture  1., 2., 3., 4., 7., 12., 13. of lectures  E104 E112  08:00  09:50  224       3BIT      xx      Vašíček
Tue  lecture  5., 6., 8., 9., 10., 11. of lectures      E104 E112  08:00  09:50  224       3BIT      xx      Mrázek

Course inclusion in study plans

  • Programme BIT, 3rd year of study, Elective
  • Programme BIT (in English), 3rd year of study, Elective