Thesis Details
Active Learning for Text Recognition (Aktivní učení pro rozpoznávání textu)
The aim of this Master's thesis is to design active learning methods and to experiment with them on datasets of historical documents. The experiments use IMPACT, a large and diverse dataset of more than one million text lines. I use neural networks to check the readability of lines and the correctness of their annotations. First, I compare convolutional neural network architectures with recurrent neural network architectures that use a bidirectional LSTM layer. Next, I study different ways of training neural networks with active learning methods. I mainly use active learning to adapt neural networks to documents that are not represented in the original training dataset, so active learning serves to select appropriate adaptation data. Convolutional neural networks achieve 98.6% accuracy and recurrent neural networks achieve 99.5% accuracy. Active learning reduces the error by 26% compared with random selection of adaptation data.
Active learning, text recognition, neural networks, convolutional neural networks, recurrent neural networks, IMPACT dataset
Beran Vítězslav, doc. Ing., Ph.D. (DCGM FIT BUT), member
Horák Aleš, doc. RNDr., Ph.D. (FI MUNI), member
Hrubý Martin, Ing., Ph.D. (DITS FIT BUT), member
Janoušek Vladimír, doc. Ing., Ph.D. (DITS FIT BUT), member
Rozman Jaroslav, Ing., Ph.D. (DITS FIT BUT), member
@mastersthesis{FITMT22021,
   author = "Jan Koh\'{u}t",
   type = "Master's thesis",
   title = "Aktivn\'{i} u\v{c}en\'{i} pro rozpozn\'{a}v\'{a}n\'{i} textu",
   school = "Brno University of Technology, Faculty of Information Technology",
   year = 2019,
   location = "Brno, CZ",
   language = "czech",
   url = "https://www.fit.vut.cz/study/thesis/22021/"
}