Publication Details

SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels

KIŠŠ Martin, HRADIŠ Michal, BENEŠ Karel, BUCHAL Petr and KULA Michal. SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels. International Journal on Document Analysis and Recognition (IJDAR), vol. 2024, no. 27, 2023, pp. 177-193. ISSN 1433-2825. Available from: https://link.springer.com/article/10.1007/s10032-023-00452-9
Czech title
SoftCTC - semi-supervised učení pro rozpoznávání textu pomocí měkkých pseudo-labelů
Type
journal article
Language
English
Authors
Kišš Martin, Ing. (DCGM FIT BUT)
Hradiš Michal, Ing., Ph.D. (DCGM FIT BUT)
Beneš Karel, Ing. (DCGM FIT BUT)
Buchal Petr, Ing. (DCGM FIT BUT)
Kula Michal, Ing., Ph.D. (DCGM FIT BUT)
URL
https://link.springer.com/article/10.1007/s10032-023-00452-9
Keywords

CTC, SoftCTC, OCR, Text recognition, Confusion networks

Abstract

This paper explores semi-supervised training for sequence tasks, such as optical character recognition or automatic speech recognition. We propose a novel loss function, SoftCTC, which is an extension of CTC that allows multiple transcription variants to be considered at the same time. This makes it possible to omit the confidence-based filtering step, which is otherwise a crucial component of pseudo-labeling approaches to semi-supervised learning. We demonstrate the effectiveness of our method on a challenging handwriting recognition task and conclude that SoftCTC matches the performance of a finely tuned filtering-based pipeline. We also evaluated SoftCTC in terms of computational efficiency, concluding that it is significantly more efficient than a naive CTC-based approach for training on multiple transcription variants, and we make our GPU implementation public.

Published
2023
Pages
177-193
Journal
International Journal on Document Analysis and Recognition (IJDAR), vol. 2024, no. 27, ISSN 1433-2825
Book
International Journal on Document Analysis and Recognition
Publisher
Springer Verlag
DOI
10.1007/s10032-023-00452-9
UT WoS
001118969400001
EID Scopus
BibTeX
@ARTICLE{FITPUB12904,
   author = "Martin Ki\v{s}\v{s} and Michal Hradi\v{s} and Karel Bene\v{s} and Petr Buchal and Michal Kula",
   title = "SoftCTC-semi-supervised learning for text recognition using soft pseudo-labels",
   pages = "177--193",
   booktitle = "International Journal on Document Analysis and Recognition",
   journal = "International Journal on Document Analysis and Recognition (IJDAR)",
   volume = 2024,
   number = 27,
   year = 2023,
   ISSN = "1433-2825",
   doi = "10.1007/s10032-023-00452-9",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12904"
}