Thesis Details

Speaker Verification without Feature Extraction

Master's Thesis
Student: Lukáč Peter
Academic Year: 2020/2021
Supervisor: Mošner Ladislav, Ing.
Czech title
Verifikace osob podle hlasu bez extrakce příznaků
Language
English
Abstract

Speaker verification is a field whose state of the art (SotA) is still advancing as it tries to meet the demands of speaker authentication systems, forensic applications, and similar uses. Progress is driven by advances in deep learning, the creation of new training and testing datasets, and various speaker recognition challenges and speech workshops. In this thesis, we explore models for speaker verification without feature extraction. Feeding the models raw speech waveforms simplifies the system pipeline, which saves computational and memory resources and reduces the number of hyperparameters, since the ones needed to create features from waveforms, which affect the results, are no longer required. Currently, models without feature extraction do not match the performance of models with feature extraction. By applying various techniques to these models, we try to improve on the baseline performance of current models without feature extraction. Experiments with SotA techniques improved the performance of a model without feature extraction considerably, but it still did not reach the performance of a SotA model with feature extraction. The improvement is nevertheless large enough that the improved model can be used in a fusion with a feature-extraction model. We also discuss the experimental results and propose improvements that aim to address the limitations we discovered.

Keywords

speaker verification, featureless extraction, speaker embedding, residual network, RawNet, VoxCeleb1, VoxCeleb2, VoxSRC, Feature Map Scaling, SincNet, Additive Angular Margin loss, fusion

Department
Degree Programme
Information Technology and Artificial Intelligence, Specialization Sound, Speech and Natural Language Processing
Status
defended, grade A
Date
24 June 2021
Reviewer
Committee
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT), chair
Bařina David, Ing., Ph.D. (DCGM FIT BUT), member
Beran Vítězslav, doc. Ing., Ph.D. (DCGM FIT BUT), member
Herout Adam, prof. Ing., Ph.D. (DCGM FIT BUT), member
Lengál Ondřej, Ing., Ph.D. (DITS FIT BUT), member
Zemčík Pavel, prof. Dr. Ing. (DCGM FIT BUT), member
Citation
LUKÁČ, Peter. Speaker Verification without Feature Extraction. Brno, 2021. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2021-06-24. Supervised by Mošner Ladislav. Available from: https://www.fit.vut.cz/study/thesis/23746/
BibTeX
@mastersthesis{FITMT23746,
    author = "Peter Luk\'{a}\v{c}",
    type = "Master's thesis",
    title = "Speaker Verification without Feature Extraction",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2021,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/23746/"
}