Publication Details
Joint Energy-Based Model for Robust Speech Classification System against Dirty-Label Backdoor Poisoning Attacks
Joshi Sonal
Li Henry
Thebaud Thomas
Villalba Lopez Jesus Antonio (JHU)
Khudanpur Sanjeev (JHU)
Dehak Najim (JHU)
joint energy-based model, poisoning attacks, speech commands classification, dirty-label backdoor
We propose a technique based on a Joint Energy-based Model (JEM), which combines discriminative and generative modeling, to increase resistance against dirty-label backdoor attacks. The approach is especially effective when the trigger is short or barely perceptible. We simulate the attack on the Speech Commands Dataset, which consists of 1 s audio clips. During training, we use JEM to model a view of the input, implemented as a randomly selected 610 ms window. During inference, we combine all 40 possible views using the generative part of JEM. The resulting system is slightly less accurate but significantly more resistant across multiple scenarios. Interestingly, replacing JEM with a standard discriminative model (Disc) also increases resistance, though less than JEM does, while maintaining accuracy. We introduce an extension motivated by semi-supervised training that further improves JEM but not Disc. JEM can also benefit from Gaussian noise added during evaluation.
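The view mechanism in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: a 10 ms frame shift (so a 1 s clip gives 100 frames and a 610 ms view spans 61 frames, giving 100 - 61 + 1 = 40 start positions, consistent with the 40 views mentioned), the standard JEM reading of classifier logits in which the log-sum-exp of the logits is an unnormalized log-density, and a combination rule that weights each view's class posterior by that density. The function names `enumerate_views` and `combine_views` are hypothetical.

```python
import numpy as np

def enumerate_views(features, view_len):
    """Return every contiguous window of `view_len` frames (stride of one frame)."""
    n_frames = features.shape[0]
    starts = range(n_frames - view_len + 1)
    return np.stack([features[s:s + view_len] for s in starts])

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along `axis`."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def combine_views(logits):
    """Combine per-view logits of shape (n_views, n_classes) into one posterior.

    Assumption: each view's posterior is weighted by its unnormalized
    density exp(logsumexp(logits)), normalized over the views.
    """
    log_px = logsumexp(logits, axis=1)              # unnormalized log p(view)
    posteriors = np.exp(logits - log_px[:, None])   # p(y | view), rows sum to 1
    weights = np.exp(log_px - logsumexp(log_px, axis=0))  # softmax over views
    return weights @ posteriors

# Toy usage: 100 frames of 40-dimensional features, 61-frame views.
features = np.zeros((100, 40))
views = enumerate_views(features, view_len=61)
print(views.shape)  # (40, 61, 40)
```

Under this assumed rule, a view to which the model assigns low density contributes little to the final decision; whether this matches the combination used in the paper is not stated in the abstract.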
@INPROCEEDINGS{FITPUB13087,
   author = "Martin \v{S}\r{u}stek and Sonal Joshi and Henry Li and Thomas Thebaud and Antonio Jesus Lopez Villalba and Sanjeev Khudanpur and Najim Dehak",
   title = "Joint Energy-Based Model for Robust Speech Classification System against Dirty-Label Backdoor Poisoning Attacks",
   pages = "1--8",
   booktitle = "Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)",
   year = 2023,
   location = "Taipei, TW",
   publisher = "IEEE Signal Processing Society",
   ISBN = "979-8-3503-0689-7",
   doi = "10.1109/ASRU57964.2023.10389697",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13087"
}