Publication Details

Residual Memory Networks in Language Modeling: Improving the Reputation of Feed-Forward Networks

BENEŠ Karel, BASKAR Murali K. and BURGET Lukáš. Residual Memory Networks in Language Modeling: Improving the Reputation of Feed-Forward Networks. In: Proceedings of Interspeech 2017. Stockholm: International Speech Communication Association, 2017, pp. 284-288. ISSN 1990-9772. Available from: http://www.isca-speech.org/archive/Interspeech_2017/pdfs/1442.PDF

Czech title

Sítě s reziduální pamětí pro jazykové modelování: zlepšení reputace dopředných sítí

Type

conference paper

Language

English

Authors

Beneš Karel, Ing. (DCGM FIT BUT)
Baskar Murali K. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)

Keywords

residual memory networks, feed-forward networks, language modeling

Abstract

The paper describes the use of residual memory networks in language modeling, with the aim of improving the reputation of feed-forward networks.

Annotation

We introduce the Residual Memory Network (RMN) architecture to language modeling. RMN is a feed-forward neural network architecture that incorporates residual connections and time-delay connections, which allow us to naturally incorporate information from a substantial time context. As this is the first time RMNs are applied to language modeling, we thoroughly investigate their behaviour on the well-studied Penn Treebank corpus. We change the model slightly for the needs of language modeling, reducing both its time and memory consumption. Our results show that RMN is a suitable choice for small-sized neural language models: with a test perplexity of 112.7 and as few as 2.3M parameters, it outperforms both a much larger vanilla RNN (PPL 124, 8M parameters) and a similarly sized LSTM (PPL 115, 2.08M parameters), while being less than 3 perplexity points worse than an LSTM twice as large.

Published

2017

Pages

284-288

Journal

Proceedings of Interspeech - on-line, vol. 2017, no. 8, ISSN 1990-9772

Proceedings

Proceedings of Interspeech 2017

Conference

Interspeech Conference, Stockholm, SE

Publisher

International Speech Communication Association

Place

Stockholm, SE

DOI

10.21437/Interspeech.2017-1442

UT WoS

000457505000058

EID Scopus

2-s2.0-85039150529

BibTeX

@INPROCEEDINGS{FITPUB11578,
   author = "Karel Bene\v{s} and Murali K. Baskar and Luk\'{a}\v{s} Burget",
   title = "Residual Memory Networks in Language Modeling: Improving the Reputation of Feed-Forward Networks",
   pages = "284--288",
   booktitle = "Proceedings of Interspeech 2017",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2017,
   number = 08,
   year = 2017,
   location = "Stockholm, SE",
   publisher = "International Speech Communication Association",
   ISSN = "1990-9772",
   doi = "10.21437/Interspeech.2017-1442",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11578"
}