Publication Details
Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model
Deoras Anoop (JHU)
Mikolov Tomáš, Ing. (DCGM FIT BUT)
Kombrink Stefan, Dipl.-Inf.-Ling. (DCGM FIT BUT)
Church Kenneth (JHU)
Long-span language models; Recurrent neural networks; Speech recognition; Decoding
This paper deals with approximate inference: a sampling-based modeling technique to capture complex dependencies in a language model.
In this paper, we present strategies to incorporate long-context information directly during first-pass decoding and also during second-pass lattice re-scoring in speech recognition systems. Long-span language models that capture complex syntactic and/or semantic information are seldom used in the first pass of large-vocabulary continuous speech recognition systems due to the prohibitive increase in the size of the sentence-hypothesis search space. Typically, n-gram language models are used in the first pass to produce N-best lists, which are then re-scored using long-span models. Such a pipeline produces biased first-pass output, resulting in sub-optimal performance during re-scoring. In this paper we show that computationally tractable variational approximations of long-span and complex language models are a better choice than the standard n-gram model both for first-pass decoding and for lattice re-scoring.
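The core idea above, approximating a long-span model by sampling text from it and estimating n-gram statistics on the samples, can be sketched in miniature. The toy "long-span" model below (its vocabulary, probabilities, and function names are all invented for illustration, not the paper's actual RNN LM) conditions on the full sentence history; we draw a synthetic corpus from it and collect bigram counts, yielding a tractable model usable in a first-pass decoder.

```python
import random
from collections import Counter, defaultdict

# Hypothetical stand-in for a long-span LM: the next-word distribution
# depends on the entire history (here, only its length, for simplicity).
VOCAB = ["the", "cat", "sat", "mat", "</s>"]

def long_span_next(history, rng):
    # Longer histories increasingly prefer to end the sentence.
    p_end = min(0.9, 0.2 * len(history))
    if rng.random() < p_end:
        return "</s>"
    return rng.choice(VOCAB[:-1])

def sample_sentence(rng, max_len=10):
    history, w = [], None
    while w != "</s>" and len(history) < max_len:
        w = long_span_next(history, rng)
        history.append(w)
    return ["<s>"] + history

def approximate_bigram(n_sentences=5000, seed=0):
    """Sample a corpus from the long-span model and estimate a bigram model."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for _ in range(n_sentences):
        sent = sample_sentence(rng)
        for a, b in zip(sent, sent[1:]):
            counts[a][b] += 1
    # Normalize counts into conditional probabilities P(w | h).
    return {h: {w: c / sum(ctr.values()) for w, c in ctr.items()}
            for h, ctr in counts.items()}

bigram = approximate_bigram()
```

In the paper's setting the sampled corpus would come from an RNN LM and the estimated n-gram model (with proper smoothing, omitted here) would replace the standard n-gram model in the first-pass decoder.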
@ARTICLE{FITPUB10160,
  author    = "Anoop Deoras and Tom\'{a}\v{s} Mikolov and Stefan Kombrink and Kenneth Church",
  title     = "Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model",
  journal   = "Speech Communication",
  volume    = 2012,
  number    = 8,
  pages     = "1--16",
  year      = 2012,
  publisher = "Elsevier Science",
  ISSN      = "0167-6393",
  doi       = "10.1016/j.specom.2012.08.004",
  language  = "english",
  url       = "https://www.fit.vut.cz/research/publication/10160"
}