Document Type
Article
Publication Date
Spring 4-8-2009
Abstract
This article surveys recent research in the area of language modeling (sometimes called statistical language modeling) approaches to information retrieval. Language modeling is a formal probabilistic retrieval framework with roots in speech recognition and natural language processing. The underlying assumption of language modeling is that human language generation is a random process; the goal is to model that process via a generative statistical model.
In this article, we discuss current research in the application of language modeling to information retrieval, the role of semantics in the language modeling framework, cluster-based language models, the use of language modeling for XML retrieval, and future trends.
Recommended Citation
Banerjee, P., & Han, H. (2009). Language modeling approaches to information retrieval. JCSE, 3(3), 143-164.
Comments
The version of record is available at http://jcse.kiise.org/posting/3-3/jcse_3-3_48.pdf. Published by the Korean Institute of Information Scientists and Engineers (KIISE). Copyright © 2009, KIISE and the authors. Creative Commons Attribution License.