Adaptation of N-gram Language Models Using Information Retrieval Techniques and a Dynamic Interpolation Coefficient

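ㆍ Authors: Choi, Joon Ki (최준기); Oh, Yung-Hwan (오영환)
ㆍ Journal: 말소리 (Malsori)
ㆍ Volume/Issue: 2005 | Vol. 56, No. 1 | pp. 207-223 (17 pages)
ㆍ Publisher: 대한음성학회 (The Korean Society of Phonetic Sciences and Speech Technology)
ㆍ File information: Periodical | PDF text
ㆍ Subject area: Other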

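The full text of this paper is provided free of charge through a paper-linking arrangement with the Korea Institute of Science and Technology Information (KISTI).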

Abstract

The goal of language model adaptation is to improve a background language model using a relatively small adaptation corpus. This study presents a language model adaptation technique for the case where no additional text data are available for adaptation. We propose combining an information retrieval (IR) technique with N-gram language modeling to collect the adaptation corpus from the baseline text data. We also propose a dynamic interpolation coefficient for combining the background language model with the adapted language model. The coefficient is estimated from word hypotheses obtained by segmenting the input speech data, which is reserved as held-out validation data. This allows the final adapted model to improve on the background model consistently. The proposed approach reduces the word error rate by $13.6\%$ relative to the baseline 4-gram model on a two-hour broadcast news speech recognition task.
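
The linear interpolation described in the abstract can be illustrated with a minimal sketch. It assumes the coefficient is fitted by the standard EM update for mixture weights, applied to per-word probabilities that the two models assign to a set of word hypotheses. The abstract does not state the exact estimation procedure, so the EM update, the function names `estimate_lambda` and `interpolate`, and the input format are illustrative assumptions rather than the authors' implementation.

```python
from typing import Sequence


def interpolate(p_background: float, p_adapted: float, lam: float) -> float:
    """Linear interpolation: lam weights the adapted model."""
    return lam * p_adapted + (1.0 - lam) * p_background


def estimate_lambda(
    p_background: Sequence[float],
    p_adapted: Sequence[float],
    lam: float = 0.5,
    iterations: int = 50,
    tol: float = 1e-6,
) -> float:
    """Fit the interpolation weight with EM (an assumed, standard choice).

    p_background[i] and p_adapted[i] are the probabilities each model
    assigns to the i-th hypothesized word given its N-gram history.
    """
    if len(p_background) != len(p_adapted) or not p_background:
        raise ValueError("need two non-empty sequences of equal length")
    if any(p <= 0.0 for p in list(p_background) + list(p_adapted)):
        raise ValueError("probabilities must be positive (smoothed models)")

    for _ in range(iterations):
        # E-step: posterior that each word came from the adapted model.
        posteriors = [
            lam * pa / interpolate(pb, pa, lam)
            for pb, pa in zip(p_background, p_adapted)
        ]
        # M-step: the new weight is the mean posterior.
        new_lam = sum(posteriors) / len(posteriors)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam
```

Under these assumptions, `estimate_lambda` would be re-run for each block of recognizer hypotheses, which is what makes the coefficient dynamic. The resulting weight is then passed to `interpolate` when rescoring with the combined model.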

Keywords

Language model adaptation; Language model adaptation corpus; Dynamic interpolation coefficient; Linear interpolation; Speech recognition
