Using Ontologies for Semantic Text Mining

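ㆍ Authors: Yu Eun-Ji, Kim Jung-Chul, Lee Choon-Youl, Kim Nam-Gyu
ㆍ Journal: 정보시스템연구 (Journal of Information Systems)
ㆍ Volume/Issue: 2012 | Vol. 21, No. 3 | pp. 137-161 (25 pages)
ㆍ Publisher: 한국정보시스템학회 (Korea Association of Information Systems)
ㆍ File information: Periodical | PDF text
ㆍ Subject area: Other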

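The full text of this paper is provided free of charge through a paper-linking arrangement with the Korea Institute of Science and Technology Information (KISTI).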

Abstract

The growing interest in big data analysis using various data mining techniques means that many commercial data mining tools now need to include fundamental text analysis modules. The most essential prerequisite for accurate analysis of text documents is an understanding of the exact semantics of each term in a document. The main difficulties in understanding these semantics are attributable to homonym and synonym problems, which are long-standing problems in the natural language processing field. Some major text mining tools provide a thesaurus to address these problems, but a thesaurus cannot resolve complex synonym problems. Furthermore, a thesaurus has no bearing on homonym problems and hence cannot solve them. In this paper, we propose a semantic text mining methodology that uses ontologies to improve the quality of text mining results by resolving the semantic ambiguity caused by homonym and synonym problems. We evaluate the practical applicability of the proposed methodology by performing a classification analysis to predict customer churn using real transactional data and Q&A articles from the "S" online shopping mall in Korea. The experiments showed that the prediction model produced by the proposed semantic text mining method outperformed the model produced by traditional text mining on prediction accuracy measures such as response, captured response, and lift.
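
The idea of resolving synonyms and homonyms through an ontology before counting terms can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy ontology, its concept names and cue words, the `to_concepts` function, and the rule of picking the sense whose cue words overlap most with the document are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy ontology (illustration only, not taken from the paper).
# "terms" are surface forms that map to the concept; "cues" are context
# words used to choose a sense when one term belongs to several concepts.
ONTOLOGY = {
    "Delivery": {
        "terms": {"delivery", "shipping", "shipment"},
        "cues": {"late", "courier", "tracking"},
    },
    "Fee": {
        "terms": {"charge", "fee"},
        "cues": {"card", "payment", "extra", "billed"},
    },
    "BatteryCharge": {
        "terms": {"charge", "recharge"},
        "cues": {"battery", "phone", "cable"},
    },
}


def to_concepts(tokens):
    """Replace each token with an ontology concept where one applies."""
    context = set(tokens)
    result = []
    for tok in tokens:
        candidates = [name for name, c in ONTOLOGY.items() if tok in c["terms"]]
        if not candidates:
            # Term is not in the ontology: keep it unchanged.
            result.append(tok)
        elif len(candidates) == 1:
            # Synonym case: several surface terms collapse to one concept.
            result.append(candidates[0])
        else:
            # Homonym case: pick the sense with the most cue words in the document.
            best = max(candidates, key=lambda n: len(ONTOLOGY[n]["cues"] & context))
            result.append(best)
    return result


doc_a = "extra charge on my card for shipping".split()
doc_b = "phone battery will not charge after delivery".split()

concepts_a = to_concepts(doc_a)
concepts_b = to_concepts(doc_b)

# Concept frequencies across both documents, as input to a downstream classifier.
concept_counts = Counter(concepts_a + concepts_b)
```

In this sketch, "shipping" and "delivery" are counted as the same concept, while "charge" is counted as a fee in the first document and as battery charging in the second, which a thesaurus-style synonym list alone would not distinguish.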

Keywords

Classification, Data Mining, Ontology, Semantic Text Mining