송.수신 이메일의 학습을 통해 긍정 오류를 줄이는 개선된 베이지안 필터링 기법

송.수신 이메일의 학습을 통해 긍정 오류를 줄이는 개선된 베이지안 필터링 기법

ㆍ 저자명: 김두환,유종덕,정수환,Kim. Doo-Hwan,You. Jong-Duck,Jung. Sou-Hwan
ㆍ 간행물명: 情報保護學會論文誌
ㆍ 권/호정보: 2008년|18권 2호|pp.129-137 (9 pages)
ㆍ 발행정보: 한국정보보호학회
ㆍ 파일정보: 정기간행물|
PDF텍스트
ㆍ 주제분야: 기타

이 논문은 한국과학기술정보연구원과 논문 연계를 통해 무료로 제공되는 원문입니다.

서지반출

기타언어초록

본 논문에서는 기존의 베이지안 필터링 방식에서 발생하는 긍정 오류를 줄이기 위한 개선된 베이지안 필터링 기법을 제안한다. 기존의 베이지안 필터링 방식에서는 이메일 서버에서 학습한 DB를 일괄적으로 개별 사용자들에게 적용한다. 또한 수신 이메일 위주의 학습 방식은 양질의 정상 DB를 학습하는데 어려움을 준다. 이러한 문제로 인해 기존의 베이지안 필터링 기법에서는 정상 이메일을 스팸 이메일로 판단하는 긍정 오류가 발생한다. 제안 기법에서는 사용자의 송신 이메일을 양질의 정상 DB 정보로 판단하여 베이지안 정상 DB에 자동으로 학습한다. 뿐만 아니라 개별 사용자에게 독립적인 베이지안 DB를 제공하여 사용자 개개인의 이메일 송 수신 특성을 고려한 필터링 서비스를 제공한다. 제안 기법은 기존의 베이지안 필터링 기법보다 필터링의 정확성에서 평균 3.13% 향상된 결과를 보인다.

기타언어초록

In this paper, we propose an improved Bayesian Filtering mechanism to reduce the False Positives that occurs in the existing Bayesian Filtering mechanism. In the existing Bayesian Filtering mechanism, the same Bayesian Filtering DB trained at the e-mail server is applied to each e-mail user. Also, the training method using receiving e-mails only could not provide the high quality of ham DB. Due to these problems, the existing Bayesian Filtering mechanism can produce the False Positives which misclassify the ham e-mails into the spam e-mails. In the proposed mechanism, the sending e-mails of the user are treated as the high quality of ham information, and are trained to the Bayesian ham DB automatically. In addition, by providing a different Bayesian DB to each e-mail user respectively, more efficient e-mail filtering service is possible. Our experiments show the improvement of filtering accuracy by 3.13%, compared to the existing Bayesian Filtering mechanism.

키워드

Bayesian Filtering Mechanism False Positives Ham DB Sending E-mails Train

다운URL