강화학습을 이용한 n-Queen 문제의 수렴속도 향상

강화학습을 이용한 n-Queen 문제의 수렴속도 향상

ㆍ 저자명: 임수연,손기준,박성배,이상조,Lim. SooYeon,Son. KiJun,Park. SeongBae,Lee. SangJo
ㆍ 간행물명: 퍼지 및 지능시스템학회 논문지
ㆍ 권/호정보: 2005년|15권 1호|pp.1-5 (5 pages)
ㆍ 발행정보: 한국지능시스템학회
ㆍ 파일정보: 정기간행물|
PDF텍스트
ㆍ 주제분야: 기타

이 논문은 한국과학기술정보연구원과 논문 연계를 통해 무료로 제공되는 원문입니다.

서지반출

기타언어초록

강화학습(Reinforcement-Learning)의 목적은 환경으로부터 주어지는 보상(reward)을 최대화하는 것이며, 강화학습 에이전트는 외부에 존재하는 환경과 시행착오를 통하여 상호작용하면서 학습한다 대표적인 강화학습 알고리즘인 Q-Learning은 시간 변화에 따른 적합도의 차이를 학습에 이용하는 TD-Learning의 한 종류로서 상태공간의 모든 상태-행동 쌍에 대한 평가 값을 반복 경험하여 최적의 전략을 얻는 방법이다. 본 논문에서는 강화학습을 적용하기 위한 예를 n-Queen 문제로 정하고, 문제풀이 알고리즘으로 Q-Learning을 사용하였다. n-Queen 문제를 해결하는 기존의 방법들과 제안한 방법을 비교 실험한 격과, 강화학습을 이용한 방법이 목표에 도달하기 위한 상태전이의 수를 줄여줌으로써 최적 해에 수련하는 속도가 더욱 빠름을 알 수 있었다.

기타언어초록

The purpose of reinforcement learning is to maximize rewards from environment, and reinforcement learning agents learn by interacting with external environment through trial and error. Q-Learning, a representative reinforcement learning algorithm, is a type of TD-learning that exploits difference in suitability according to the change of time in learning. The method obtains the optimal policy through repeated experience of evaluation of all state-action pairs in the state space. This study chose n-Queen problem as an example, to which we apply reinforcement learning, and used Q-Learning as a problem solving algorithm. This study compared the proposed method using reinforcement learning with existing methods for solving n-Queen problem and found that the proposed method improves the convergence rate to the optimal solution by reducing the number of state transitions to reach the goal.

키워드

강화학습 에이전트 보상 상태전이 Q-Learning

다운URL