Journal name: The Sixth International Conference on Intelligent Human-Machine Systems and Cybernetics

Type: EI

Application time: 2014-8-23 00:00

Partially observable Markov decision processes (POMDPs) are widely used in many fields. Solving a POMDP suffers from prohibitive computational complexity due to the curse of dimensionality, and Monte Carlo value iteration (MCVI) is regarded as a promising approach to breaking this curse. Although MCVI is a major step toward solving this problem, it still has defects, such as a slow convergence rate and continuous growth in the number of nodes in the policy graph. This paper therefore proposes a fast MCVI based on an improved NSGA2. Unlike the general NSGA2, the improved NSGA2 initializes the population using experiential knowledge and uses self-adjusting values as the crossover and mutation probabilities. Before executing MCVI, the algorithm sets a series of thresholds. When a temporary policy graph reaches one of the thresholds, the algorithm uses a discount operator to update that threshold and uses the improved NSGA2 to update the policy graph. It then executes MCVI again and repeats this process until termination. Numerical experiments on the classic corridor problem show that the fast MCVI achieves about an 8% increase in convergence rate over the original MCVI and about a 60% decrease in the number of policy graph nodes.
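The alternation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say what quantity the thresholds measure or what form the discount operator takes, so this sketch assumes thresholds on the node count of the policy graph and a multiplicative discount. The names `fast_mcvi_loop`, `backup`, `compress`, `toy_backup`, and `toy_compress` are hypothetical, and the toy functions only stand in for an MCVI backup and for the improved NSGA2 update.

```python
from typing import Callable, List


def fast_mcvi_loop(
    graph: List[int],
    backup: Callable[[List[int]], List[int]],
    compress: Callable[[List[int]], List[int]],
    thresholds: List[float],
    discount: float,
    max_iters: int,
) -> List[int]:
    """Alternate backups with a compression step (hypothetical sketch)."""
    thresholds = list(thresholds)  # copy so the caller's list is untouched
    for _ in range(max_iters):
        graph = backup(graph)  # stand-in for one MCVI backup
        for i, t in enumerate(thresholds):
            # Assumption: a threshold is "reached" when the node count meets it.
            if len(graph) >= t:
                # Assumption: the discount operator is a multiplication.
                thresholds[i] = t * discount
                graph = compress(graph)  # stand-in for the improved NSGA2
                break
    return graph


def toy_backup(graph: List[int]) -> List[int]:
    # Toy stand-in: each backup adds one node to the policy graph.
    return graph + [len(graph)]


def toy_compress(graph: List[int]) -> List[int]:
    # Toy stand-in: keep every other node.
    return graph[::2]


result = fast_mcvi_loop([], toy_backup, toy_compress, [4.0], 0.5, 6)
uncompressed = fast_mcvi_loop([], toy_backup, toy_compress, [], 0.5, 6)
```

In this toy run the compressed graph ends up smaller than the uncompressed one, which mirrors only the qualitative claim about node count; the numbers have no relation to the reported 8% and 60% figures.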
