Worked examples in MATLAB that solve a Markov decision process (MDP) model of the multi-period newsvendor (newsboy) problem using three methods: value iteration, policy iteration, and reinforcement learning.
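
As a rough illustration of the value iteration approach, here is a minimal MATLAB sketch. It is not taken from this repository. The capacity, demand distribution, prices, costs, and discount factor are all hypothetical, and it assumes an infinite-horizon discounted model in which unmet demand is lost.

```matlab
% Value iteration for a small inventory MDP (all parameters are hypothetical).
M        = 5;                  % storage capacity; states are inventory levels 0..M
pmf      = [0.2 0.3 0.3 0.2];  % assumed P(demand = 0), ..., P(demand = 3)
price    = 5;                  % revenue per unit sold
cost     = 2;                  % purchase cost per unit ordered
holdCost = 0.5;                % holding cost per unsold unit
discount = 0.9;                % discount factor
tol      = 1e-8;               % stopping tolerance on the value update

demand = 0:numel(pmf)-1;       % demand outcomes matching pmf
V      = zeros(1, M+1);        % V(s+1): value of starting a period with s units
policy = zeros(1, M+1);        % policy(s+1): order quantity chosen at level s

while true
    Vnew = zeros(1, M+1);
    for s = 0:M
        best = -inf;
        for a = 0:(M - s)                  % cannot order beyond capacity
            y    = s + a;                  % stock on hand after ordering
            sold = min(y, demand);         % unmet demand is lost
            left = y - sold;               % inventory carried to the next period
            % Expected one-period profit plus discounted future value
            q = sum(pmf .* (price*sold - holdCost*left + discount*V(left+1))) - cost*a;
            if q > best
                best = q;
                policy(s+1) = a;
            end
        end
        Vnew(s+1) = best;
    end
    delta = max(abs(Vnew - V));
    V = Vnew;
    if delta < tol
        break
    end
end

disp(policy)                   % order quantity for each inventory level 0..M
```

Policy iteration would reuse the same expected-profit expression, alternating between evaluating a fixed policy and improving it. A reinforcement learning variant such as Q-learning would instead estimate the values from sampled demand rather than from `pmf` directly.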