This paper studies the multi-item stochastic capacitated lot-sizing problem with stationary demand, where the objective is to minimise set-up, holding, and backorder costs. This problem is common in industry, concerning both inventory management and production planning. We study the applicability of the Proximal Policy Optimisation (PPO) algorithm, a type of Deep Reinforcement Learning (DRL), to this problem. The problem is modelled as a Markov Decision Pr...