DSpace at EWHA: Multi-Item Inventory Control Model with a Budget Constraint using Reinforcement Learning

Title: Multi-Item Inventory Control Model with a Budget Constraint using Reinforcement Learning

Other Titles: 강화학습을 이용한 예산 제약조건의 다중항목 재고관리 모델

Authors: LAU, XIA JIUN

Issue Date: 2021

Department/Major: Graduate School, Interdisciplinary Program in Big Data Analytics

Publisher: Ewha Womans University Graduate School

Degree: Master

Advisors: 민대기

Abstract: Finding the optimal order quantity is a central task of inventory control in supply chain management, because the order quantity strongly affects a company's profitability. The budget-constrained inventory control model is one such scenario: the goal is to minimize the total inventory cost under a preset budget constraint. Supply chain problems such as the bullwhip effect can also severely reduce company profit. Heuristic approaches have traditionally been used to solve this problem, but their results are not optimal. To overcome these challenges, we use Q-Learning, a variant of reinforcement learning, to find the order quantity that minimizes the total inventory cost. Because the problem also involves a budget constraint, OptLayer, a quadratic program, is used to find optimal solutions that satisfy the constraint. For comparison, we also evaluate three other approaches: a Q-Learning model without OptLayer, a heuristic approach, and the Economic Order Quantity (EOQ) approach. In the experimental evaluations, the proposed framework shows promising results and performs best among the compared models.
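The abstract names two building blocks that are easy to illustrate: the EOQ baseline and a quadratic program that keeps proposed orders within a budget. The sketch below is only an illustration under stated assumptions, not the thesis's implementation. It assumes the classical EOQ formula sqrt(2DK/h), and it assumes the quadratic program is a Euclidean projection of the proposed order vector onto the set of non-negative orders whose total purchase cost stays within the budget. The function names `eoq` and `project_to_budget` and all parameter names are hypothetical.

```python
import math


def eoq(demand_rate, order_cost, holding_cost):
    """Classical Economic Order Quantity: sqrt(2 * D * K / h)."""
    return math.sqrt(2.0 * demand_rate * order_cost / holding_cost)


def project_to_budget(orders, unit_costs, budget, iters=100):
    """Closest non-negative order vector with total cost <= budget.

    Assumed formulation (not taken from the thesis):
        minimize ||q - orders||^2  subject to  unit_costs . q <= budget, q >= 0
    with unit_costs > 0 and budget >= 0. The KKT conditions give
    q_i = max(orders_i - lam * unit_costs_i, 0) for some lam >= 0,
    and lam is found here by bisection.
    """
    def spend(qs):
        return sum(c * q for c, q in zip(unit_costs, qs))

    def shrink(lam):
        return [max(q - lam * c, 0.0) for q, c in zip(orders, unit_costs)]

    clipped = shrink(0.0)  # only enforces q >= 0
    if spend(clipped) <= budget:
        return clipped  # already affordable, nothing to correct

    # At lam = max(orders_i / unit_costs_i) every order is zero, so hi is feasible.
    lo, hi = 0.0, max(q / c for q, c in zip(orders, unit_costs))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if spend(shrink(mid)) > budget:
            lo = mid  # still over budget, shrink more
        else:
            hi = mid  # within budget, try shrinking less
    return shrink(hi)  # hi always satisfies the budget


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(eoq(demand_rate=1000, order_cost=50, holding_cost=2.5))
    print(project_to_budget([10.0, 10.0], [1.0, 1.0], budget=10.0))
```

In a Q-Learning setting, a step like `project_to_budget` would be applied to the agent's proposed order quantities before they are executed, so that every action taken respects the budget; how the thesis integrates OptLayer with the learner is not stated in the abstract.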