DSpace at EWHA: Local Explanation Generation Methods for Machine Learning-Based Bankruptcy Prediction Model

Title: Local Explanation Generation Methods for Machine Learning-Based Bankruptcy Prediction Model

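Other Titles: 머신러닝 기반 부도예측 모형의 설명력 증진을 위한 국소적 방법 (Local Methods for Improving the Explainability of Machine Learning-Based Bankruptcy Prediction Models)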

Authors: 조수현

Issue Date: 2022

Department/Major: Graduate School, Interdisciplinary Program in Big Data Analytics

Publisher: The Graduate School, Ewha Womans University

Degree: Doctor

Advisors: 신경식

Abstract: In recent years, there has been a great deal of research on machine learning techniques and their application to real-world problems. Artificial intelligence techniques, including machine learning, have demonstrated outstanding performance, raising expectations for their applicability in domains and industries such as finance, medicine, and computer vision. However, an intrinsic trait of high-performing machine learning techniques is the obscurity of the learning algorithm. These models are more accurate than conventional statistical models, yet why and how a model produced a given output is difficult to understand. The accuracy-interpretability tradeoff refers to this phenomenon: high-performing models (e.g., artificial neural networks, support vector machines) lack explanations for their predictions, while interpretable models (e.g., linear regression) perform worse. There have been attempts to implement machine learning techniques in the financial domain in particular. However, as most machine learning techniques are ‘black boxes,’ it is impossible to provide explanations of the model and its predictions. For this reason, explainable artificial intelligence (XAI) has become an important research topic. In this dissertation, we focus on explanation at the local level, via counterfactual examples and rules, for the bankruptcy prediction model. A bankruptcy prediction model is essentially credit scoring for companies and is often used as a knowledge base for the credit scoring models of financial institutions. Prediction models in finance, in particular, must offer justifiable and acceptable explanations for their predictions to gain users’ acceptance of and trust in the model.
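We believe that users’ acceptance of and trust in the model are integral to accelerating the adoption of machine learning-based prediction models in practice. In this dissertation, we propose three methods that generate explanations for the bankruptcy prediction model at the local level, with the aim of offering explanations that are both relevant and feasible. First, we introduce relevance-enhanced counterfactual example-based explanations that incorporate feature importance. Second, as an extension of the first method, we add a feasibility factor to the explanation by incorporating domain knowledge. Third, we propose a feasible rule-based explanation for the bankruptcy prediction model that complies with domain knowledge.

Bankruptcy prediction models, which belong to credit risk management, can be regarded as credit scoring for companies and have long been studied, serving as the basic knowledge base for the credit scoring models of banks and other financial institutions. Recently, bankruptcy prediction has been regarded as a promising field for applying artificial intelligence technologies, including new machine learning techniques. However, models based on machine learning techniques are characteristically difficult to explain in terms of both their learning and their results. Bankruptcy prediction is a field in which the model's results must be explained to its users, namely financial experts or customers, so a model without explanatory power is difficult to deploy in practice and to have accepted by users. This study therefore aims to improve the explainability of machine learning-based bankruptcy prediction models, which are limited by their obscurity, by providing human-friendly explanations. Because explanations of a model's results are provided to its users, offering explanations that users find convincing can increase trust in and acceptance of the model. In addition, to generate explanations that are interpretable from the user's perspective and applicable in practice, this study proposes methods that generate explanations for individual instances at the local level while taking relevance and feasibility into account. The first proposed method is a counterfactual example-based explanation that improves the relevance of the explanations by reflecting the feature importance of the bankruptcy prediction model in the explanation generation algorithm. The counterfactual explanations are generated with a genetic algorithm, which uses a multi-objective function to secure the quality of the explanations while improving their relevance. The second proposed method extends the first by considering the feasibility of the explanations: it integrates the characteristics of the financial variables used in the bankruptcy prediction model into the explanation generation algorithm. As a result, the direction of change of the financial variables in the counterfactual explanations produced by the genetic algorithm agrees with existing domain knowledge about financial soundness, which secures feasibility. The third proposed method uses an association rule algorithm as a local surrogate model to generate rule-based explanations for the instance being explained.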
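Domain knowledge based on the characteristics of the financial variables is integrated into the explanations derived from the association rules, and the rules are presented in four categories according to whether they carry factual or counterfactual information. The proposed methods are validated on real bankruptcy data, and the validity of the explanations is additionally assessed qualitatively through a user survey.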
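
The abstract describes a counterfactual search driven by a genetic algorithm that weighs feature importance and restricts financial variables to directions consistent with domain knowledge. The following is a minimal illustrative sketch of that general idea, not the dissertation's implementation. Everything in it is an assumption made for illustration: the three feature names, the toy logistic model standing in for the black-box classifier, the importance weights, the direction constraints, and the way the two objectives (validity and importance-weighted change) are combined into a single score.

```python
import math
import random

# All names and numbers below are hypothetical; none come from the thesis.
FEATURES = ["debt_ratio", "current_ratio", "roa"]
COEF = [3.0, -1.5, -4.0]       # toy stand-in for a black-box model
BIAS = -0.5
IMPORTANCE = [0.5, 0.2, 0.3]   # assumed feature importances
DIRECTION = [-1, +1, +1]       # assumed allowed sign of change per feature


def predict_bankrupt_proba(x):
    """Toy classifier: probability that the firm goes bankrupt."""
    z = BIAS + sum(c * v for c, v in zip(COEF, x))
    return 1.0 / (1.0 + math.exp(-z))


def make_feasible(delta):
    """Zero out any change that moves a feature in a disallowed direction."""
    return [d if d * s > 0 else 0.0 for d, s in zip(delta, DIRECTION)]


def fitness(x, delta):
    """Lower is better: validity first, then importance-weighted change."""
    candidate = [v + d for v, d in zip(x, delta)]
    p = predict_bankrupt_proba(candidate)
    if p >= 0.5:
        return 100.0 + p  # still predicted bankrupt: heavily penalised
    # Changes to important features are cheaper, steering the search
    # towards explanations that use the features the model relies on.
    return sum(abs(d) / w for d, w in zip(delta, IMPORTANCE))


def find_counterfactual(x, pop_size=60, generations=80, seed=0):
    """Evolve a change vector that flips the prediction to non-bankrupt."""
    rng = random.Random(seed)
    pop = [make_feasible([rng.uniform(-1, 1) for _ in x])
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda d: fitness(x, d))
        parents = pop[: pop_size // 2]            # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [d + rng.gauss(0, 0.05) for d in child]   # Gaussian mutation
            children.append(make_feasible(child))
        pop = parents + children
    return min(pop, key=lambda d: fitness(x, d))


if __name__ == "__main__":
    firm = [0.9, 0.8, -0.05]  # hypothetical firm predicted as bankrupt
    delta = find_counterfactual(firm)
    for name, d in zip(FEATURES, delta):
        print(f"{name}: change by {d:+.3f}")
```

The sketch collapses the two objectives into one score by ranking validity ahead of cost, which is only one of several possible designs; a true multi-objective genetic algorithm would instead keep a set of non-dominated candidates.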