DSpace at EWHA: A Study on the Effectiveness of a Confidence-based Selection Approach in Ensemble Modeling for Bankruptcy Prediction

Title: A Study on the Effectiveness of a Confidence-based Selection Approach in Ensemble Modeling for Bankruptcy Prediction

Other Titles: 부도 예측을 위한 앙상블 모델링에서 확신 기반 선택 기법의 효과에 관한 연구

Authors: 김나라

Issue Date: 2013

Department/Major: Graduate School, Department of Business Administration

Publisher: The Graduate School, Ewha Womans University

Degree: Doctor

Advisors: 신경식

Abstract: The prediction model is the main factor affecting the performance of a knowledge-based system for bankruptcy prediction. Earlier studies on prediction modeling focused on building a single best model using statistical and artificial intelligence techniques. Since the mid-1980s, however, the integration of multiple techniques (hybrid techniques) and, by extension, the combination of the outputs of several models (ensemble techniques) have generally outperformed individual models in experiments. An ensemble constructs a set of multiple models, combines their outputs, and produces one final prediction. The way in which the outputs of ensemble members are combined is one of the important issues affecting prediction accuracy, and a variety of combination schemes have been proposed to improve ensemble performance. Each combination scheme has advantages and limitations, and its performance can depend on the domain and the circumstances. Choosing the most appropriate combination scheme for a given domain and situation is therefore very difficult. The ensemble employing confidence-based integration (Shin and Han, 1998), which measures confidence (the degree of trust in the class-label predictions of classifiers) using the continuous-valued outputs of artificial neural networks, achieved better performance than the individual neural networks in bankruptcy prediction problems. However, that approach was evaluated only in a pilot experiment, and it cannot be applied to heterogeneous ensembles consisting of different kinds of models, because it was developed to combine models of the same kind (three artificial neural networks).
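Accordingly, this dissertation evaluated the effectiveness of confidence-based integration for bankruptcy prediction through a number of experiments under various circumstances, and proposed a confidence-based selection approach that measures confidence on a common scale even when ensemble members produce different types of continuous-valued outputs. The proposed approach can therefore be applied to heterogeneous ensembles as well as homogeneous ones.

The model, a component of a knowledge-based system, is a major factor affecting the system's performance. Early studies on developing prediction models concentrated on building a single best-performing model using statistical and artificial intelligence techniques. Since the mid-1980s, the integration of multiple techniques (hybrids) and, further, the combination of the outputs of multiple models (ensembles) have shown better results than individual models in numerous experiments. These studies are part of an effort to strengthen the ultimate performance of knowledge-based systems by improving model performance. Ensemble techniques construct a set of multiple models and combine their outputs to produce a single final prediction. By combining the outputs of multiple models, an ensemble can overcome the limitations of individual models and outperform the best individual model.

The combination scheme, which combines the outputs of the ensemble members (the base models that make up the ensemble), is a major factor affecting the ensemble's prediction accuracy. Various combination schemes have been proposed to improve the prediction accuracy of ensembles. Among them, the most widely used in ensembles for bankruptcy prediction are majority voting, averaging, and weighted averaging. However, in ensembles that adopt these methods, when a majority of the members predict incorrectly, the final prediction is either wrong or unlikely to be correct. That is, the opinions of the minority of models that predicted correctly may be ignored or only weakly reflected in the final decision. In contrast, the confidence-based integration approach for bankruptcy prediction uses the continuous-valued outputs of artificial neural network models to measure confidence, meaning the degree of belief in the class predicted by each model, and selects the decision of the model with the highest confidence as the final prediction. It can produce a correct prediction even when a majority of the models predict incorrectly, provided that the single model with the highest confidence predicts correctly. However, because this method was developed to combine homogeneous models consisting of artificial neural networks, it cannot combine continuous-valued outputs on different scales produced by different kinds of models.

This study proposed a Confidence-based Selection Approach that overcomes the limitations of existing combination schemes in ensembles for bankruptcy prediction by measuring a unified confidence even when the models in the ensemble produce different types of continuous-valued outputs. The proposed scheme uses a logistic link function to unify different types of numeric outputs into probabilistic outputs, measures each ensemble member's confidence in its class prediction for a given case, and selects the class predicted by the most confident model as the ensemble's final classification decision. To examine the effectiveness of the proposed approach, its performance was compared with four benchmark combination schemes under three types of circumstances comprising eight sub-types in total, and the proposed approach showed a higher hit ratio than the other schemes.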
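Through the proposed approach, this study improved classification performance in bankruptcy prediction, an important classification problem in business. It thereby contributes to the advancement of prediction models, a key element of knowledge-based systems that support managerial decision-making, and can ultimately improve firm performance.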
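
The selection rule outlined in the abstract (map each member's numeric output to a probability with a logistic function, measure each member's confidence, and take the class predicted by the most confident member) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. The abstract does not give the exact confidence formula or how the logistic link is fitted, so the unscaled `logistic` mapping, the `confidence` measure (distance of the probability from 0.5), and all numeric values below are assumptions chosen for illustration only.

```python
import math
from typing import Sequence


def logistic(score: float) -> float:
    """Map a real-valued score to (0, 1); written to avoid overflow for large |score|."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def confidence(prob_bankrupt: float) -> float:
    """Assumed confidence measure: distance of the probability from 0.5, scaled to [0, 1]."""
    return abs(prob_bankrupt - 0.5) * 2.0


def select_by_confidence(probs: Sequence[float]) -> int:
    """Return the class (1 = bankrupt, 0 = non-bankrupt) of the most confident member."""
    most_confident = max(probs, key=confidence)
    return 1 if most_confident >= 0.5 else 0


def majority_vote(probs: Sequence[float]) -> int:
    """Baseline for comparison: each member votes with its own class label."""
    votes_bankrupt = sum(1 for p in probs if p >= 0.5)
    return 1 if votes_bankrupt * 2 > len(probs) else 0


# Hypothetical heterogeneous ensemble for one firm:
# two members already output probabilities, the third outputs a raw
# real-valued score (for example a margin), mapped through the logistic function.
member_probs = [0.45, 0.40, logistic(2.2)]

print(select_by_confidence(member_probs))  # 1: the confident third member decides
print(majority_vote(member_probs))         # 0: two weakly confident members outvote it
```

In this made-up case two members lean weakly toward "non-bankrupt" while one is strongly confident in "bankrupt", so majority voting and confidence-based selection disagree. This is the situation the abstract describes, in which a correct minority can be ignored by voting-style schemes.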