DSpace at EWHA: Predictive Analysis with Recurrent Neural Network using Synthetic Time Series Data generated from Autoencoder

Title: Predictive Analysis with Recurrent Neural Network using Synthetic Time Series Data generated from Autoencoder

Authors: 이채영

Issue Date: 2022

Department/Major: Graduate School, Department of Statistics

Publisher: The Graduate School, Ewha Womans University

Degree: Master

Advisors: 안재윤

Abstract: Synthetic data generation is essential in many fields, including insurance and finance. In insurance in particular, using policyholder information can be problematic because of restrictions under the Personal Information Protection Act. In such cases, synthetic data that retain the characteristics of the original data but contain no sensitive personal information offer an alternative. The autoencoder (AE) is one of the most popular methods for synthesizing data. In this thesis, we use an AE to synthesize time series data from historical claim records, predict the number of accidents from the synthetic data, and compare the predictive performance obtained with the original and the synthetic data. Because policyholders' accident counts form a time series, and because predicting the number of claims is crucial in insurance, a recurrent neural network (RNN) is used to forecast the number of accidents in the following year. In a simulation study, we vary the activation function and the number of nodes of the AE to find the best structure for data synthesis, and conclude that these changes to the model do not greatly affect overall performance.
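
The synthesis step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the simulated gamma-Poisson claim counts, the one-hidden-layer architecture, the training settings, and the names `simulate_claims` and `synthesize` are all assumptions made for illustration. Treating the rounded reconstruction as the synthetic record is likewise an assumption. The RNN forecasting step is not shown.

```python
import numpy as np

# Each entry: (activation, its derivative written in terms of the activation's output).
ACTIVATIONS = {
    "tanh": (np.tanh, lambda a: 1.0 - a ** 2),
    "sigmoid": (lambda z: 1.0 / (1.0 + np.exp(-z)), lambda a: a * (1.0 - a)),
}

def simulate_claims(n_policyholders, n_years, rng):
    """Hypothetical toy data: yearly claim counts with a policyholder-specific rate."""
    rates = rng.gamma(shape=2.0, scale=0.5, size=(n_policyholders, 1))
    return rng.poisson(rates, size=(n_policyholders, n_years)).astype(float)

def synthesize(x, n_hidden=3, activation="tanh", lr=0.05, epochs=1000, seed=0):
    """Train a one-hidden-layer autoencoder; return (synthetic counts, loss history)."""
    rng = np.random.default_rng(seed)
    act, dact = ACTIVATIONS[activation]

    # Standardize each year so gradient descent behaves similarly across columns.
    mu, sd = x.mean(axis=0), x.std(axis=0) + 1e-8
    z = (x - mu) / sd
    d = z.shape[1]

    w1 = rng.normal(0.0, 0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, size=(n_hidden, d))
    b2 = np.zeros(d)

    losses = []
    for _ in range(epochs):
        h = act(z @ w1 + b1)              # encoder
        z_hat = h @ w2 + b2               # linear decoder
        err = z_hat - z
        losses.append(float(np.mean(err ** 2)))

        # Backpropagate the mean squared reconstruction error.
        g_out = 2.0 * err / err.size
        g_h = (g_out @ w2.T) * dact(h)
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (z.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)

    # Assumption: the reconstruction, mapped back to non-negative integer
    # counts, serves as the synthetic record.
    z_hat = act(z @ w1 + b1) @ w2 + b2
    synthetic = np.clip(np.rint(z_hat * sd + mu), 0, None)
    return synthetic, losses

# Vary the activation function and the number of hidden nodes, as in the abstract.
rng = np.random.default_rng(42)
claims = simulate_claims(200, 5, rng)
for name in ACTIVATIONS:
    for n_hidden in (2, 3, 4):
        synthetic, losses = synthesize(claims, n_hidden, name)
        print(name, n_hidden, round(losses[-1], 3))
```

Comparing the final reconstruction loss across settings gives a rough sense of how sensitive the synthesis is to the AE structure. A fuller comparison along the lines of the abstract would train the same forecasting model on `claims` and on `synthetic` and compare their prediction errors.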