DSpace at EWHA: Emotion Analysis Using a Bidirectional LSTM for Word Sense Disambiguation

Title: 양방향 LSTM을 적용한 단어 의미 중의성 해소 감정 분석

Other Titles: Emotion Analysis Using a Bidirectional LSTM for Word Sense Disambiguation

Authors: 기호연

Issue Date: 2020

Department/Major: Graduate School, Interdisciplinary Program in Big Data Analytics

Publisher: The Graduate School, Ewha Womans University

Degree: Master

Advisors: 신경식

Abstract: As the use of social network services (SNS) grows, many people share their daily lives and emotions and express their state of mind on SNS. Understanding the emotions of SNS users opens up many opportunities, and companies are already running active marketing campaigns that try to connect emotions to the desire to consume. Understanding emotions is therefore important, and related research continues to appear. Emotions are psychological states that humans feel, such as joy and sadness, and they exert an enormous influence on human behavior.
In particular, words that express emotion project human psychology and convey specific, rich context, so they can exhibit lexical ambiguity. Lexical ambiguity arises from homonymy and polysemy, and approaches to resolving it include knowledge-based learning, supervised learning, and unsupervised learning. This study proposes an emotion classification model that resolves this ambiguity by applying supervised learning. It rests on the assumption that if the surrounding context is sufficiently reflected, lexical ambiguity can be resolved and the emotion a sentence is meant to express can be reduced to a single emotion. The model uses a bidirectional LSTM, a deep learning method that adds to the LSTM (Long Short-Term Memory) a hidden layer trained in the backward direction; by processing a sequence in both the forward and the backward direction, it captures the context of the sequence and reflects its semantic information. Bidirectional LSTMs are used in natural language processing tasks that require contextual information, such as text-to-speech (TTS) and text classification. Recently published contextual embedding methods also rely on bidirectional context: ELMo (Embeddings from Language Models) is built on bidirectional LSTMs, and BERT (Bidirectional Encoder Representations from Transformers) applies the same idea with a Transformer encoder. To test whether a bidirectional LSTM improves emotion classification performance by disambiguating emotion words, the proposed model was compared against models based on LSTM and RNN. In addition, the study sought to improve performance through hyperparameter choices, using GloVe embeddings for the embedding layer and ReLU as the activation function.
The results confirm that the proposed framework contributes to resolving lexical ambiguity and improving emotion classification performance.
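
The bidirectional reading of a sequence described in the abstract can be illustrated with a small sketch. The Python code below is not the thesis implementation: it is a minimal forward pass written with the standard library only, and all function names, the random weights, and the toy 3-dimensional vectors (stand-ins for GloVe embeddings) are assumptions made for illustration. It omits training as well as the ReLU and classification layers, and shows only how one LSTM reads the tokens left to right, a second reads them right to left, and the two hidden states are concatenated per token.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_size, hidden_size, rng):
    """Random (weights, bias) pairs for the input, forget, candidate and output gates."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {gate: (matrix(hidden_size, input_size + hidden_size), [0.0] * hidden_size)
            for gate in ("i", "f", "g", "o")}


def affine(weights, bias, vec):
    """Compute weights @ vec + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(params, x, h, c):
    """One LSTM time step: returns the new hidden state and cell state."""
    z = x + h  # concatenate the input vector and the previous hidden state
    i = [sigmoid(a) for a in affine(*params["i"], z)]
    f = [sigmoid(a) for a in affine(*params["f"], z)]
    g = [math.tanh(a) for a in affine(*params["g"], z)]
    o = [sigmoid(a) for a in affine(*params["o"], z)]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, sequence, hidden_size):
    """Run an LSTM over the sequence and collect the hidden state at every step."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    states = []
    for x in sequence:
        h, c = lstm_step(params, x, h, c)
        states.append(h)
    return states


def bidirectional_lstm(fwd_params, bwd_params, sequence, hidden_size):
    """Concatenate left-to-right and right-to-left hidden states for each token."""
    forward = run_lstm(fwd_params, sequence, hidden_size)
    # Read the reversed sequence, then flip the result back into token order.
    backward = run_lstm(bwd_params, sequence[::-1], hidden_size)[::-1]
    return [f + b for f, b in zip(forward, backward)]


rng = random.Random(0)
embed_dim, hidden = 3, 2
fwd, bwd = init_lstm(embed_dim, hidden, rng), init_lstm(embed_dim, hidden, rng)
tokens = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5], [0.3, 0.3, 0.3]]  # hypothetical embeddings
states = bidirectional_lstm(fwd, bwd, tokens, hidden)
print(len(states), len(states[0]))  # one vector per token, 2 * hidden features each
```

In this sketch the first half of each output vector depends only on the tokens up to that position, while the second half depends on the tokens from that position onward. This is the property the abstract relies on when it argues that context on both sides of an ambiguous emotion word helps to disambiguate it.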