DSpace at EWHA: A Study on Weighted Random Choice Techniques to Improve Diversity in Recommender Systems

Title: A Study on Weighted Random Choice Techniques to Improve Diversity in Recommender Systems

Other Titles: 추천 다양성 제고를 위한 가중 랜덤 선택 기법 연구

Authors: 신현섭

Issue Date: 2023

Department/Major: Graduate School, Interdisciplinary Program in Big Data Analytics

Publisher: The Graduate School of Ewha Womans University

Degree: Doctor

Advisors: 신경식

Abstract: Artificial intelligence is becoming more capable as it learns from large-scale data. ChatGPT, a chatbot developed by the AI research and development company OpenAI, performs well not only at answering simple questions but also in areas such as coding, translation, and fiction writing. However, because its output is produced by learning from existing data, it is highly likely to carry over the biases and errors contained in that data. The core task of a recommender system, one of the many fields of artificial intelligence, is to identify users' tastes accurately and recommend the items they want. Recommender systems use a variety of algorithms. Some identify the characteristics of items and recommend items that match a user's tastes; some group users with similar preferences and recommend accordingly; some group items that have received similar ratings from users; and some combine several recommendation algorithms. Like other AI systems, a recommender system recommends new items based on existing rating and feedback data. Although recommendation accuracy is improving as algorithms develop, existing accuracy-centered methods carry an inherent risk that recommendations become perfunctory. They are good at identifying patterns common to many users but limited in reflecting users' minor yet diverse tastes. In addition, repeatedly recommending similar items and previously preferred items makes recommendations monotonous, which lowers satisfaction and drives customers away.
Accordingly, evaluating a recommender system requires considering not only recommendation accuracy but also indicators such as coverage, diversity, and novelty, which measure how well the system recommends varied items that differ from existing preferences. This paper studies how to improve Inter-list Diversity, that is, how to make each recommendation consist of a different combination of items when a user requests recommendations several times. Users who are not satisfied with a single recommendation want to be recommended new items. In addition, the characteristics of association rule analysis (market basket analysis) suggest that the combination in which items are recommended also has an important effect on user choice and satisfaction. Users' tastes are diverse, and proposing varied recommendation combinations suited to each user's taste requires improving Inter-list Diversity. Most diversity studies on recommender systems have focused on Intra-user and Intra-list Diversity for single recommendations. Research on Inter-list Diversity, which has received comparatively little attention, is expected to prevent customer churn, improve long-term satisfaction, and create a basis for recommending a wider range of items, because varied combinations in response to repeated requests avoid monotonous results. In this study, I applied a Weighted Random Choice technique to enhance Inter-list Diversity using MovieLens data, the dataset most commonly used in recommender system research. The basic Weighted Random Choice technique selects items at random, with higher weights given to items in the user's preferred genres or items with high expected ratings, and it is used as a way of ensuring Inter-list Diversity. However, while it satisfies Inter-list Diversity, it has the limitation that it may fail to satisfy diversity within each list.
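The basic technique described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the toy ratings, and the use of one minus the Jaccard overlap as an Inter-list Diversity measure are all assumptions made for illustration, since the abstract does not state the exact weighting scheme or metric.

```python
import random


def weighted_random_choice(predicted, k=5, rng=None):
    """Draw k distinct items; each draw is weighted by expected rating.

    `predicted` maps item id -> expected rating (must be positive).
    Hypothetical helper, not taken from the thesis.
    """
    if rng is None:
        rng = random.Random()
    pool = dict(predicted)  # copy so the caller's data is untouched
    picks = []
    while pool and len(picks) < k:
        items = list(pool)
        weights = [pool[item] for item in items]
        choice = rng.choices(items, weights=weights, k=1)[0]
        picks.append(choice)
        del pool[choice]  # sample without replacement
    return picks


def inter_list_diversity(list_a, list_b):
    """Assumed measure: 1 - Jaccard overlap between two lists."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Toy expected ratings for 20 hypothetical movies (3.0 to 4.6).
predicted = {f"movie_{i}": 3.0 + (i % 5) * 0.4 for i in range(20)}

rng = random.Random(0)
first = weighted_random_choice(predicted, rng=rng)
second = weighted_random_choice(predicted, rng=rng)
print(first)
print(second)
print(inter_list_diversity(first, second))
```

Because every request is a fresh random draw, two requests by the same user usually return different top-five lists, which is the property Inter-list Diversity captures. Nothing in this basic draw prevents the five items within one list from sharing a genre, which is the limitation noted above.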
Therefore, this study proposes several Weighted Random Choice techniques that enhance Inter-list Diversity while also enhancing Intra-list Diversity within each list. The first is the Sequential Weighted Random Choice technique, which recommends the top five items one at a time while avoiding the genre of the previous item. Candidate items are weighted by expected rating, and the first recommended item is selected by Weighted Random Choice; the second item is then selected by applying Weighted Random Choice to the items that remain after excluding those of the same genre as the first item, and so on until five items are chosen. The Cluster-based Weighted Random Choice technique divides similar items into five groups and selects one item from each group by Weighted Random Choice. The Bundled Weighted Random Choice technique forms several bundles that each contain at least a specified number of genres and applies Weighted Random Choice with each bundle's average expected rating as its weight. The Rare Item Weighted Random Choice technique gives extra weight to movies that include the least frequently appearing genre when recommending the top five items. The Time Trend Weighted Random Choice technique takes time into account and gives weight to genres that have recently gained interest when recommending the top five items. The five proposed techniques were compared in terms of diversity improvement, recommendation accuracy, and satisfaction with the recommended items in order to identify the effective ones. With the advent of deep learning, predictive power has improved in many ways, but research on enhancing diversity has not yet been sufficiently conducted.
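The Sequential Weighted Random Choice technique described above might be sketched as follows. This is an illustrative reading of the abstract, not the thesis's code. Three choices are assumptions: "same genre" is taken to mean "shares at least one genre", only the immediately preceding item's genres are avoided, and the draw falls back to all remaining items when no genre-disjoint candidate is left. The function name and toy data are hypothetical.

```python
import random


def sequential_wrc(predicted, genres, k=5, rng=None):
    """Pick k items one at a time, weighted by expected rating,
    skipping items that share a genre with the previous pick.

    `predicted` maps item id -> expected rating (positive);
    `genres` maps item id -> set of genre labels.
    """
    if rng is None:
        rng = random.Random()
    remaining = set(predicted)
    picks = []
    prev_genres = set()  # empty before the first pick, so nothing is excluded
    while remaining and len(picks) < k:
        # Sort for a reproducible candidate order under a fixed seed.
        candidates = [i for i in sorted(remaining)
                      if not (genres[i] & prev_genres)]
        if not candidates:
            # Assumed fallback: relax the genre constraint.
            candidates = sorted(remaining)
        weights = [predicted[i] for i in candidates]
        choice = rng.choices(candidates, weights=weights, k=1)[0]
        picks.append(choice)
        remaining.remove(choice)
        prev_genres = genres[choice]
    return picks


# Toy data: eight hypothetical movies with expected ratings and genres.
predicted = {"A": 4.5, "B": 4.2, "C": 4.0, "D": 3.8,
             "E": 3.5, "F": 3.2, "G": 3.0, "H": 2.8}
genres = {"A": {"Action"}, "B": {"Action", "Comedy"}, "C": {"Comedy"},
          "D": {"Drama"}, "E": {"Drama", "Romance"}, "F": {"Romance"},
          "G": {"Horror"}, "H": {"Action"}}

top_five = sequential_wrc(predicted, genres, k=5, rng=random.Random(42))
print(top_five)
```

With this toy data the fallback never triggers for five picks, so consecutive items in the list never share a genre. The random draw still lets repeated requests return different lists.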
In this context, this study is meaningful in that it broadens the range of algorithms for enhancing diversity in recommender systems by proposing several ways to enhance Inter-list Diversity while also considering diversity within each list. To sum up, this study examined diversity problems in the field of recommender systems and proposed several Weighted Random Choice techniques that enhance Inter-list Diversity while taking diversity within each list into account. By increasing the diversity of the sets of items that recommender systems recommend, it can help improve consumers' long-term satisfaction and present them with a wider variety of items. The approach also has the advantage that it requires only movie genre, rating, and time information, without users' personal profiles. This study is expected to serve as a basis for more varied directions in future research on improving diversity in recommender systems. Finally, because it aims to improve diversity as well as the accuracy obtained by learning from existing data, this paper is timely: many AI scientists concerned about problems caused by distortion and bias in existing data, and about deepening confirmation bias among users, are now discussing diversity in the development of artificial intelligence.