DSpace at EWHA: Sequential Cross Attention Based Multi-Task Learning

Title: Sequential Cross Attention Based Multi-Task Learning

Authors: 김선경

Issue Date: 2022

Department/Major: Graduate School, Division of Artificial Intelligence and Software

Publisher: The Graduate School of Ewha Womans University

Degree: Master

Advisors: 민동보

Abstract: In multi-task learning (MTL) for visual scene understanding, it is crucial to transfer useful information between tasks with minimal interference. In this thesis, we propose a novel architecture that transfers informative features by applying the attention mechanism to the multi-scale features of the tasks. Since applying an attention module directly to all possible features across scales and tasks is computationally expensive, we propose to apply the cross-attention module (CAM) sequentially, first over tasks and then over scales. The cross-task attention module (CTAM) is applied first to facilitate the exchange of relevant information between the features of multiple tasks at the same scale. The cross-scale attention module (CSAM) then aggregates useful information from feature maps at different resolutions within the same task. We also attempt to capture long-range dependencies through a self-attention module in the feature extraction network. In addition, to reduce the computational resources required by the proposed model, we build a student model and use stage-wise knowledge distillation. We divide the features of the teacher into four stages so that the knowledge of the teacher model can be transferred to the lightweight student model progressively through the four stages. Extensive experiments demonstrate that our method achieves high performance on the NYUD-v2 and PASCAL-Context datasets.

Abstract (translated from Korean): In multi-task learning for visual scene understanding, it is important to transfer useful information between multiple tasks with minimal interference. In this thesis, we propose a new architecture that applies the attention mechanism to effectively transfer information across tasks and scales. Because applying an attention module directly to all possible features across scales and tasks requires high complexity, we propose to apply the cross-attention module (CAM) sequentially over tasks and scales. The cross-task attention module (CTAM) is applied first to facilitate the exchange of relevant information between multiple tasks at the same scale. The cross-scale attention module (CSAM) then aggregates useful information from different resolutions within the same task. In addition, we design a lightweight model using knowledge distillation. The lightweight model receives features from the originally proposed model and achieves performance similar to it. Finally, we confirm that the proposed method achieves high performance on the NYUD-v2 and PASCAL-Context datasets, which are widely used in multi-task learning.
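The sequential scheme described in the abstract (cross-task attention within each scale, followed by cross-scale attention within each task) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `cross_attention`, `ctam`, `csam`, and `sequential_cross_attention` are hypothetical, the attention uses plain scaled dot products with a residual connection and no learned projections, and features are represented as small lists of token vectors rather than convolutional feature maps.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product attention without learned projections (an assumption).

    Each query token is updated with a residual, attention-weighted sum of the
    context tokens. `context` must be non-empty.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        attended = [sum(w * v[i] for w, v in zip(weights, context)) for i in range(d)]
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out


def ctam(features):
    """Cross-task step: at each scale, a task attends to the other tasks' tokens.

    `features[task][scale]` is a list of token vectors. Needs at least two tasks.
    """
    tasks = list(features)
    out = {}
    for t in tasks:
        out[t] = []
        for s in range(len(features[t])):
            context = [tok for o in tasks if o != t for tok in features[o][s]]
            out[t].append(cross_attention(features[t][s], context))
    return out


def csam(features):
    """Cross-scale step: within a task, each scale attends to the other scales.

    Needs at least two scales per task.
    """
    out = {}
    for t, scales in features.items():
        out[t] = []
        for s, tokens in enumerate(scales):
            context = [tok for j, other in enumerate(scales) if j != s for tok in other]
            out[t].append(cross_attention(tokens, context))
    return out


def sequential_cross_attention(features):
    """Apply the cross-task step first, then the cross-scale step."""
    return csam(ctam(features))


if __name__ == "__main__":
    # Two hypothetical tasks, two scales (4 tokens and 2 tokens), 3 channels.
    feats = {
        "seg": [[[0.1 * i, 0.2, -0.1] for i in range(4)],
                [[0.3, 0.1 * i, 0.0] for i in range(2)]],
        "depth": [[[0.0, 0.1 * i, 0.5] for i in range(4)],
                  [[0.2, 0.2, 0.1 * i] for i in range(2)]],
    }
    refined = sequential_cross_attention(feats)
    print({t: [len(s) for s in scales] for t, scales in refined.items()})
```

In this sketch each step attends over a smaller context (the other tasks at one scale, or the other scales of one task) than a single joint attention over every task and scale would, which is the motivation the abstract gives for the sequential design.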