DSpace at EWHA: 360-degree Reference-based Super-Resolution using Latitude-aware Convolution

Title: Latitude-aware Convolution을 이용한 참조 영상 기반 360도 영상 초해상화 (Reference-based 360-degree image super-resolution using latitude-aware convolution)

Other Titles: 360-degree Reference-based Super-Resolution using Latitude-aware Convolution

Authors: 김희재

Issue Date: 2021

Department/Major: Graduate School, Department of Electronic and Electrical Engineering

Publisher: The Graduate School, Ewha Womans University

Degree: Master

Advisors: 이병욱

Abstract: 360-degree video is omnidirectional video that captures all information in the acquired space without any field-of-view (FOV) limitation, offering a high degree of viewing freedom. High-resolution (4K or higher) 360-degree sequences captured from multiple viewpoints help reconstruct a highly immersive viewing environment, but their very large data volume makes efficient transmission difficult. For example, in systems such as autonomous mobile robots, independently transmitting the 360-degree image for each position of the observer (agent) makes the data volume grow linearly with the number of positions. To address this, this thesis studies a deep-learning, reference-based super-resolution method for 360-degree images. The proposed method super-resolves a low-resolution image at the target viewpoint by exploiting useful texture information from a reference image captured at an arbitrary, relatively distant viewpoint. To do so, it analyzes the correlation between features of the input images extracted by convolution layers, outputs pixel-wise correspondences as vectors, and aligns the reference image to the target viewpoint. In doing so, it focuses on three problems in reference-based 360-degree super-resolution. First, correspondences between 360-degree images captured from different viewpoints are distorted by disparity into markedly different shapes and sizes, which makes them hard to match. To handle this distortion efficiently, the thesis proposes adapting the dilation rate of the convolution to the latitude, thereby adjusting the receptive field. Second, misaligned pixels that remain in the 360-degree image aligned to the target viewpoint using the estimated disparity cause severe degradation in the output. Such pixels arise when disparity is estimated incorrectly in heavily distorted regions, or when information hidden by occlusion is filled with unwanted pixel values. To mitigate this, the network learns a weight for each pixel of the aligned reference image and filters out erroneous pixels. Finally, the thesis proposes a transfer learning scheme that can train a well-performing model from a relatively small number of real images. Because there is not enough real image data to fully train a reference-based super-resolution network, transfer learning is used to carry parameters learned in the synthetic domain over to the real domain. The proposed network is compared with state-of-the-art single-image and reference-based super-resolution methods, and is shown to be superior both quantitatively and qualitatively.

360° imagery has gained substantial attention in recent years. It is usually recorded from multiple viewpoints to give users a more immersive experience. To provide 360° data in high quality, reference-based super-resolution (RefSR) can be used to process such large data effectively. However, RefSR remains challenging for 360° imagery because of severe latitude-dependent geometric distortion and the scarcity of real datasets. To address these shortcomings, we propose a long-range 360° disparity estimator that extracts reliable textures from the reference image with the help of latitude-aware convolution. In contrast to previous disparity estimators, the proposed scheme generates robust features for similarity search by considering the amount of distortion. Moreover, we present a transfer learning scheme in which the knowledge learned from synthetic images is transferred to the real image domain to deal with the dataset deficiency. We first train the model on a synthetic dataset to learn the general characteristics of equirectangular projection (ERP) images, and then use that model for training on real images. With the proposed training scheme, we successfully adapt the model to real images by adding only a few parameters to be updated. This is the first end-to-end trainable reference-based super-resolution method for ERP, and the experimental results show that the proposed model outperforms previous work on both synthetic and real datasets. Future research includes a more accurate flow estimator for ERP images, which is essential for RefSR and view synthesis.
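
The idea of adapting the convolution's dilation rate to latitude can be illustrated with a small NumPy sketch. This is not the thesis's network: the abstract does not state the actual rule, so the mapping used here (dilation proportional to the ERP horizontal stretch factor 1/cos(latitude), rounded and capped at a hypothetical `max_dilation`) and the function names `row_latitudes`, `latitude_dilation`, and `latitude_aware_conv_rows` are all assumptions made for illustration.

```python
import numpy as np


def row_latitudes(height: int) -> np.ndarray:
    """Latitude in radians at the center of each ERP row, top (+pi/2) to bottom (-pi/2)."""
    rows = np.arange(height) + 0.5
    return (0.5 - rows / height) * np.pi


def latitude_dilation(height: int, max_dilation: int = 4) -> np.ndarray:
    """Assumed rule: per-row dilation follows the ERP horizontal stretch 1/cos(latitude).

    The rounding and the cap `max_dilation` are illustrative choices, not from the thesis.
    """
    stretch = 1.0 / np.cos(row_latitudes(height))
    return np.clip(np.rint(stretch), 1, max_dilation).astype(int)


def latitude_aware_conv_rows(image, kernel, max_dilation: int = 4) -> np.ndarray:
    """Apply an odd-length 1-D horizontal kernel to each row with a row-specific dilation.

    Columns wrap around because an ERP image is periodic in longitude.
    """
    image = np.asarray(image, dtype=float)
    height, _ = image.shape
    dilations = latitude_dilation(height, max_dilation)
    half = len(kernel) // 2
    out = np.zeros_like(image)
    for r in range(height):
        for k, weight in enumerate(kernel):
            shift = (k - half) * dilations[r]
            # np.roll by -shift places image[r, (i + shift) % width] at column i.
            out[r] += weight * np.roll(image[r], -shift)
    return out


if __name__ == "__main__":
    print(latitude_dilation(8))  # rows near the poles get larger dilation
```

Under these assumptions, rows near the equator keep a dilation of 1 while rows near the poles reach the cap, so the kernel covers a wider horizontal span where ERP stretches content the most. In a trained network the same per-row rate would typically be applied to learned 2-D kernels rather than to a fixed 1-D filter.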