Reinforcement Learning for Rate-Distortion Optimized Hierarchical Prediction Structure

Title
Reinforcement Learning for Rate-Distortion Optimized Hierarchical Prediction Structure
Authors
Lee J.-K.; Kim N.; Kang J.-W.
Ewha Authors
강제원
SCOPUS Author ID
강제원
Issue Date
2023
Journal Title
IEEE Access
ISSN
2169-3536
Citation
IEEE Access, vol. 11, pp. 20240-20253
Keywords
adaptive GOP; deep Q-network; hierarchical B prediction; rate-distortion optimization; reinforcement learning; VVC
Publisher
Institute of Electrical and Electronics Engineers Inc.
Indexed
SCIE; SCOPUS
Document Type
Article
Abstract
Video coding standards use a prediction structure to arrange video frames and exploit temporal correlations. In this aspect, it is crucial to resolve complicated temporal dependencies among frames to improve coding efficiency because the coding of a preceding frame affects the rate-distortion (R-D) performance of the subsequent frames. Previous algorithms have attempted to address the problem using handcrafted features or analytical models even though natural videos display various temporal characteristics. In this paper, we propose a reinforcement learning (RL)-based decision algorithm to build the optimal hierarchical prediction structure under a random-access configuration (RA-HPS) in Versatile Video Coding (VVC). Our goal is to maximize coding efficiency by selecting a series of optimal group of pictures (GOP) structures for coding. Accordingly, we formulate an adaptive GOP selection algorithm with a binary tree to represent a policy. We generate an optimal binary tree to minimize the sum of the R-D costs among all plausible binary trees. A new RL policy representation is defined, and the optimal policy is obtained by a sequential update. The tree grows with a hierarchical state-action and a reward sequence in each node. For efficient learning, the proposed technique uses a deep Q-network architecture to capture the temporal correlation between frames, which helps learn the policy of the tree-based RL framework effectively. Experimental results demonstrate that the proposed technique achieves a significant Bjontegaard-Delta (BD)-rate reduction compared with state-of-the-art GOP size-selection algorithms. © 2013 IEEE.
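As a rough illustration of the tree-selection idea in the abstract, the sketch below picks, by exhaustive recursion, the binary tree of GOP splits whose leaves have the smallest total R-D cost. It is not the paper's method: the paper learns the decision with a deep Q-network, whereas this sketch assumes every candidate GOP already has a known, independent Lagrangian cost J = D + lambda * R. That assumption ignores the inter-frame dependencies the abstract highlights. The names rd_cost, best_tree, toy_costs and the cost values are hypothetical and chosen only for the example.

```python
from typing import Callable, List, Tuple


def rd_cost(distortion: float, rate: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate


def best_tree(
    start: int,
    size: int,
    min_size: int,
    leaf_cost: Callable[[int, int], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total cost, leaves) of the cheapest binary split tree.

    Each leaf is a (start, size) pair describing one GOP. A node is either
    kept as a single GOP or split into two halves, whichever costs less.
    Assumes leaf costs are independent of each other (a simplification).
    """
    keep_cost = leaf_cost(start, size)
    if size <= min_size:
        return keep_cost, [(start, size)]

    half = size // 2
    left_cost, left_leaves = best_tree(start, half, min_size, leaf_cost)
    right_cost, right_leaves = best_tree(start + half, half, min_size, leaf_cost)

    split_cost = left_cost + right_cost
    if split_cost < keep_cost:
        return split_cost, left_leaves + right_leaves
    return keep_cost, [(start, size)]


# Hypothetical R-D costs for candidate GOPs, keyed by (start frame, GOP size).
toy_costs = {
    (0, 16): 11.0,
    (0, 8): 4.0, (8, 8): 7.0,
    (0, 4): 2.5, (4, 4): 2.5, (8, 4): 3.0, (12, 4): 3.0,
}

total, gops = best_tree(0, 16, 4, lambda s, n: toy_costs[(s, n)])
print(total, gops)
```

With these toy costs the first eight frames stay as one GOP of size 8, while the last eight are split into two GOPs of size 4, because only that split lowers the summed cost.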
DOI
10.1109/ACCESS.2023.3249284
Appears in Collections:
College of Engineering > Electronic and Electrical Engineering > Journal papers