Bitrate Allocation Mechanism for H.264 Temporally Scalable Video Coding Based on Smooth Subjective Visual Quality

Abstract

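H.264 scalable video coding (SVC) builds on H.264/AVC and provides three kinds of scalability: temporal, spatial, and quality. This lets a system adapt to each user's software, hardware, and network conditions without transcoding. Temporal scalability makes video data at several frame rates available at once, so that users can switch among them. With the conventionally recommended quantization parameter (QP) settings for the H.264/SVC temporal layers, switching frame rates causes a considerable gap in visual quality. In practice, therefore, how to allocate bitrate efficiently among temporal layers with different frame rates, so as to narrow the subjective visual quality gap between two consecutive layers, is an important issue.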

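This thesis proposes a temporal-layer bitrate allocation mechanism for H.264 scalable video coding that takes the user's subjective visual perception into account when network resources are limited. When bandwidth variation forces a switch in frame rate, the proposed method reduces the difference in subjective video quality between frame rates. Experimental results show that the proposed method allocates bitrate efficiently so that the temporal layers at each frame rate reach closer subjective visual quality, and that it performs well for different videos under different bandwidth constraints. Compared with the conventionally recommended QP settings, it reduces the subjective visual quality difference from 4.03 dB to 2.8 dB in the best case.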

H.264/SVC Rate Allocation Based on Graceful Degradation of Subjective Quality in Frame Rate Switching

ABSTRACT

The scalable extension of H.264 (SVC), which builds on H.264/AVC, is the most recent scalable video coding standard. H.264/SVC incorporates temporal, spatial, and SNR scalability, so it not only achieves high compression efficiency but also lets the encoded stream be adapted to heterogeneous user and network environments without transcoding. Temporal scalability supports multiple display frame rates over a wide range of bitrates. When the JVT-recommended QP setting is adopted for the H.264/SVC temporal layers, a large subjective quality gap appears between layers when the frame rate is switched. How to allocate a given total bitrate efficiently among multiple temporal layers so as to reduce this difference in subjective quality is therefore an important issue.

This thesis proposes a rate allocation method for SVC temporal scalability based on a perceptual quality metric. The proposed method degrades video quality gracefully when the frame rate is switched under bandwidth fluctuation. We use a subjective quality metric, instead of the conventional objective measure PSNR, to evaluate video quality. Each temporal layer is evaluated with the subjective quality metric and allocated a corresponding rate so that subjective quality is closer across frame rates. In simulations, several video sequences are tested under various total rate constraints. The proposed method allocates the rate for each temporal layer efficiently, yielding closer subjective video quality across layers when bandwidth is insufficient. Compared with the JVT-recommended method, the difference in subjective quality is reduced from 4.03 dB to 2.8 dB in the best case.
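
To make the notions of temporal layers and the quality gap between consecutive layers concrete, the following is a minimal Python sketch and not the allocation method of this thesis. It assumes a dyadic temporal hierarchy in which each additional layer doubles the decoded frame rate, and it assumes the gap is measured as the largest absolute difference between the quality scores of adjacent layers. The function names, the 30 fps and four-layer configuration, and the quality scores are all hypothetical values chosen for illustration.

```python
def temporal_layer_frame_rates(full_frame_rate, num_layers):
    """Frame rate obtained when decoding up to each temporal layer.

    Assumes a dyadic hierarchy: layer 0 is the base layer, and every
    additional layer doubles the frame rate up to full_frame_rate.
    """
    return [full_frame_rate / 2 ** (num_layers - 1 - t) for t in range(num_layers)]


def max_adjacent_gap(quality_per_layer):
    """Largest absolute quality difference between consecutive layers.

    quality_per_layer holds one score per temporal layer, ordered from
    the base layer upward (illustrative inputs, not measured data).
    """
    pairs = zip(quality_per_layer, quality_per_layer[1:])
    return max(abs(low - high) for low, high in pairs)


# Hypothetical configuration: 30 fps source, four temporal layers.
frame_rates = temporal_layer_frame_rates(30.0, 4)

# Made-up quality scores, one per layer, used only to exercise the helper.
example_scores = [30.0, 32.5, 33.0, 33.5]
gap = max_adjacent_gap(example_scores)

for layer, rate in enumerate(frame_rates):
    print(f"layer {layer}: {rate} fps")
print(f"largest gap between adjacent layers: {gap}")
```

Under these assumptions, a rate allocation that narrows the value returned by `max_adjacent_gap` for a fixed total bitrate corresponds to the smoother frame rate switching that the abstract describes.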