A High-Quality Audio Compression System Using Hybrid Wavelet Packet and
Discrete Cosine Transform with Optimum Bit Allocation

Abstract

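Signal compression techniques based on wavelet subband decomposition have been widely applied in audio and video coding systems, and their superior performance is most evident in still-image compression. This thesis proposes an audio compression system that combines wavelet and discrete cosine transforms. Wavelet packet decomposition splits the music signal into 26 subbands through a filter bank, and the flatness of each subband in the time and frequency domains then determines whether a discrete cosine transform is additionally applied. The system also adopts an optimum bit allocation algorithm for nonideal synthesis filters, which converts the frequency-domain minimum masking threshold obtained from the psychoacoustic model into a masking threshold in the wavelet domain, providing a precise quantization criterion. Uniform quantizers guided by the wavelet-domain masking threshold then reduce the data volume substantially while preserving very high audio quality. Finally, arithmetic coding entropy-codes the quantized coefficients, which are packed into a bitstream. Experimental results show that the system needs only 52 kbps to match the quality of MP3 at 64 kbps. Moreover, at the same bit rate of 64 kbps, its quality not only exceeds that of MP3 and the AAC Low Complexity profile but also surpasses that of the AAC High Efficiency profile.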

Hybrid Wavelet Packet and Discrete Cosine Transform with Optimum Bit Allocation Applied to
High-Quality Audio Coding

Abstract

Wavelet filter bank analysis-synthesis techniques have been widely applied in many areas of digital signal processing, especially image and video coding. In this thesis, we propose a hybrid wavelet packet (WP) and discrete cosine transform (DCT) audio compression system. It divides the audio signal into 26 subbands via wavelet packet analysis and selectively performs a DCT in each subband according to that subband's time-domain and frequency-domain flatness measures. The proposed coder adopts optimum bit allocation with nonideal reconstruction filters, transforming the frequency-domain minimum masking threshold obtained from the psychoacoustic model into a masking threshold in the wavelet domain. The WP or DCT coefficients are then quantized with uniform quantizers according to this masking threshold, which reduces the data rate while preserving high quality. Finally, the quantized coefficients are encoded with arithmetic coding and encapsulated together with the side information. The experiments show that the proposed audio coder needs only 52 kbps to achieve the quality of MP3 at 64 kbps. At the same bit rate of 64 kbps, the proposed system provides better quality than not only MP3 and the AAC LC profile but also the AAC HE profile.
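
As an illustration of the per-subband processing summarized above, the following Python sketch shows one way a flatness-based transform decision and a masking-driven uniform quantizer could fit together. It is not the thesis implementation. The function names, the flatness definition (geometric mean over arithmetic mean of the squared samples), the rule "use the DCT when the DCT-domain energy is less flat than the time-domain energy", and the step size chosen so that the uniform quantization noise power (step squared over 12) equals the masking power are all assumptions made for this example.

```python
import math

def flatness(power):
    """Geometric mean over arithmetic mean of a power sequence, in (0, 1].

    Values near 1 mean the energy is spread evenly; values near 0 mean it
    is concentrated in a few samples. (Assumed definition, for illustration.)
    """
    eps = 1e-12  # avoids log(0) for exactly zero samples
    vals = [p + eps for p in power]
    log_mean = sum(math.log(v) for v in vals) / len(vals)
    return math.exp(log_mean) / (sum(vals) / len(vals))

def dct2(x):
    """Orthonormal DCT-II, computed directly in O(N^2)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def choose_transform(subband):
    """Hypothetical decision rule: keep whichever representation has the
    more concentrated (less flat) energy distribution."""
    coeffs = dct2(subband)
    time_flat = flatness([v * v for v in subband])
    freq_flat = flatness([c * c for c in coeffs])
    if freq_flat < time_flat:
        return "dct", coeffs
    return "wp", list(subband)

def quantize(coeffs, mask_power):
    """Uniform quantizer whose noise power, step**2 / 12, is set equal to
    the masking power (assumed to be given per coefficient)."""
    step = math.sqrt(12.0 * mask_power)
    return [round(c / step) for c in coeffs], step

def dequantize(indices, step):
    """Map quantizer indices back to reconstruction levels."""
    return [q * step for q in indices]

# A steady tone concentrates its energy in the DCT domain,
# while a single click is already compact in time.
tone = [math.cos(math.pi * (i + 0.5) * 5 / 32) for i in range(32)]
click = [1.0 if i == 10 else 0.0 for i in range(32)]
tone_choice = choose_transform(tone)[0]
click_choice = choose_transform(click)[0]
```

Under these assumptions the tone is routed through the DCT and the click keeps its wavelet packet coefficients. A real coder would derive `mask_power` from the wavelet-domain masking threshold and the bit allocation described in the abstract rather than from a fixed number.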