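High-Quality Audio Coding Technique Using Switched
Discrete Cosine and Wavelet Packet Transforms

Abstract

  Transform coding has long been used in audio processing systems, and in recent years the most popular subband coding approach has been the wavelet transform, whose multiresolution property allows a wider choice of subband decomposition structures. This thesis proposes a hybrid high-quality audio compression system. For each audio frame, the system first uses a frequency-domain flatness measure to decide whether the discrete cosine transform (DCT), which offers better frequency resolution, or the wavelet transform, which retains rich temporal information, serves as the frame's primary transform. For frames coded with the DCT, the coefficients are quantized by a nonlinear quantizer according to the frequency-domain masking threshold computed from a psychoacoustic model. If wavelet packet decomposition is selected, the music signal passes through a wavelet filter bank and is split into 26 fixed subbands; within each subband, a DCT is then selectively applied, based on flatness measures in the wavelet and frequency domains, to improve frequency resolution. Together with an optimal bit-allocation algorithm for non-ideal filter banks, the frequency-domain auditory masking curve is converted into a masking curve in the wavelet domain, providing an accurate quantization criterion while preserving very high audio quality. Finally, the quantized coefficients are entropy coded and packed into the bitstream. Experimental results show that at a bit rate of 64 kbps the proposed system delivers audio quality that is not only better than MP3 but also exceeds that of the AAC Low Complexity profile.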


High Quality Switched Discrete Cosine
Transform and Wavelet Packet
Audio Coding Technique


Abstract

    We propose a hybrid audio coding system that combines wavelet packet (WP) and discrete cosine transform (DCT) techniques. For each audio frame, the system selects either WP or DCT as the primary transform based on frame flatness measures in the wavelet and frequency domains. If DCT is chosen, all DCT coefficients are quantized by a non-uniform quantizer according to the frequency masking curve. If WP is chosen, the frame data are decomposed into 26 fixed subbands, and the system then selectively applies a DCT within each subband, based on the subband flatness measure, to improve its frequency resolution. By applying an optimal bit-allocation algorithm for non-ideal filter banks, the masking threshold from the psychoacoustic model is translated into quantization criteria in the wavelet domain. Experimental results show that the proposed system delivers better audio quality than MP3 and the AAC LC profile at 64 kbps.
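
    The per-frame switching decision can be illustrated with a minimal sketch. It assumes the common definition of spectral flatness (geometric mean over arithmetic mean of the power spectrum); the abstract does not give the exact measure, the threshold, or the direction of the decision, so the function names, the threshold of 0.3, and the rule "peaky spectrum selects DCT, flat spectrum selects WP" are all hypothetical choices made for illustration only.

```python
import math

def dct_ii(frame):
    """Unnormalized DCT-II of a real sequence (O(N^2), for illustration only)."""
    n = len(frame)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(frame))
        for k in range(n)
    ]

def spectral_flatness(coeffs, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum, in (0, 1]."""
    power = [c * c + eps for c in coeffs]  # eps avoids log(0) on empty bins
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))

def choose_transform(frame, threshold=0.3):
    """Assumed rule (not from the source): low flatness -> DCT, otherwise WP."""
    flatness = spectral_flatness(dct_ii(frame))
    return ("DCT" if flatness < threshold else "WP"), flatness

# A stationary tone (energy in one DCT bin) and a click (energy spread over all bins).
tone = [math.cos(math.pi * (i + 0.5) * 8 / 64) for i in range(64)]
click = [1.0] + [0.0] * 63
print(choose_transform(tone)[0], choose_transform(click)[0])
```

    Under these assumptions the tone, whose energy sits in a single coefficient, falls on the DCT side, while the click, whose energy is spread across the spectrum, falls on the WP side, where time resolution matters more.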