Audio Compression Using Wavelet Packets and a Zero-Tree Coder with Psychoacoustic Modeling

Abstract

The wavelet filter bank analysis-synthesis technique has been widely applied in many areas of digital signal processing, including audio and video coding, and embedded zero-tree wavelet (EZW) coding has shown strong performance in progressive image coding. In this work, we focus on high-quality audio coding that delivers perceptually transparent quality. Each audio segment is decomposed into 29 subbands by wavelet packet analysis and then coded by a zero-tree coder whose algorithm is modified to use the minimum masking thresholds generated by a psychoacoustic model. Subjective listening tests show that the proposed Masking-Embedded Zero-tree Wavelet Packet (M-EZWP) system performs better than the MPEG audio Layer II standard, especially at very low bit rates. Perceptually transparent quality for monophonic audio can be achieved at about 40 kbps. Furthermore, because its bitstream is embedded, the M-EZWP system can adapt to various network conditions, such as variable-bit-rate (VBR) and constant-bit-rate (CBR) transmission.
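As a rough illustration of the wavelet packet analysis step, the following minimal sketch splits one audio segment into subbands by recursively filtering both the low-pass and the high-pass branches. It is not the M-EZWP decomposition itself: the Haar filter pair, the uniform depth-3 tree (8 subbands rather than 29), the 64-sample test frame, and the helper names `haar_split`, `wavelet_packet`, and `energy` are all assumptions chosen for brevity.

```python
import math


def haar_split(x):
    """Split an even-length sequence into orthonormal Haar low/high half-bands."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return low, high


def wavelet_packet(x, depth):
    """Full wavelet packet tree: every band, low AND high, is split at each level."""
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    # 2**depth bands in natural tree order (not sorted by frequency).
    return bands


def energy(seq):
    """Sum of squared samples."""
    return sum(v * v for v in seq)


# Hypothetical test segment: 64 samples of a sinusoid (assumed, not from the paper).
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
bands = wavelet_packet(frame, depth=3)

# An orthonormal filter pair preserves energy across the decomposition.
total_in = energy(frame)
total_out = sum(energy(b) for b in bands)
```

A non-uniform tree, such as one yielding 29 subbands, would split only selected branches further instead of all of them; the per-band coefficients would then be passed to the zero-tree coder together with the masking thresholds.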