

# Complexity Control for H.264 Video Encoding over Power-Scalable Embedded Systems

Ming-Chen Chien\*,†, Jhen-Yuan Huang\*, and Pao-Chi Chang\*, *Member, IEEE*

\*Dept. Electrical Engineering, National Central Univ., Jhongli, Taiwan

†Dept. Electrical Engineering, Chin Min Institute of Technology, Miaoli, Taiwan

**Abstract**— This work proposes a complexity-control algorithm that efficiently utilizes the encoding tools of an H.264 video coding system under different power states. Experiments performed on a power-scalable embedded system<sup>1</sup> show good rate-distortion performance under various power constraints. Specifically, the power consumption can be adjusted between about 77% and 100% with a PSNR degradation within 1.08 dB at the same bit rate.

**Keywords**— *H.264 video coding; complexity control; embedded system*

## I. INTRODUCTION

With the rapid advance of wireless communication technology, video applications such as video conferencing and video recording on portable devices are expected to become mainstream. However, the allowable computational complexity of real-time video encoding on a portable device is generally limited. Furthermore, power saving has become an important concern [1]. With Dynamic Voltage Scaling (DVS) technology [2], the power $P$ of a system can be adjusted for efficient use. The allowable complexity $C$ also changes with the power supply, as stated in (1) [3]. Figure 1 presents an example [4] of a power-scalable system in which the power consumption increases almost linearly with the clock frequency.

$$C = f(P) \quad (1)$$

To achieve a high compression ratio, H.264 video coding utilizes many encoding tools, such as variable block sizes and multiple pixel accuracies for motion estimation (ME), variable block sizes for intra prediction, DCT, quantization, entropy coding, and a deblocking filter [5]. However, the full-scale encoder has a large computational complexity and hence may not be suitable for a complexity-constrained system. For a given power status $P$ and data rate $R$, complexity control aims to maximize the video quality measurement ($QM$) under the complexity constraint:

$$\max \{QM(C, R)\} \quad s.t. \quad C = f(P) \quad (2)$$

A few works on complexity control have been conducted [3],[6],[7],[8]. The first complexity-rate-distortion (C-R-D) model, which formulates the complexity allocation among encoding tools, was proposed in [3]. However, this optimization formula is too complicated to be solved in closed form, and finding the solution by a global search requires a large computational overhead.

Fig. 1 Power consumption of a power-scalable system

A statistically optimal operation mode for a sequence in a complexity-constrained H.263 video encoding system has also been proposed [6]. However, the encoding tools of an H.263 video encoder differ considerably from those of the modern H.264 video encoder. A complexity allocation method for ME based on the cost-complexity curve has been proposed [7]. A C-R-D optimization for H.264 ME has also been proposed [8]; it uses two Lagrange multipliers to terminate complexity-inefficient ME rounds and thus increase coding efficiency. To the best of our knowledge, no practical and efficient complexity control exists for an H.264 video encoder that operates in real time on a portable device.

This work considers the practical case of a software-optimized H.264 video encoder running on a power-scalable portable system. Since some encoding tools have high coding efficiency while others have relatively poor coding efficiency, this work proposes an algorithm that efficiently utilizes the encoding tools of H.264 video coding to achieve the maximum $QM$ under different power states.

The rest of the paper is organized as follows. We analyze the coding efficiency of each encoding tool on an embedded system in Section II. In Section III we propose a power-scalable encoding algorithm. Experimental results are given in Section IV, and we conclude in Section V.

<sup>1</sup> The Intel XScale processor is commercially available as the PXA270.

## II. ANALYSIS OF CODING EFFICIENCY

Coding efficiency analysis is performed by running a software-optimized H.264 video encoder, x264 [9], on a power-scalable embedded system<sup>1</sup> for several video sequences. The options for the analysis are listed in Table I. Since the Sum of Absolute Differences (SAD) function is frequently used in ME, this work optimizes the SAD computation with Single Instruction Multiple Data (SIMD) technology [4].

Figure 2 shows the complexity profile of the encoding tools of the H.264 encoder, where interpolation is an essential part of ME at fractional pixel accuracy. ME and mode decision account for most of the computational complexity. DCT is associated with low computational complexity because its operation is simplified in an H.264 encoder. To design a complexity-scalable H.264 encoder, we can therefore focus on analyzing ME and dividing this encoding tool into more options.

Four variables, s1, s2, s3, and s4, are adopted [6] to analyze the coding efficiency of the various encoding tools. Their meanings are as follows:

- s1: type of entropy coding (B – CABAC; V – CAVLC)
- s2: use de-blocking filter or not (1 – use; 0 – not use)
- s3: pixel accuracy of ME (2 – quarter; 1 – half; 0 – full)
- s4: available block sizes for ME (2 – all; 1 – 16x16, 16x8, 8x16, 8x8; 0 – 16x16)
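As an illustration of this notation, the following minimal Python sketch decodes a four-character operation-mode label (s1 s2 s3 s4) into the tool settings listed above. The names `OperationMode` and `parse_mode` are hypothetical and are not part of x264 or of the original experiments.

```python
from dataclasses import dataclass

# Value tables taken directly from the definitions of s1, s3, and s4 above.
ENTROPY = {"B": "CABAC", "V": "CAVLC"}
PIXEL_ACCURACY = {0: "full", 1: "half", 2: "quarter"}
BLOCK_SIZES = {
    0: ("16x16",),
    1: ("16x16", "16x8", "8x16", "8x8"),
    2: ("all",),
}


@dataclass(frozen=True)
class OperationMode:
    """Hypothetical container for one combination of (s1, s2, s3, s4)."""
    entropy_coding: str
    deblocking: bool
    me_pixel_accuracy: str
    me_block_sizes: tuple


def parse_mode(label: str) -> OperationMode:
    """Decode a label such as 'B100' into its tool settings."""
    if len(label) != 4:
        raise ValueError(f"expected 4 characters, got {label!r}")
    s1 = label[0]
    s2, s3, s4 = (int(ch) for ch in label[1:])  # raises ValueError on non-digits
    if (s1 not in ENTROPY or s2 not in (0, 1)
            or s3 not in PIXEL_ACCURACY or s4 not in BLOCK_SIZES):
        raise ValueError(f"invalid operation mode {label!r}")
    return OperationMode(ENTROPY[s1], bool(s2), PIXEL_ACCURACY[s3], BLOCK_SIZES[s4])


if __name__ == "__main__":
    print(parse_mode("B100"))
```

Under this reading, `parse_mode("B100")` describes CABAC with the deblocking filter enabled, full-pixel ME, and the 16x16 block size only.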

TABLE I.

THE ENCODING OPTIONS FOR ANALYSIS OF CODING GAIN

| Video source (QCIF)        | Stefan, Foreman, Susie, Akiyo |
|----------------------------|-------------------------------|
| Fast ME                    | Diamond                       |
| Target rate (kbps)         | 50, 100, 200                  |
| Frame rate                 | 15                            |
| Number of reference frames | 1                             |
| GOP type                   | IPPP                          |
| RAM                        | 64M bytes                     |
| Source code                | X264 v655                     |
| MMX tech.                  | for SAD computation           |
| Platform                   | PXA270                        |



Fig. 2 Complexity profile of x264



Fig. 3. Results of coding efficiency analysis: Foreman at bit rate 200kbps



Fig. 4. Results of coding efficiency analysis: Stefan at bit rate 200kbps



Fig. 5. Results of coding efficiency analysis: Susie at bit rate 200kbps



Fig. 6. Results of coding efficiency analysis: Akiyo at bit rate 200kbps

Based on the different operation modes, the results of the complexity analysis for the different sequences are plotted in Figs. 3-6, where each symbol on the chart represents a combination of the variables (s1 s2 s3 s4). For example, B100 represents CABAC, the deblocking filter, full-pixel ME, and the 16x16 block size. The dots represent the possible combinations of encoding tools. The line marks the path with the highest coding efficiency at 200 kbps.

## III. CODING-LEVEL BASED COMPLEXITY CONTROL

According to the analysis of coding efficiency, this work organizes the utilization of encoding tools into eight coding levels (*CL*), as Table II shows. *CL* 1, which includes the 16x16 block size for ME with full pixel accuracy, 16x16 intra prediction, DCT, quantization, and CAVLC, is the level associated with the lowest complexity. More encoding tools are used at higher *CL*; thus *CL* 8 uses all encoding tools and consumes the most power. Experimental results under various power states and data rates are plotted in Fig. 7, which can equivalently be listed in a *CL* vs. power and rate (*CLPR*) table, as shown in Table II.

The video quality increases with the frame rate. However, the computational complexity of H.264 video encoding is also proportional to the frame rate [3]. As described above, higher computational complexity consumes more power. For the power-constrained case, selecting an optimal frame rate is therefore an important issue. A video quality metric that considers the frame rate has been proposed [10] as

$$QM = PSNR + a \cdot m^b (30 - FR) \quad (3)$$

where *a* is equal to 0.986, *b* is equal to 0.378, *FR* denotes the frame rate, *m* represents the motion-speed parameter, and *PSNR* denotes the average PSNR with the skipped frames replaced by repetitions of the previous frame. Fig. 8 shows the video quality encoded with various *CL* versus power consumption at different frame rates. According to Fig. 8, the optimal frame rate and the optimal *CL* can be determined for a given power constraint. For example, if the power constraint is 1200 mW, the optimal frame rate is 15 and the optimal *CL* is *CL* 1. Therefore, an *FCLPR* table can be built to determine the optimal frame rate and the optimal *CL* under various power constraints at different bit rates.
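The quality metric of [10] can be evaluated directly from its definition. The sketch below is a minimal Python rendering of that formula with the constants *a* = 0.986 and *b* = 0.378 given in the text; the function name `quality_metric` and the sample inputs are illustrative assumptions, not measurements from the experiments.

```python
A = 0.986                  # constant a from the text
B = 0.378                  # constant b from the text
REFERENCE_FRAME_RATE = 30  # the constant 30 in the formula


def quality_metric(psnr_db: float, motion: float, frame_rate: float) -> float:
    """Return QM = PSNR + a * m^b * (30 - FR).

    psnr_db    : average PSNR, with skipped frames replaced by repeating
                 the previous frame
    motion     : motion-speed parameter m (assumed non-negative here)
    frame_rate : encoded frame rate FR in frames per second
    """
    if motion < 0:
        raise ValueError("motion parameter must be non-negative")
    return psnr_db + A * motion ** B * (REFERENCE_FRAME_RATE - frame_rate)


if __name__ == "__main__":
    # Illustrative inputs only: PSNR = 30 dB, m = 1, FR = 15 fps.
    print(round(quality_metric(30.0, 1.0, 15), 2))
```

At the full frame rate of 30 the correction term vanishes and *QM* reduces to the PSNR; at lower frame rates the term grows with the motion parameter *m*.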

The strategy for complexity control can be summarized by the following algorithm:

- Step 1: detect battery status.
- Step 2: use dynamic voltage scaling (DVS) to select CPU clock frequency.
- Step 3: select the optimal frame rate and the optimal coding level (*CL*) by looking up *FCLPR* table.

This algorithm should be repeated every few minutes until the encoding job ends or the battery runs out.
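The three steps can be sketched as a single control iteration in Python. This is only an outline under stated assumptions: the table layout, the function names `lookup_fclpr` and `control_step`, the battery and DVS callbacks, and every number in `EXAMPLE_FCLPR` are hypothetical placeholders. Only the 1200 mW row mirrors the example given in the text, and the bit rate it is filed under is assumed.

```python
from bisect import bisect_right

# PLACEHOLDER table, not measured data. Layout (assumed):
# {bit rate in kbps: [(minimum power in mW, frame rate, coding level), ...]}
# with rows sorted by minimum power.
EXAMPLE_FCLPR = {
    200: [(1000, 10, 1), (1200, 15, 1), (1500, 15, 4), (1800, 30, 8)],
}


def lookup_fclpr(table, bitrate_kbps, power_mw):
    """Step 3: return (frame_rate, CL) for the largest row within the power budget."""
    rows = table[bitrate_kbps]
    thresholds = [row[0] for row in rows]
    idx = bisect_right(thresholds, power_mw) - 1
    if idx < 0:
        raise ValueError("power budget is below the lowest table entry")
    _, frame_rate, coding_level = rows[idx]
    return frame_rate, coding_level


def control_step(read_battery_level, select_power_mw, table, bitrate_kbps):
    """Run one iteration of the three-step strategy.

    read_battery_level : callable returning the battery status (Step 1)
    select_power_mw    : callable mapping battery status to a power budget,
                         standing in for the DVS clock selection (Step 2)
    """
    level = read_battery_level()
    power_mw = select_power_mw(level)
    return lookup_fclpr(table, bitrate_kbps, power_mw)


if __name__ == "__main__":
    # Hypothetical policy: the power budget scales linearly with battery level.
    result = control_step(lambda: 0.5, lambda lvl: 1000 + 1000 * lvl,
                          EXAMPLE_FCLPR, 200)
    print(result)
```

In a real encoder the caller would invoke `control_step` periodically and apply the returned frame rate and *CL* to the encoder configuration.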

TABLE II.  
THE ORDER OF THE LIST OF ENCODING TOOLS,  
FOREMAN AT BIT RATE: 200KBPS

| <i>CL</i> | Mode transition    | Complexity of a frame (MHz) | Encoding tool         |
|-----------|--------------------|-----------------------------|-----------------------|
| 1         | V000               | 57.71                       | CAVLC                 |
| 2         | V000→V100          | 60.19                       | CL1+Deblocking Filter |
| 3         | V100→B100          | 65.24                       | CL2 with CABAC replacing CAVLC |
| 4         | B100→B101          | 75.36                       | CL3+8x8 Partition     |
| 5         | B101→B111          | 89.35                       | CL4+Half Pixel ME     |
| 6         | B111→B121          | 91.07                       | CL5+Quarter Pixel ME  |
| 7         | B121→B122          | 98.86                       | CL6+Sub8x8 Partition  |
| 8         | B122 with Intra4x4 | 100.19                      | CL7+Intra 4x4         |
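To show how the ordering in Table II might be used, the following Python sketch picks the highest coding level whose per-frame complexity fits a given budget. The complexity values are copied from Table II (Foreman, 200 kbps); the function name `highest_level_within` and the assumption that a higher *CL* is always preferable when it fits are illustrative simplifications, not the lookup procedure used in the experiments.

```python
# (coding level, operation mode, complexity of a frame in MHz) from Table II.
CODING_LEVELS = [
    (1, "V000", 57.71),
    (2, "V100", 60.19),
    (3, "B100", 65.24),
    (4, "B101", 75.36),
    (5, "B111", 89.35),
    (6, "B121", 91.07),
    (7, "B122", 98.86),
    (8, "B122 with Intra4x4", 100.19),
]


def highest_level_within(budget_mhz: float):
    """Return (CL, mode) of the highest level that fits the budget, or None."""
    best = None
    for level, mode, complexity in CODING_LEVELS:
        # Levels are listed in increasing complexity, so the last one
        # that fits is the highest affordable level.
        if complexity <= budget_mhz:
            best = (level, mode)
    return best


if __name__ == "__main__":
    print(highest_level_within(80.0))
```

With a budget of 80 MHz this selects *CL* 4 (mode B101), since *CL* 5 would require 89.35 MHz.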



Fig. 7. Video quality vs. power consumption (Foreman)



Fig. 8. Video quality vs. power consumption at different frame rate

## IV. EXPERIMENTAL RESULTS

The experimental environment is listed in Table I. Figure 9 shows the quality for Foreman at a bit rate of 50 kbps; the power consumption can be adjusted between about 77% and 100% with a PSNR degradation within 1.08 dB. A different $CL$ is chosen depending on the power status of the battery. While excellent video quality is obtained with $CL$ 8 at full battery status, the video quality at low battery status with $CL$ 1 is still acceptable.

## V. CONCLUSION

This paper proposes a power-control algorithm to efficiently utilize encoding tools of H.264 video codecs under different power status and data rates. This algorithm provides the flexibility to extend the battery life substantially while maintaining acceptable video quality.

## REFERENCES

- [1] C. J. Lian, S. Y. Chien, C. P. Lin, P. C. Tseng, and L. G. Chen, "Power-Aware Multimedia: Concepts and Design Perspectives," *IEEE Circuits and Systems Magazine*, vol. 7, no. 2, 2007.
- [2] R. Min, T. Furrer, and A. Chandrakasan, "Dynamic voltage scaling techniques for distributed microsensor networks," *Proc. IEEE Computer Society Workshop VLSI (WVLSI'00)*, pp. 43–46, Apr. 2000.
- [3] Z. He and Y. F. Liang, "Power-Rate-Distortion analysis for wireless video communication under energy constraints," *IEEE Trans. Circuits Syst. Video Tech.*, vol. 15, no. 5, pp. 645-658, May 2005.
- [4] Intel Data Sheet, *Intel PXA270 Processor, Electrical, Mechanical, and Thermal Specification*, 2005.
- [5] ISO/IEC ITU-T Rec. H.264: Advanced Video Coding for Generic Audiovisual Services, Joint Video Team (JVT) of ISO-IEC MPEG & ITU-T VCEG, Int. Standard, May 2003.
- [6] D. N. Kwon and P. F. Driessens, "Performance and computational optimization in configurable hybrid video system," *IEEE Trans. Circuits Syst. Video Tech.*, vol. 16, no. 1, pp. 31-42, Jan. 2006.
- [7] C. Kim and J. Xin, "Hierarchical complexity control of motion estimation for H.264/AVC," *Mitsubishi Electric Research Laboratories*, TR2006-004, Dec. 2006. Available: <http://www.merl.com>.
- [8] Y. Hu, Q. Li, S. Ma, and C. C. J. Kuo, "Joint rate-distortion-complexity optimization for H.264 motion search," in *Proc. ICME 2006*, pp. 1949-1952.
- [9] x264, available at <http://www.videolan.org/developers/x264.html>
- [10] R. Feghali, F. Speranza, D. Wang, and A. Vincent, "Video Quality Metric for Bit Rate Control via Joint Adjustment of Quantization and Frame Rate," *IEEE Transactions on Broadcasting*, vol. 53, no. 1, pp. 441-446, Mar. 2007.



Fig. 9. Configurable encoder, Foreman at bit rate 50 kbps: (a) full power, (b) 3/4 power, (c) half power, (d) 1/4 power.