An optimized FPGA design of inverse quantization and transform for HEVC decoding blocks and validation in an SW/HW environment



Ben Atitallah A., Kammoun M., Ben Atitallah R.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.28, no.3, pp.1656-1672, 2020 (SCI-Expanded)

  • Publication Type: Article / Article
  • Volume: 28 Issue: 3
  • Publication Date: 2020
  • Doi Number: 10.3906/elk-1910-122
  • Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, TR DİZİN (ULAKBİM)
  • Page Numbers: pp.1656-1672
  • Galatasaray University Affiliated: Yes

Abstract

This paper presents an optimized hardware architecture for the inverse quantization and inverse transform (IQ/IT) of a High Efficiency Video Coding (HEVC) decoder. Our highly parallel and pipelined architecture was designed to support all HEVC transform unit (TU) sizes: 4 × 4, 8 × 8, 16 × 16, and 32 × 32. The IQ/IT was described in VHDL (VHSIC Hardware Description Language) and synthesized for the Xilinx XC7Z020 field-programmable gate array (FPGA) and for the TSMC 180 nm standard-cell library. In the worst case, the architecture sustains a processing rate of 1080p at 33 fps at 146 MHz when mapped to the FPGA, and 1080p at 25 fps at 110 MHz when mapped to standard cells. The architecture was validated on the ZC702 platform in a software/hardware (SW/HW) environment in order to compare the implementation methods (SW only and SW/HW) in terms of power consumption and run time. The experimental results show that the SW/HW implementations improve run-time speed by more than 70% relative to the SW solution. In addition, the power consumption of the SW/HW designs is nearly 60% lower than in the SW case.
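
The two worst-case throughput figures in the abstract can be related to a per-sample cycle budget with a short calculation. The Python sketch below is illustrative only and is not taken from the paper: it assumes a 1920 × 1080 frame and 4:2:0 chroma subsampling (1.5 samples per pixel), neither of which is stated in the abstract, and the function name `cycles_per_sample` is hypothetical.

```python
def cycles_per_sample(clock_hz, fps, width=1920, height=1080, samples_per_pixel=1.5):
    """Return the average number of clock cycles available per video sample.

    Assumptions (not from the abstract): a 1920x1080 frame and 4:2:0 chroma
    subsampling, i.e. 1.5 luma+chroma samples per pixel.
    """
    samples_per_second = width * height * samples_per_pixel * fps
    return clock_hz / samples_per_second


# Operating points quoted in the abstract.
fpga_budget = cycles_per_sample(clock_hz=146e6, fps=33)  # Xilinx XC7Z020
asic_budget = cycles_per_sample(clock_hz=110e6, fps=25)  # TSMC 180 nm standard cells

print(f"FPGA:           {fpga_budget:.2f} cycles per sample")
print(f"Standard cells: {asic_budget:.2f} cycles per sample")
```

Under these assumptions both operating points come out at roughly 1.4 cycles per sample (about 1.42 on the FPGA and 1.41 on standard cells), which suggests the two frame rates differ mainly because of the achievable clock frequency rather than the architecture.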