Improved SIMD Architecture For High Performance Video Processors
Abstract – SIMD execution is an efficient way to exploit the data-level parallelism in image and video
applications. However, SIMD execution bottlenecks must be addressed to achieve high execution efficiency. In this
paper, we first analyze the implementation of two major H.264/AVC kernel functions, namely the sum of absolute
transformed differences (SATD) and sub-pixel interpolation, on conventional SIMD architectures to identify the
bottlenecks in traditional approaches. Based on this analysis, we propose a new SIMD architecture with two novel
features: (1) a parallel memory structure that supports variable block sizes and word lengths; and (2) a
configurable SIMD structure. The parallel memory structure lets programmers access data in different block sizes
and with different word lengths. The configurable SIMD structure allows almost “random” register file access and
lets the ALUs inside the SIMD unit perform slightly different operations. These features greatly benefit the
realization of H.264/AVC kernel functions. For instance, fractional motion estimation, particularly the half- to
quarter-pixel interpolation, can now be executed with minimal or no additional memory access. Compared with
conventional SIMD systems, the proposed SIMD architecture achieves a further speedup of 2.1X to 4.6X on H.264/AVC
kernel functions. Based on Amdahl’s law, the overall speedup of the H.264/AVC encoding application is projected to
be 2.46X. We expect that significant improvement can also be achieved when the proposed architecture is applied to
other image and video processing applications.
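For readers unfamiliar with the projection step, Amdahl's law for several accelerated kernels can be written as below. This is a generic statement of the law, not the authors' exact calculation. The symbols f_i (fraction of baseline encoding time spent in kernel i) and s_i (speedup of kernel i) are introduced here for illustration, because the abstract does not give the execution-time profile.

```latex
S_{\text{overall}} = \frac{1}{\left(1 - \sum_i f_i\right) + \sum_i \dfrac{f_i}{s_i}},
\qquad 0 \le \sum_i f_i \le 1, \quad s_i \ge 1
```

Whatever the kernel speedups s_i are, the overall speedup cannot exceed 1 / (1 - \sum_i f_i). An overall figure can therefore be well below the largest per-kernel figure.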
Keywords – configurable SIMD, parallel memory structure, SIMD bottlenecks, video codec processor
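To make the SATD kernel named in the abstract concrete, the following scalar C sketch computes a 4x4 SATD: the residual between two blocks is passed through a 2-D Hadamard transform and the absolute values of the coefficients are summed. It is a minimal reference under stated assumptions, not the authors' implementation. The function name `satd4x4`, its parameters, and the choice to return the unnormalized sum (some encoders halve it) are illustrative.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference routine: 4x4 SATD of two 8-bit pixel blocks.
 * cur/ref point to the top-left pixel of each block; the strides are the
 * distances in bytes between vertically adjacent pixels.
 * Returns the unnormalized sum of absolute Hadamard coefficients. */
int satd4x4(const uint8_t *cur, int cur_stride,
            const uint8_t *ref, int ref_stride)
{
    int d[4][4];   /* residual block */
    int t[4][4];   /* after the horizontal transform */
    int sum = 0;

    /* Residual: current block minus reference block. */
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            d[y][x] = (int)cur[y * cur_stride + x] - (int)ref[y * ref_stride + x];

    /* Horizontal 4-point Hadamard butterflies, one row at a time. */
    for (int y = 0; y < 4; y++) {
        int a = d[y][0] + d[y][1];
        int b = d[y][0] - d[y][1];
        int c = d[y][2] + d[y][3];
        int e = d[y][2] - d[y][3];
        t[y][0] = a + c;
        t[y][1] = b + e;
        t[y][2] = a - c;
        t[y][3] = b - e;
    }

    /* Vertical butterflies, one column at a time, accumulating |coefficient|. */
    for (int x = 0; x < 4; x++) {
        int a = t[0][x] + t[1][x];
        int b = t[0][x] - t[1][x];
        int c = t[2][x] + t[3][x];
        int e = t[2][x] - t[3][x];
        sum += abs(a + c) + abs(b + e) + abs(a - c) + abs(b - e);
    }
    return sum;
}
```

This version only shows what the kernel computes. It does not model the parallel memory or the configurable SIMD structure proposed in the paper.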