Abstract
We present SpExSim, a software tool for quickly surveying legacy code bases for kernels that could be accelerated by FPGA-based compute units. We specifically aim for low development effort by considering C-based high-level hardware synthesis (HLS) instead of complex manual hardware designs. SpExSim not only exploits the spatially distributed model of computation commonly used on FPGAs, but can also model the effect of two different microarchitectures commonly used in C-to-hardware compilers, including pipelined architectures with modulo scheduling. The estimates have been validated against actual hardware generated by two current HLS tools.





Notes
\(\text{Spatial factor}=\frac{\text{Exec. time (sequential)}}{\text{Exec. time (blockwise)}}\).
\(\text{Pipeline factor}=\frac{\text{Exec. time (blockwise)}}{\text{Exec. time (pipeline)}}\).
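As a minimal sketch of how the two factors in these notes combine, the following Python snippet computes both ratios. The function names and the cycle counts are hypothetical and chosen only for illustration; they are not taken from SpExSim or from the paper's measurements.

```python
def spatial_factor(t_sequential: float, t_blockwise: float) -> float:
    """Ratio of sequential to block-wise execution time."""
    if t_blockwise <= 0:
        raise ValueError("block-wise execution time must be positive")
    return t_sequential / t_blockwise


def pipeline_factor(t_blockwise: float, t_pipeline: float) -> float:
    """Ratio of block-wise to pipelined execution time."""
    if t_pipeline <= 0:
        raise ValueError("pipelined execution time must be positive")
    return t_blockwise / t_pipeline


# Hypothetical execution times (e.g. in cycles) for a single kernel.
# These numbers are assumptions for illustration, not results from the paper.
t_seq, t_block, t_pipe = 1200.0, 400.0, 100.0

sf = spatial_factor(t_seq, t_block)
pf = pipeline_factor(t_block, t_pipe)

# By construction, the product of the two factors equals the ratio of
# sequential to pipelined execution time, since the block-wise time cancels.
overall = sf * pf

print(f"spatial factor:  {sf:.2f}")
print(f"pipeline factor: {pf:.2f}")
print(f"overall speedup: {overall:.2f}")
```

Because the block-wise time appears in the denominator of one factor and the numerator of the other, multiplying them gives the overall ratio of sequential to pipelined execution time.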
Additional information
This work was partially funded by the EU in the FP7 research project REPARA (ICT-609666).
About this article
Cite this article
Oppermann, J., Sommer, L. & Koch, A. SpExSim: assessing kernel suitability for C-based high-level hardware synthesis. J Supercomput 75, 4062–4077 (2019). https://doi.org/10.1007/s11227-017-2101-z
DOI: https://doi.org/10.1007/s11227-017-2101-z