2018 International Conference on Computing, Mathematics and Engineering Technologies – iCoMET 2018

FPGA and ARM Processor based Supercomputing

Wasim Akram¹, Tassadaq Hussain², Eduard Ayguade³
¹ ² Riphah International University Islamabad
² Unal Color of Education Research and Development Islamabad
³ Barcelona Supercomputing Center, Barcelona, Spain
¹ Wasimakram811@hotmail.com, ² tassadaq@ucerd.com

Abstract—Low-cost, low-power heterogeneous architecture platforms such as the Xilinx Zynq SoC combine a multi-core ARM processor with an FPGA accelerator for the acceleration of high performance computing applications. In this paper, we propose an FPGA and ARM processor based supercomputer system composed of five Zynq SoC compute-nodes. The designed system uses message passing interface libraries for communication between compute-nodes and AXI4-stream interfaces between the ARM processor and the FPGA inside a compute-node. An FIR filter application is used to test the performance of the system with and without FPGA accelerators. The results show that the performance of the ARM based supercomputer with FPGA accelerators is 8.56 times higher than that of a similar system without FPGA accelerators.

Keywords—Heterogeneous, Zynq SoC, Supercomputer

I. INTRODUCTION
With improvements in multi-core processor technology, the demand for application performance has also increased. Delivering high performance to an application requires more processing speed from multi-core processors, which increases power consumption. As power has become the main metric for modern high performance computing, researchers and system architects are proposing heterogeneous multi-core processing systems that combine a multi-core processor with hardware accelerators or co-processors. These accelerators improve the performance of a compute-intensive application by executing a certain task. Over the past few decades, FPGA-based accelerators have delivered considerable improvements in performance and power efficiency, which makes them attractive to the high performance computing world. Their higher performance per watt shows that FPGAs are able to compete with both superscalar processors and GPU accelerators, especially for high performance computing applications [1] [2].

ARM based servers built on embedded processors [3] have gained popularity in academia and industry due to their low cost and low power consumption compared to conventional processors. The computational capability of ARM embedded processors does not match that of x86 architecture processors in server environments, but according to recent research [4], their market share is expected to rise 25% in 2020. An ARM based server is favorable for applications that need high throughput rather than computing power.

The Zynq SoC device [5] is a heterogeneous computing platform built by Xilinx. It combines a multi-core ARM CPU with an FPGA accelerator. The FPGA accelerator is integrated into the SoC for low power acceleration. The ARM CPU and FPGA accelerator are connected by a high performance, high bandwidth AXI/ACP set of interfaces that allows interfacing with main memory [6] [7].

In this work, we designed an FPGA and ARM processor based supercomputer system. The system is composed of low-cost, low-power Zybo boards [8] with Zynq SoCs. In the designed system, the FPGA handles the compute-intensive portion of a high performance application and increases the computation power of the ARM CPU. A Finite Impulse Response (FIR) filter is used as a test application on the system to evaluate the computational capability of the ARM processor with and without the FPGA accelerator.

This paper is organized as follows: Section II gives a detailed description of related work in the field of heterogeneous architecture computing, deploying FPGAs as hardware accelerators with conventional multi-core processors and with embedded processors. Section III discusses the system architecture and design, including both hardware and software. Section IV presents the results and discussion. The conclusion and future work are presented in Section V, followed by the references.

II. RELATED WORK
Heterogeneous architecture platforms provide a foundation for integrating FPGAs with other conventional processors for the acceleration of high performance applications [9] [10]. The following research works show the opportunity for FPGA-based accelerators alongside other computing units. The Cray XD1 supercomputer [11], the Berkeley Emulation Engine 2 (BEE2) [12] and the Maxwell project [13] used FPGAs as the only computing elements in their supercomputing clusters for application acceleration. In 2008, the Convey Computer Corporation [14] designed a heterogeneous computing platform that combines one or more x86 processors with an FPGA-based application accelerator. The Convey Hybrid-core HC-1 was the first product, consisting of an Intel Xeon host processor and Xilinx Virtex FPGAs as a coprocessor. Tsoi K. H. [15] presented a heterogeneous computer cluster known as Axel, consisting of an AMD Phenom quad-core CPU, an NVidia GPU and a Xilinx Virtex-5 FPGA attached on a PCI bus as an accelerator for the simulation of the N-body algorithm. George A. et al. [16] implemented a machine called the Novo-G supercomputer, made from 24 compute-nodes, each with a quad-core Xeon processor and two PCI x8 PROCstar-III accelerator boards. Each board comprises four Stratix-III E260 FPGAs from Altera.
Moreover, the following research works propose FPGA-based accelerators with embedded processors. Lin Z. [17] demonstrated ZCluster, an FPGA-based Hadoop cluster made of 8 Xilinx Zynq SoC computing nodes. The aim of ZCluster is to build a Hadoop cluster that increases the computing capabilities of ARM processors through the use of reconfigurable hardware accelerators. Moorthy P. [18] built a cluster of 32 Xilinx Zynq SoC nodes. The main objective of the 32-node cluster is to assess the energy efficiency of hybrid SoCs for fast mapping of parallel graph algorithms such as neural network simulation. Bai X. et al. [19] designed a cluster of 48 compute-nodes, each composed of Xilinx Zynq SoC chips. The hybrid architecture provides a platform in which the ARM CPU is merged with FPGA reconfigurable hardware. Non-subtraction Montgomery and Chinese Remainder Theorem algorithms are implemented to test the performance of the hybrid architecture platform.

The research works described above show that researchers and system architects have made great contributions to the use of FPGA-based accelerators with conventional processors as well as embedded processors for high performance computing applications.

III. SUPERCOMPUTER ARCHITECTURE AND DESIGN
In this section, we describe the construction of the FPGA and ARM based supercomputer system and its operating mechanism. The physical layout of the system requires different hardware interfaces and the related software configuration. The scalability of the system is increased by deploying multiple switches and compute-nodes. Figure 1 shows a block diagram of the FPGA and ARM processor based supercomputer architecture using Xilinx Zynq SoCs. This section is further divided into four sub-sections as follows:

Fig. 1: FPGA and ARM processor based Supercomputer Architecture using Zynq SoC

A. Processing System
The FPGA and ARM processor based supercomputer is composed of five Zybo board compute-nodes and an Intel Xeon server connected through an 8-port 10/100 Mbps Ethernet switch. The physical design of the supercomputer is shown in Figure 2. The details of the processing system are given in the following two sub-sections.
1) Master-node: The master node used in the system is an Intel Core-i7 Xeon CPU running at 3.2 GHz with 4 GB of DDR3 memory. A 1 TB hard disk drive is used as storage for the master node. The master node is the main controlling component of the system; it establishes communication and divides tasks between compute-nodes.
2) Compute-nodes: Each Zybo board compute-node has a Xilinx Zynq-7000 SoC (system on chip), which incorporates a dual-core ARM embedded processor running at 650 MHz and Xilinx 7-series FPGA fabric, with 512 MB of DDR3 main memory, 240 KB of RAM, 4.4k logic slices and 80 DSP slices. The ARM CPU and the FPGA hardware accelerator are connected by high performance, high bandwidth AXI4-stream interfaces, as shown in Figure 1. The AXI4-stream interfaces are divided into the following two groups:
• AXI4-stream master interfaces connect the ARM CPU to slaves in the FPGA fabric for read/write operations. The two 32-bit master interfaces are GP0 and GP1.
• AXI4-stream slave interfaces connect an FPGA master to a CPU slave to read from and write to the main memory of the processing system. The high performance port (HP0) and the accelerator coherency port (ACP) are examples of slave interfaces.
Two types of compute-nodes are used in the architecture of the supercomputer system.
a) Compute-node without FPGA: This type of compute-node uses only the ARM processor for computation and data processing, without the FPGA accelerator. The FPGA accelerator is disabled and performs no computations.
b) Compute-node with FPGA: This type of compute-node uses the FPGA accelerator to increase the computational capability of the ARM processor by processing the data-intensive portion of an application. A customized FPGA
accelerator is designed using high-level synthesis and design tools.

Fig. 2: Physical Design of the Supercomputer (a) Five Compute-nodes of Zybo boards (b) Master-node

B. System Software
In this section, we describe the software and packages installed on the master and all compute-nodes. This section is further subdivided into four sub-sections.
1) Operating Systems: A Linux based operating system, 64-bit Ubuntu LTS 14.04, is installed on the master node to manage resource sharing across the supercomputer system. The Xillinux operating system [20] is used by all compute-nodes in our system design. This operating system is a Linux distribution based on Ubuntu LTS 12.04. The Zybo board boots in several stages. In stage 0, we load the boot image file of Xillinux onto the SD card; it runs on the primary CPU and initializes the first stage bootloader (FSBL), which configures the clocks, DDR and I/Os of the processing system. In addition, we add the bitstream file of our hardware accelerator and invoke the second stage loader to load the test application program into main memory. We repeat this process for all compute-nodes of the system.
2) Dynamic Host Configuration Protocol: Every node (the compute-nodes, the master node and the Ethernet switch) has an IP address. A DHCP server is used to assign static IP addresses on the network. The DHCP server generates a specific IP address from the MAC address of each node in the network. Every node in the network has its own hostname and IP address. A DNS server with dynamic DNS (DDNS) is also enabled so that the master node can access compute-nodes by hostname without using the host IP address every time.
3) Secure Shell: SSH [21] is a secure data transfer protocol for logging onto a remote system over TCP. The standard TCP port 22 is designated for the SSH server. The SSH server is installed on the master and all five compute-nodes. The master node uses SSH commands to log onto every compute-node using their hostnames and IP addresses.
4) Network File System: The Network File System (NFS) uses the TCP/UDP internet protocols to distribute compiled applications, packages, libraries and data across the supercomputer. We installed the NFS server on the master node and the NFS client on all five compute-nodes. The machine files are accessible and available to all nodes at the same time.

C. Supercomputer Configuration
After installing all relevant software and packages, our system operates like a real production supercomputer. The DHCP server provides IP addresses in the 192.168.10.0/24 subnet. The SSH server on the master is authorized to access every compute-node on the network by hostname and IP address. SSH public keys are distributed to every node to grant access without password authentication. This configuration lets an application program communicate across the supercomputer without specifying a username and password on every connection. After all necessary configuration of the supercomputer, a common machine file directory is created on the master node as the root user and exported to all five compute-nodes. The compute-nodes mount the same directory at a local location, so a single folder is shared between the master and all compute-nodes.

D. Application Programming Software
This section covers the parallel programming models and the design tool for our supercomputer system. Parallel programming models are used to bridge the complexity between the hardware architecture and the application software. Our system supports MPI [22], MPICH [23] and emerging models like OpenCL [24]. This section is further divided into the following two sub-sections.
1) Message Passing Interface: Multiple nodes make our supercomputer a distributed memory system architecture. Parallel programming models are used to achieve the desired parallelism across the system. For our design, we installed OpenMPI and MPICH on all five compute-nodes and the master node. They provide fast node-to-node message passing protocols and daemon-based process startup and control for supercomputing operation. After installation, we execute an FIR filter C++ program in parallel on multiple nodes with the mpirun command mpirun -np 2 --user hosts ./exe. In the command, np specifies the number of processes to launch, while the user argument is a file containing the host names and IP addresses of the nodes. Figure 3 shows the inter-node and intra-node communication across the supercomputer.
2) Vivado HLS Tool: We used the Xilinx Vivado 2015.4 tool to generate the bitstream file of the customized hardware accelerator for FIR filter application acceleration. The hardware accelerator includes one AXI4-stream master interface bus, one AXI4-stream high-performance slave interface bus, a customized sample source IP and an FIR
compiler. The sample source generates the required samples for the FIR compiler. The sample generator writes the data to the FIR compiler through a FIFO. The data from the FIR is written to main memory through a DMA engine using the high-performance slave AXI bus. The ARM CPU reads the data from the FPGA through the AXI master bus and reconfigures the DMA engine for the next packet of data.

Fig. 3: Communication across the Supercomputer

IV. RESULT AND DISCUSSION
In this section, we perform a series of tests to measure the performance of the ARM processor with and without the FPGA accelerator, for a single compute-node and for five compute-nodes of the supercomputer. We used a low-pass 32-tap FIR filter [25] as the test application. The FIR application uses a 1 GB data set. The clock frequency is 650 MHz for the ARM processor and 200 MHz for the FPGA. The section is further subdivided into two sub-sections: single compute-node performance and supercomputer system performance.

A. Performance of Single Compute Node
The first experimental test measures the performance of the ARM processor with and without the FPGA accelerator for a single compute-node of the supercomputer. The result is tabulated in Table 1. It shows that executing the FIR application on a single compute-node with the FPGA accelerator gives a speedup of 7.55 times over the ARM processor without the FPGA accelerator.

Table 1. Single Node Performance

Data Set    Clock Cycles (ARM)    Clock Cycles (ARM + FPGA)    Speedup With FPGA
1GB         461,334,321           61,223,331                   7.55x

B. Supercomputer System Performance
This experimental test measures the performance of the ARM processor based supercomputer with and without FPGA accelerators. The result is tabulated in Table 2. It shows that executing FIR on the supercomputer of five ARM compute-nodes with FPGA accelerators achieves performance 8.56 times higher than the supercomputer of five ARM compute-nodes without FPGA accelerators. Compared with Table 1, the improvement from running on five nodes is smaller than the node count would suggest, due to MPI communication overhead across the system. The communication overhead between nodes increases as the number of compute-nodes increases.

Table 2. Supercomputer Five Nodes Performance

Data Set    Clock Cycles (ARM)    Clock Cycles (ARM + FPGA)    Speedup With FPGA
1GB         180,915,420           21,111,494                   8.56x

V. CONCLUSION AND FUTURE WORK
This paper proposed the implementation of FPGA and ARM processor based supercomputing using Zynq SoC devices. The system is able to take advantage of parallelism when executing high performance computing applications. Running the FIR filter application on the system shows that the computational capability of the ARM processor is increased by integrating an FPGA accelerator to execute the compute-intensive portion of the application. The performance of the supercomputer with five ARM compute-nodes and FPGA accelerators is 8.56 times higher than that of the same number of nodes without FPGA accelerators. The FPGA and ARM based supercomputer system suggests that advances in processor technology will narrow the gap between embedded and conventional processors in future high performance supercomputing.

In future, high-level synthesis tools that support the OpenCL computing language will be used for our supercomputer system to generate bitstream files for customized hardware accelerators from standard C code. Implementing OpenCL on the supercomputer will allow us to fully analyze the parallelism of the heterogeneous architecture platform in order to achieve higher performance with low power consumption.

ACKNOWLEDGMENT
The research leading to these results has received funding from the Unal Color of Education Research and Development (UCERD) Private Limited Islamabad.
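
The test application of Section IV is a 32-tap low-pass FIR filter run over a data set that is shared among five compute-nodes. The paper does not list the filter coefficients or say how the data is divided, so the following Python sketch is only an illustration under stated assumptions: the coefficients are a 32-point moving average (the simplest low-pass FIR), the data is split into contiguous blocks, and each block receives the preceding 31 samples as a halo so that block boundaries do not change the result. The names fir, split_blocks and fir_partitioned are hypothetical, and a plain loop stands in for the MPI processes.

```python
TAPS = 32
NODES = 5

# Assumption: a 32-point moving average, the simplest low-pass FIR.
# The paper's actual coefficients are not given.
COEFFS = [1.0 / TAPS] * TAPS


def fir(samples, coeffs, history=()):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * x[n - k].

    `history` holds the samples that precede `samples`; any missing
    history is treated as zeros.
    """
    pad = len(coeffs) - 1
    prefix = list(history)[-pad:] if pad else []
    x = [0.0] * (pad - len(prefix)) + prefix + list(samples)
    return [
        sum(c * x[n + pad - k] for k, c in enumerate(coeffs))
        for n in range(len(samples))
    ]


def split_blocks(n_samples, n_blocks):
    """Contiguous (start, stop) ranges covering n_samples as evenly as possible."""
    base, extra = divmod(n_samples, n_blocks)
    bounds, start = [], 0
    for rank in range(n_blocks):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def fir_partitioned(samples, coeffs, n_blocks):
    """Filter each block on its own, passing the preceding samples as a halo."""
    pad = len(coeffs) - 1
    out = []
    for start, stop in split_blocks(len(samples), n_blocks):
        halo = samples[max(0, start - pad):start]
        out.extend(fir(samples[start:stop], coeffs, history=halo))
    return out


# Small synthetic signal; the paper's data set is 1 GB.
signal = [float((i * 7) % 13) for i in range(1000)]
whole = fir(signal, COEFFS)
parts = fir_partitioned(signal, COEFFS, NODES)
print("largest difference:", max(abs(a - b) for a, b in zip(whole, parts)))
```

The halo is the design point: an FIR output depends on the current sample and the 31 before it, so a node that only saw its own block would produce wrong values for its first 31 outputs. In an MPI program the halo would be sent along with each block, which is one source of the communication overhead discussed in Section IV.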
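
The cycle counts in Tables 1 and 2 can be cross-checked with a few lines of arithmetic. The Python sketch below recomputes the FPGA speedup for each configuration and, as an additional view that the paper does not report, the scaling obtained when going from one node to five. The parallel-efficiency figure (achieved scaling divided by the ideal factor of five) is our own derived quantity, not a result from the paper.

```python
# Cycle counts copied from Tables 1 and 2 (1 GB data set).
CYCLES = {
    "single_node": {"arm": 461_334_321, "arm_fpga": 61_223_331},
    "five_nodes": {"arm": 180_915_420, "arm_fpga": 21_111_494},
}
NODES = 5


def speedup(baseline_cycles: int, accelerated_cycles: int) -> float:
    """Ratio of baseline to accelerated cycle counts."""
    return baseline_cycles / accelerated_cycles


# FPGA speedup within each configuration (the "Speedup" column).
fpga_speedup = {
    name: speedup(c["arm"], c["arm_fpga"]) for name, c in CYCLES.items()
}

# Scaling from one node to five nodes for each kind of compute-node.
node_scaling = {
    kind: speedup(CYCLES["single_node"][kind], CYCLES["five_nodes"][kind])
    for kind in ("arm", "arm_fpga")
}

# Derived quantity (not in the paper): scaling relative to the ideal 5x.
efficiency = {kind: s / NODES for kind, s in node_scaling.items()}

for name, s in fpga_speedup.items():
    print(f"{name}: FPGA speedup {s:.2f}x")
for kind, s in node_scaling.items():
    print(f"{kind}: 1 -> 5 nodes {s:.2f}x, efficiency {efficiency[kind]:.0%}")
```

The recomputed speedups come out at roughly 7.54x and 8.57x, within 0.02 of the reported 7.55x and 8.56x. Going from one node to five reduces the cycle count by about 2.55x without the FPGA and about 2.90x with it, well short of 5x, which is consistent with the MPI communication overhead noted in Section IV.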
REFERENCES
[1] S. Amin, T. Hussain, and U. Zabit, “FPGA Based Processing of Speckle Affected Self-Mixing Interferometric Signals,” 2016 Int. Conf. Front. Inf. Technol., pp. 292–296, 2016.
[2] T. Hussain, M. Pericas, N. Navarro, and E. Ayguade, “Implementation of a reverse time migration kernel using the HCE high level synthesis tool,” 2011 Int. Conf. Field-Programmable Technol. FPT 2011, pp. 2–9, 2011.
[3] “MACOM Announces Sampling of X-Gene® 3 Server-on-a-Chip® Solution | AppliedMicro.” [Online]. Available: https://www.apm.com/news/macom-announces-sampling-of-x-gene-3-server-on-a-chip-solution/. [Accessed: 27-Nov-2017].
[4] “Worldwide x86 and ARM Server-Class Microprocessor Forecast, 2016–2020.”
[5] “Zynq-7000 All Programmable SoC.” [Online]. Available: https://www.xilinx.com/products/silicon-devices/soc/zynq-7000.html. [Accessed: 21-Nov-2017].
[6] T. Hussain, M. Shafiq, M. Pericàs, N. Navarro, and E. Ayguadé, “PPMC: A programmable pattern based memory controller,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 7199 LNCS, pp. 89–101, 2012.
[7] T. Hussain, O. Palomar, O. Unsal, A. Cristal, E. Ayguade, and M. Valero, “Advanced Pattern based Memory Controller for FPGA based HPC applications,” Proc. 2014 Int. Conf. High Perform. Comput. Simulation, HPCS 2014, pp. 287–294, 2014.
[8] “Zybo Zynq-7000 ARM/FPGA SoC Trainer Board (LIMITED TIME)>> see Zybo Z7-10 for replacement - Digilent.” [Online]. Available: http://store.digilentinc.com/zybo-zynq-7000-arm-fpga-soc-trainer-board/. [Accessed: 23-Nov-2017].
[9] T. Hussain, “Memory resources aware run-time automated scheduling policy for multi-core systems,” Microprocess. Microsyst., vol. 57, pp. 1–24, 2018.
[10] T. Hussain, “A novel hardware support for heterogeneous multi-core memory system,” J. Parallel Distrib. Comput., vol. 106, pp. 31–49, 2017.
[11] C. Xd, “Cray XD1 Supercomputer.”
[12] C. Chang, J. Wawrzynek, and R. W. Brodersen, “BEE2: A high-end reconfigurable computing system,” IEEE Des. Test Comput., vol. 22, no. 2, pp. 114–125, 2005.
[13] R. Baxter et al., “Maxwell - A 64 FPGA supercomputer,” Proc. - 2007 NASA/ESA Conf. Adapt. Hardw. Syst. AHS-2007, no. August, pp. 287–294, 2007.
[14] B. Klauer, “The Convey Hybrid-Core Architecture,” in High-Performance Computing Using FPGAs, vol. 375, 2010, pp. 431–451.
[15] Kuen Hung Tsoi and Wayne Luk, “Axel: A Heterogeneous Cluster with FPGAs and GPUs,” 18th Annu. ACM/SIGDA Int. Symp. F. Program. Gate Arrays, pp. 115–124, 2010.
[16] A. D. George and G. Stitt, “Novo-G: A View at the HPC Crossroads for Scientific Computing,” no. January, 2010.
[17] Z. Lin and P. Chow, “ZCluster: A Zynq-based Hadoop cluster,” FPT 2013 - Proc. 2013 Int. Conf. F. Program. Technol., pp. 450–453, 2013.
[18] P. Moorthy and N. Kapre, “Zedwulf: Power-performance tradeoffs of a 32-node Zynq SoC cluster,” Proc. - 2015 IEEE 23rd Annu. Int. Symp. Field-Programmable Cust. Comput. Mach. FCCM 2015, no. 3, pp. 68–75, 2015.
[19] X. Bai, L. Jiang, Q. Dai, J. Yang, and J. Tan, “Acceleration of RSA Processes based on Hybrid ARM-FPGA Cluster,” 2017.
[20] “Xillinux: A Linux distribution for Zedboard, ZyBo, MicroZed and SocKit | xillybus.com.” [Online]. Available: http://xillybus.com/xillinux. [Accessed: 21-Nov-2017].
[21] “SSH Server | SSH.COM.” [Online]. Available: https://www.ssh.com/ssh/server. [Accessed: 21-Nov-2017].
[22] “Open MPI: Open Source High Performance Computing.” [Online]. Available: https://www.open-mpi.org/. [Accessed: 21-Nov-2017].
[23] “MPICH | High-Performance Portable MPI.” [Online]. Available: https://www.mpich.org/. [Accessed: 21-Nov-2017].
[24] “OpenCL Overview - The Khronos Group Inc.” [Online]. Available: https://www.khronos.org/opencl/. [Accessed: 23-Nov-2017].
[25] “FIR Filter Design, Software and Examples.” [Online]. Available: http://www.iowahills.com/5FIRFiltersPage.html. [Accessed: 27-Nov-2017].

978-1-5386-1370-2/18/$31.00 ©2018 IEEE
