

Accelerating LINPACK with MPI-OpenCL on Clusters of Multi-GPU Nodes

October 10, 2015 by NS3 Simulation Projects

OpenCL is an open standard to write parallel applications for heterogeneous computing systems. Since its usage is restricted to a single operating system instance, programmers need to use a mix of OpenCL and MPI to program a heterogeneous cluster. In this paper, we introduce an MPI-OpenCL implementation of the LINPACK benchmark for a cluster with multi-GPU nodes. The LINPACK benchmark is one of the most widely used benchmark applications for evaluating high performance computing systems.

Our implementation is based on High Performance LINPACK (HPL) and uses the blocked LU decomposition algorithm. We show that optimizations aimed at reducing CPU overhead are necessary to overcome the performance gap between the CPUs and the multiple GPUs. Our LINPACK implementation achieves 93.69 Tflops (46 percent of the theoretical peak) on the target cluster of 49 nodes, each containing two eight-core CPUs and four GPUs.
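To make the blocked LU decomposition mentioned above more concrete, here is a minimal single-process sketch in pure Python. It is not the MPI-OpenCL code from the paper and does not reproduce HPL's data distribution or look-ahead; the function names `blocked_lu` and `multiply_lu` and the block-size parameter `nb` are hypothetical, chosen only for illustration. It shows the three phases of a right-looking blocked factorization with partial pivoting: panel factorization, triangular solve, and trailing-matrix update. The trailing update is a matrix multiply and accounts for most of the arithmetic in blocked LU.

```python
def blocked_lu(a, nb):
    """Blocked LU with partial pivoting on a square list-of-lists matrix.

    Illustrative sketch only (assumed names, no MPI, no OpenCL).
    Returns (lu, perm): lu stores the unit-lower factor L below the diagonal
    and U on and above it; perm[i] is the original row now sitting at row i.
    """
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy
    perm = list(range(n))
    for k in range(0, n, nb):
        end = min(k + nb, n)
        # Phase 1: panel factorization (unblocked LU on columns k..end-1).
        for j in range(k, end):
            p = max(range(j, n), key=lambda i: abs(lu[i][j]))
            if lu[p][j] == 0.0:
                raise ValueError("matrix is singular")
            if p != j:
                lu[j], lu[p] = lu[p], lu[j]        # swap entire rows
                perm[j], perm[p] = perm[p], perm[j]
            for i in range(j + 1, n):
                lu[i][j] /= lu[j][j]
                for c in range(j + 1, end):        # update inside the panel only
                    lu[i][c] -= lu[i][j] * lu[j][c]
        # Phase 2: triangular solve, U12 = inverse(L11) * A12.
        for j in range(k, end):
            for i in range(j + 1, end):
                for c in range(end, n):
                    lu[i][c] -= lu[i][j] * lu[j][c]
        # Phase 3: trailing update, A22 -= L21 * U12 (the matrix-multiply step).
        for i in range(end, n):
            for j in range(k, end):
                l_ij = lu[i][j]
                for c in range(end, n):
                    lu[i][c] -= l_ij * lu[j][c]
    return lu, perm


def multiply_lu(lu):
    """Rebuild the product L*U from the packed factors, for checking."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]   # L has a unit diagonal
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out


if __name__ == "__main__":
    a = [[2.0, 1.0, 1.0, 0.0],
         [4.0, 3.0, 3.0, 1.0],
         [8.0, 7.0, 9.0, 5.0],
         [6.0, 7.0, 9.0, 8.0]]
    lu, perm = blocked_lu(a, nb=2)
    prod = multiply_lu(lu)
    # The residual compares L*U against the row-permuted input matrix.
    residual = max(abs(prod[i][j] - a[perm[i]][j])
                   for i in range(4) for j in range(4))
    print("row permutation:", perm)
    print("max residual:", residual)
```

In a cluster setting the block size would be tuned to the hardware; here `nb` only controls how many columns are factored before each trailing update, and any value from 1 to the matrix size should give the same factorization up to rounding.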
