A General Framework for Accelerating Swarm Intelligence Algorithms on FPGAs, GPUs and Multi-core CPUs

Dalin Li; Lan Huang; Kangping Wang; Wei Pang; You Zhou; Rui Zhang

doi:10.1109/ACCESS.2018.2882455

Dalin Li, Lan Huang, Kangping Wang^* (Corresponding Author), Wei Pang, You Zhou, Rui Zhang

^*Corresponding author for this work

Computing Science

Jilin University

Research output: Contribution to journal › Article › peer-review

Abstract

Swarm intelligence algorithms (SIAs) have demonstrated excellent performance on optimization problems, including many real-world problems. However, because they are computationally expensive on some complex problems, SIAs need to be accelerated effectively. This paper presents FASI, a high-performance general framework for accelerating SIAs. Unlike previous work, which accelerates SIAs only by enhancing parallelization, FASI considers both the memory architectures of hardware platforms and the dataflow of SIAs, and it reschedules the framework of SIAs as a converged dataflow to improve memory access efficiency. FASI achieves higher acceleration by matching the algorithm framework to the hardware architectures. We also design deeply optimized structures for the parallelization and convergence of FASI based on the characteristics of specific hardware platforms. We take the quantum-behaved particle swarm optimization algorithm (QPSO) as a case study to evaluate FASI. The results show that FASI improves the throughput of SIAs and provides better performance by optimizing the hardware implementations. In our experiments, FASI achieves a maximum throughput of 290.7 Mbit/s, which is higher than that of several existing systems, and FASI on FPGAs achieves a better speedup than on GPUs and multi-core CPUs. In terms of optimization time, FASI on the Xilinx Kintex Ultrascale xcku040 is at least 1.45 times and up to 123 times faster when compared with the Intel Core i7-6700 CPU / NVIDIA GTX1080 GPU. Finally, we compare the differences in deploying FASI on hardware platforms and provide some guidelines for improving acceleration performance according to the hardware architectures.

Original language	English
Pages (from-to)	72327 - 72344
Number of pages	19
Journal	IEEE Access
Volume	6
Early online date	20 Nov 2018
DOIs	https://doi.org/10.1109/ACCESS.2018.2882455
Publication status	Published - 2018

Bibliographical note

This work is supported by the National Natural Science Foundation of China (Grant Nos. 61472159, 61572227, 61772227) and the Development Project of Jilin Province of China (Nos. 20160204022GX, 20170101006JC, 20170203002GX, 2017C030-1, 2017C033, 20180414012GH). This work is also supported in part by the Premier-Discipline Enhancement Scheme supported by Zhuhai Government, the Premier Key-Discipline Enhancement Scheme supported by Guangdong Government Funds, and the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (20180622002JC).

Keywords

Field programmable gate arrays
Multicore processing
Parallel programming
Particle swarm optimization
Pipeline processing

Access to Document

10.1109/ACCESS.2018.2882455

Licence: Other

A General Framework for Accelerating Swarm Intelligence Algorithms on FPGAs, GPUs and Multi-Core CPUs
(c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information
Final published version, 1.86 MB

Licence: Other

Cite this

@article{ee0ef17170844d28a2ddf561c612ffc2,

title = "A General Framework for Accelerating Swarm Intelligence Algorithms on FPGAs, GPUs and Multi-core CPUs",

abstract = "Swarm intelligence algorithms (SIAs) have demonstrated excellent performance on optimization problems, including many real-world problems. However, because they are computationally expensive on some complex problems, SIAs need to be accelerated effectively. This paper presents FASI, a high-performance general framework for accelerating SIAs. Unlike previous work, which accelerates SIAs only by enhancing parallelization, FASI considers both the memory architectures of hardware platforms and the dataflow of SIAs, and it reschedules the framework of SIAs as a converged dataflow to improve memory access efficiency. FASI achieves higher acceleration by matching the algorithm framework to the hardware architectures. We also design deeply optimized structures for the parallelization and convergence of FASI based on the characteristics of specific hardware platforms. We take the quantum-behaved particle swarm optimization algorithm (QPSO) as a case study to evaluate FASI. The results show that FASI improves the throughput of SIAs and provides better performance by optimizing the hardware implementations. In our experiments, FASI achieves a maximum throughput of 290.7 Mbit/s, which is higher than that of several existing systems, and FASI on FPGAs achieves a better speedup than on GPUs and multi-core CPUs. In terms of optimization time, FASI on the Xilinx Kintex Ultrascale xcku040 is at least 1.45 times and up to 123 times faster when compared with the Intel Core i7-6700 CPU / NVIDIA GTX1080 GPU. Finally, we compare the differences in deploying FASI on hardware platforms and provide some guidelines for improving acceleration performance according to the hardware architectures.",

keywords = "Field programmable gate arrays, Multicore processing, Parallel programming, Particle swarm optimization, Pipeline processing",

author = "Dalin Li and Lan Huang and Kangping Wang and Wei Pang and You Zhou and Rui Zhang",

note = "This work is supported by the National Natural Science Foundation of China (Grant Nos. 61472159, 61572227, 61772227) and the Development Project of Jilin Province of China (Nos. 20160204022GX, 20170101006JC, 20170203002GX, 2017C030-1, 2017C033, 20180414012GH). This work is also supported in part by the Premier-Discipline Enhancement Scheme supported by Zhuhai Government, the Premier Key-Discipline Enhancement Scheme supported by Guangdong Government Funds, and the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (20180622002JC).",

year = "2018",

doi = "10.1109/ACCESS.2018.2882455",

language = "English",

volume = "6",

pages = "72327--72344",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "IEEE",

}

TY - JOUR

T1 - A General Framework for Accelerating Swarm Intelligence Algorithms on FPGAs, GPUs and Multi-core CPUs

AU - Li, Dalin

AU - Huang, Lan

AU - Wang, Kangping

AU - Pang, Wei

AU - Zhou, You

AU - Zhang, Rui

N1 - This work is supported by the National Natural Science Foundation of China (Grant Nos. 61472159, 61572227, 61772227) and the Development Project of Jilin Province of China (Nos. 20160204022GX, 20170101006JC, 20170203002GX, 2017C030-1, 2017C033, 20180414012GH). This work is also supported in part by the Premier-Discipline Enhancement Scheme supported by Zhuhai Government, the Premier Key-Discipline Enhancement Scheme supported by Guangdong Government Funds, and the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (20180622002JC).

PY - 2018

Y1 - 2018

AB - Swarm intelligence algorithms (SIAs) have demonstrated excellent performance on optimization problems, including many real-world problems. However, because they are computationally expensive on some complex problems, SIAs need to be accelerated effectively. This paper presents FASI, a high-performance general framework for accelerating SIAs. Unlike previous work, which accelerates SIAs only by enhancing parallelization, FASI considers both the memory architectures of hardware platforms and the dataflow of SIAs, and it reschedules the framework of SIAs as a converged dataflow to improve memory access efficiency. FASI achieves higher acceleration by matching the algorithm framework to the hardware architectures. We also design deeply optimized structures for the parallelization and convergence of FASI based on the characteristics of specific hardware platforms. We take the quantum-behaved particle swarm optimization algorithm (QPSO) as a case study to evaluate FASI. The results show that FASI improves the throughput of SIAs and provides better performance by optimizing the hardware implementations. In our experiments, FASI achieves a maximum throughput of 290.7 Mbit/s, which is higher than that of several existing systems, and FASI on FPGAs achieves a better speedup than on GPUs and multi-core CPUs. In terms of optimization time, FASI on the Xilinx Kintex Ultrascale xcku040 is at least 1.45 times and up to 123 times faster when compared with the Intel Core i7-6700 CPU / NVIDIA GTX1080 GPU. Finally, we compare the differences in deploying FASI on hardware platforms and provide some guidelines for improving acceleration performance according to the hardware architectures.

KW - Field programmable gate arrays

KW - Multicore processing

KW - Parallel programming

KW - Particle swarm optimization

KW - Pipeline processing

U2 - 10.1109/ACCESS.2018.2882455

DO - 10.1109/ACCESS.2018.2882455

M3 - Article

SN - 2169-3536

VL - 6

SP - 72327

EP - 72344

JO - IEEE Access

JF - IEEE Access

ER -
