Because of the enormous volume of genomic data, next-generation sequencing (NGS) applications pose significant challenges to current computing systems. In this study, we investigate both algorithmic and architectural strategies to accelerate an NGS data analysis algorithm, short read mapping, on a commodity multi-core platform and on a customizable FPGA (field-programmable gate array) co-processor architecture, respectively.
A workload analysis reveals that conventional memory optimizations offer limited benefit, because the mapping computation is irregular, with low arithmetic intensity and non-contiguous memory access patterns. To mitigate this inherent irregularity, we have developed an FPGA co-processor based on a Convey computer, which employs a scatter-gather memory mechanism and exploits both bit-level and word-level parallelism. The customized FPGA co-processor achieves a throughput of 947 Gbp per day, about 189 times higher than that of current mapping tools on a single CPU core. Moreover, the co-processor’s power efficiency is 29 times higher than that of a conventional 64-core multi-processor.
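To put the reported figures in perspective, the short sketch below derives the single-core baseline implied by the two numbers quoted above (947 Gbp per day and a speedup of about 189). The function names are illustrative only, and the derived values are back-of-the-envelope estimates computed from the quoted figures, not measurements reported in this study.

```python
# Figures quoted in the abstract; everything derived from them is an estimate.
FPGA_THROUGHPUT_GBP_PER_DAY = 947.0
SPEEDUP_VS_SINGLE_CORE = 189.0
SECONDS_PER_DAY = 24 * 60 * 60


def implied_baseline_gbp_per_day(fpga_throughput: float, speedup: float) -> float:
    """Single-core throughput implied by dividing the FPGA throughput by the speedup."""
    return fpga_throughput / speedup


def bases_per_second(gbp_per_day: float) -> float:
    """Convert a throughput in Gbp per day to base pairs per second."""
    return gbp_per_day * 1e9 / SECONDS_PER_DAY


if __name__ == "__main__":
    baseline = implied_baseline_gbp_per_day(
        FPGA_THROUGHPUT_GBP_PER_DAY, SPEEDUP_VS_SINGLE_CORE
    )
    print(f"Implied single-core baseline: {baseline:.2f} Gbp/day")
    print(f"FPGA throughput: {bases_per_second(FPGA_THROUGHPUT_GBP_PER_DAY):.3e} bp/s")
```

Under these figures, the implied single-core baseline is roughly 5 Gbp per day, and the co-processor sustains on the order of ten million base pairs per second.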