Solutions For Optimizing The Data Parallel Prefix Sum Algorithm Using The Compute Unified Device Architecture
In this paper, we analyze solutions for optimizing the data-parallel prefix sum function using the Compute Unified Device Architecture (CUDA), a platform that offers a viable way to accelerate a broad class of applications. The parallel prefix sum is an essential building block for many data mining algorithms, so optimizing it benefits the whole data mining process. Finally, we benchmark and evaluate the performance of the optimized parallel prefix sum building block in CUDA.
Volume (Year): 5 (2011)
Issue (Month): 2.1 (December)
Provider web page: http://www.rau.ro/
Handle: RePEc:rau:journl:v:5:y:2011:i:2.1:p:465-477