Wednesday, July 18, 2012

Let's parallelize everything

Who does not want to see their programs run faster? Using exactly the same machine, same CPU, same GPU, same programming language? Many parallel programming frameworks/APIs have been developed and released recently, and with the help of many amazing programmers worldwide, these APIs have also been wrapped beautifully into front-end languages such as Python, Ruby, or even JavaScript. Some are also being ported to more sophisticated programming languages such as Haskell, Scala, or Clojure. So for those of you who haven't seen or experienced the elegance of parallel computing, what are you waiting for? Let's parallelize everything!

Let me share some of the parallel programming frameworks I know; I believe they are well-known enough that you will not be disappointed when using them.

First of all, there is CUDA, developed by NVIDIA, a leading company in graphics processing units. I have been developing programs with CUDA for two years now. The API itself is pretty simple and straightforward to use, and I am amazed by the number of sample programs provided inside the CUDA Toolkit; they make it incredibly easy to learn not only the CUDA API but also some important algorithms and tuning techniques for parallel programs. Having said that, it was actually pretty hard to develop with CUDA back then (when it was still version 1.x), but NVIDIA just released a new version (CUDA 4.2, with CUDA 5 on the way) and everything became easier to understand, to install, and to develop with. Most importantly, note that since NVIDIA is the developer and CUDA is not open source, it is only available for NVIDIA's GPUs (from the GeForce 8800 to the newest ones). The technique of taking computation that would otherwise run sequentially on the CPU and offloading it to the GPU is generally called GPGPU (General-Purpose GPU) computing.
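To give a feel for how straightforward the API is, here is a minimal vector-addition sketch in the usual CUDA style. It is just my own illustration, not one of the Toolkit samples, and error checking is left out for brevity:

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Each thread adds one element: the classic "hello world" of GPGPU.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main()
    {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        // Host data.
        float *h_a = (float *)malloc(bytes);
        float *h_b = (float *)malloc(bytes);
        float *h_c = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

        // Device data.
        float *d_a, *d_b, *d_c;
        cudaMalloc(&d_a, bytes);
        cudaMalloc(&d_b, bytes);
        cudaMalloc(&d_c, bytes);
        cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

        // Launch enough 256-thread blocks to cover all n elements.
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

        cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
        printf("c[0] = %f\n", h_c[0]);   // expect 3.0

        cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
        free(h_a); free(h_b); free(h_c);
        return 0;
    }

Compile it with nvcc (for example, nvcc vecadd.cu -o vecadd) and the whole CPU-vs-GPU pattern is already visible: allocate on the device, copy in, launch the kernel, copy out.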

And then there is OpenCL, short for Open Computing Language, initially developed by Apple and the Khronos Group. OpenCL is a framework for parallelization that aims to execute on many platforms. At first OpenCL was released with only a standard C99 API, but C++ wrappers were later added on top of the runtime API, which makes it easier to do some OOP. OpenCL can run on devices from the major vendors: AMD, Intel, and NVIDIA, where each vendor has its own compiler (or library) to interpret and/or optimize the standardized OpenCL runtime API. As far as I know, it is Khronos that has been leading OpenCL development and standardizing the API. Since every vendor equips its devices with different technology, each releases its own OpenCL programming SDK, which can be found on the respective vendor's website: the AMD OpenCL SDK, the Intel OpenCL SDK, and the NVIDIA OpenCL SDK, which comes with the CUDA Toolkit. Each SDK is provided with its own library and some sample programs, and each vendor's compiler has its own way of optimizing at compile time. IMO, learning OpenCL will not be that hard if you have previously done some CUDA programming.
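The flip side of that portability is that the host side is more verbose: the kernel is compiled at run time for whatever device you pick. Below is a rough sketch of the same vector addition against the standard C API, assuming an OpenCL 1.x SDK is installed; it blindly takes the first platform and device and skips all error checking:

    #include <stdio.h>
    #include <CL/cl.h>   /* <OpenCL/opencl.h> on Mac OS X */

    /* The same vector addition, written as an OpenCL C kernel string. */
    static const char *src =
        "__kernel void vecAdd(__global const float *a,\n"
        "                     __global const float *b,\n"
        "                     __global float *c, int n) {\n"
        "    int i = get_global_id(0);\n"
        "    if (i < n) c[i] = a[i] + b[i];\n"
        "}\n";

    int main(void)
    {
        enum { N = 1024 };
        const int n = N;
        const size_t bytes = N * sizeof(float);
        float a[N], b[N], c[N];
        for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        /* Take the first platform/device we find (no error checking). */
        cl_platform_id plat;
        cl_device_id dev;
        clGetPlatformIDs(1, &plat, NULL);
        clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);

        cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
        cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

        /* Build the kernel at run time: this is what keeps OpenCL portable. */
        cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
        clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
        cl_kernel k = clCreateKernel(prog, "vecAdd", NULL);

        cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, bytes, a, NULL);
        cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, bytes, b, NULL);
        cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, bytes, NULL, NULL);

        clSetKernelArg(k, 0, sizeof(cl_mem), &da);
        clSetKernelArg(k, 1, sizeof(cl_mem), &db);
        clSetKernelArg(k, 2, sizeof(cl_mem), &dc);
        clSetKernelArg(k, 3, sizeof(int), &n);

        size_t global = N;
        clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
        clEnqueueReadBuffer(q, dc, CL_TRUE, 0, bytes, c, 0, NULL, NULL);

        printf("c[0] = %f\n", c[0]);   /* expect 3.0 */

        clReleaseMemObject(da); clReleaseMemObject(db); clReleaseMemObject(dc);
        clReleaseKernel(k); clReleaseProgram(prog);
        clReleaseCommandQueue(q); clReleaseContext(ctx);
        return 0;
    }

Link against the SDK's OpenCL library (typically -lOpenCL); that embedded kernel string is exactly what each vendor's compiler gets to optimize at run time.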

Those two above are the APIs I've been using for a while to do parallel work. There are plenty more out there, and you can get some of them for free (or they may even already be installed on your computer), but some aren't.
Examples of free APIs: OpenMP (see the sketch after this list), Intel's TBB, Intel's ArBB, Pthreads
Not free: PGI's Compiler with OpenACC, CAPS' Compiler with OpenACC
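Of those, OpenMP is probably the gentlest introduction: the parallelism lives in compiler pragmas on ordinary loops. A tiny sketch, assuming a compiler with OpenMP support (for example gcc -fopenmp):

    #include <stdio.h>
    #include <omp.h>

    /* The same vector addition once more: a single pragma spreads the
       loop iterations across all available CPU cores. */
    int main(void)
    {
        enum { N = 1 << 20 };
        static float a[N], b[N], c[N];
        for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        #pragma omp parallel for
        for (int i = 0; i < N; ++i)
            c[i] = a[i] + b[i];

        printf("c[0] = %f, threads available: %d\n", c[0], omp_get_max_threads());
        return 0;
    }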

2 comments:

  1. Bro, do you have a Facebook account or an email address?

    I'd like to ask about Indonesian-language tutorials for GPU CUDA programming.

    Let me know, at #ir_indrabayoe

    Replies
    1. Feel free to ask away at ardiyu07@gmail.com

      Indonesian-language ones are pretty rare, though..
      Try following my new blog at ardiyu07.github.io; when I get the chance I might cover a CUDA programming tutorial there too, with an Indonesian version :)
