User:DNJACK/OpenCL: Difference between revisions
Latest revision as of 03:55, 24 January 2015
So you've written your software in C but it's still too slow. Your next optimization step is to include SIMD instructions, but you are too lazy to learn to do it yourself, or you are looking for an excuse to blow several grand on a graphics card and you hate Nvidia? OpenCL might be right for you!
How it works
While Nvidia's CUDA is made to be used by computer-illiterate scientists, OpenCL forces you to tell it exactly what you want, so you'll need this. Both CUDA and OpenCL are lower-level than C, so you'd better know what kind of hardware you expect your code to run on, or check at runtime and write separate code for each kind of hardware.
- Environment variables (platform, device, context, command queue) are initialized. - This is where the program learns which device will execute the code, like which of your numerous GPUs it will use if you are overcompensating for your lack of dick, etc.
- Memory initialization. - If you want to use the GPU, this is where you reserve memory on it. Special allocations also exist for the CPU. This can also be done later.
- Reading and building the program. - The code has to be built at runtime so that the earlier steps can determine where the kernels will execute.
- Extracting the kernels. - Getting the individual kernel functions out of the compiled program.
- Write data to GPU memory, if the program runs on the GPU.
- Use them! a.k.a. enqueue a kernel in a command queue. - You didn't do all that for nothing, did you?
- Read the results back from GPU memory if the program runs on the GPU, or unmap the buffer if it runs on the CPU.
- Release the kernels, memory, and environment variables.
Best Practices
- Limit data transfers. They eat a significant amount of time.
- 256 work-items per GPU work-group will give good results most of the time.
- Error-check continuously, or errors will only show up when you free memory.
- Make sure to create a fuckload of threads (work-items). It is normal to sometimes have more than one thread per data element.
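One practical consequence of fixing the work-group size at 256: in OpenCL 1.x the global work size has to be a multiple of the local size, so the work-item count usually gets rounded up. A small sketch of that arithmetic follows; `round_up_global_size` is a made-up helper name, not part of the OpenCL API.

```c
#include <stddef.h>

/* Hypothetical helper: smallest multiple of local_size that is >= n.
   Assumes local_size > 0. */
size_t round_up_global_size(size_t n, size_t local_size)
{
    size_t remainder = n % local_size;
    return remainder == 0 ? n : n + (local_size - remainder);
}
```

With 1000 elements and a local size of 256 this gives 1024 work-items, so the kernel needs a guard such as `if (get_global_id(0) >= n) return;` to keep the 24 spare work-items from touching memory past the end of the buffer.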