- Heterogeneous Computing with OpenCL (http://www.hds.bme.hu/~fhegedus/C++/Heterogeneous_computing_...) - informative but not many examples
- The Art of Multiprocessor Programming (https://www.amazon.com/Art-Multiprocessor-Programming-Revise...) - I had a hard time getting traction with this book.
- OpenCL Parallel Programming Development Cookbook (https://www.amazon.com/dp/B00ESX1AH2/ref=dp-kindle-redirect?...) - Not a great reference but it had some easy to follow examples.
- Actually a few other books you might find when searching for parallel processing or parallel algorithms which just turn out to be entirely abstract math books.
People would ask my why I wanted to learn to program on a GPU and I didn't have an answer. Surely I would find an answer in one of those books. I saved a few of the projects:
- Edge detection (https://mega.nz/#!LJUwmLSa!dRijnB1xVhI9RAC1Xac_xRhT2IsfDG2sJ...) - fun!
- GPU template (https://mega.nz/#!yAsxATzb!Y4-9zRMCTSYHX1pKxWPQPl8WNDgnWkSAU...) - write GPU code with JavaScript
I have another one for bitonic sort somewhere (a parallel sort that sadly isn't even as good as quick sort).
The projects I enjoyed most were image filters (like edge detection). You could do a project that implements various image filters. If you did that you would not only get experience writing CUDA but you would learn how a lot of different filters are done.
[0] https://www.amazon.com/Art-Multiprocessor-Programming-Revise...