Found in 3 comments on Hacker News
queensnake · 2019-06-05 · Original thread
I don't think there is one, yet. 'Multiprocessor Programming': https://www.amazon.com/Art-Multiprocessor-Programming-Revise... has some algorithms, though.
nobody271 · 2018-10-02 · Original thread
I learned a little OpenCL a few years ago just because I wanted to see how GPUs were programmed. I tried several books:

- Heterogeneous Computing with OpenCL (http://www.hds.bme.hu/~fhegedus/C++/Heterogeneous_computing_...) - informative but not many examples

- The Art of Multiprocessor Programming (https://www.amazon.com/Art-Multiprocessor-Programming-Revise...) - I had a hard time getting traction with this book.

- OpenCL Parallel Programming Development Cookbook (https://www.amazon.com/dp/B00ESX1AH2/ref=dp-kindle-redirect?...) - Not a great reference but it had some easy to follow examples.

- A few other books that turn up when you search for parallel processing or parallel algorithms, which turn out to be entirely abstract math books.

People would ask me why I wanted to learn to program on a GPU and I didn't have an answer. Surely I would find an answer in one of those books. I saved a few of the projects:

- Edge detection (https://mega.nz/#!LJUwmLSa!dRijnB1xVhI9RAC1Xac_xRhT2IsfDG2sJ...) - fun!

- GPU template (https://mega.nz/#!yAsxATzb!Y4-9zRMCTSYHX1pKxWPQPl8WNDgnWkSAU...) - write GPU code with JavaScript

I have another one for bitonic sort somewhere (a parallel sort that sadly isn't even as good as quicksort).
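
For readers who haven't met bitonic sort, here is a minimal sequential sketch of the sorting network in Python. It is not the project code mentioned above, which isn't shown; the function name `bitonic_sort` and the power-of-two length restriction are assumptions made to keep the example short. On a GPU, each iteration of the inner `for i` loop would run as an independent work-item.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values` using a bitonic sorting network.

    Assumption for this sketch: len(values) must be a power of two
    (or zero), which is the classic form of the algorithm.
    """
    data = list(values)
    n = len(data)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged in this stage
    while k <= n:
        j = k // 2  # distance between the two elements compared
        while j > 0:
            # Every i in this loop is independent of the others, which is
            # what makes the network a good fit for parallel hardware.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for this direction.
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

The network performs on the order of n log² n comparisons in total, more than the n log n of a typical comparison sort, but the comparison pattern is fixed in advance and does not depend on the data.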

The projects I enjoyed most were image filters (like edge detection). You could do a project that implements various image filters. If you did that, you would not only get experience writing CUDA but also learn how a lot of different filters work.
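
As a rough illustration of what an edge-detection filter computes, here is a small CPU sketch in Python. It assumes a Sobel operator, which is one common choice and not necessarily what the linked project uses; the function name `sobel_magnitude` and the list-of-rows grayscale image format are hypothetical. Each output pixel depends only on its 3x3 neighborhood, so on a GPU the two outer loops would become one work-item per pixel.

```python
import math

# Sobel kernels for horizontal (gx) and vertical (gy) intensity gradients.
GX_KERNEL = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY_KERNEL = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a grayscale image.

    `image` is assumed to be a list of equal-length rows of numbers.
    Border pixels are left at 0.0 because they lack a full 3x3 neighborhood.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [[0.0] * width for _ in range(height)]

    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = 0
            gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX_KERNEL[dy][dx] * pixel
                    gy += GY_KERNEL[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out
```

Swapping the two kernels for a different 3x3 kernel (a blur or a sharpen, for example) gives other filters with the same loop structure.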

rusanu · 2017-09-17 · Original thread
I once read an excellent book, The Art of Multiprocessor Programming [0]. All examples in the book are in Java. After reading it, when thinking about some of the solutions and algorithms presented, I quickly concluded that implementing them in C/C++ would be an order of magnitude more complex, because of 'memory reclamation'. So yes, I think the OP makes a good point.

[0] https://www.amazon.com/Art-Multiprocessor-Programming-Revise...
