https://www.amazon.com/Handbook-Artificial-Intelligence-Avro...
https://mitpress.mit.edu/books/parallel-distributed-processi...
https://www.amazon.com/Neural-Massively-Parallel-Computers-B...
https://www.amazon.com/Connection-Machine-Press-Artificial-I...
https://www.amazon.com/Legged-Robots-Balance-Artificial-Inte...
https://www.amazon.com/Machines-That-Walk-Adaptive-Suspensio...
...and more.
Here's a great video describing the architecture of the CM-5 [1]. Note how similar the programming concepts are to CUDA, at least at an abstract level. In the 80s Hillis also published his MIT thesis as a book, The Connection Machine [2]:
https://www.amazon.com/Connection-Machine-Press-Artificial-I...
An incredibly well-written and fascinating read, just as relevant today for programming a GPU as it was for programming the ancient beast of a CM-2. It's about algorithms, graphs, map/reduce, and other techniques of parallelism pioneered at Thinking Machines.
For example, Guy Blelloch worked at TM and pioneered prefix scans on these machines, techniques now in common use on GPUs.
https://www.youtube.com/watch?v=_5sM-4ODXaA
http://uenics.evansville.edu/~mr56/ece757/DataParallelAlgori...
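To make the connection concrete, here's a minimal sketch of Blelloch's work-efficient exclusive scan. It's written serially in Python, but it keeps the two-phase up-sweep/down-sweep structure that GPU implementations use; each inner loop's iterations are independent and would map to one thread apiece.

```python
def blelloch_exclusive_scan(xs):
    """Work-efficient exclusive prefix sum (Blelloch scan).

    Serial sketch of the parallel algorithm; on a GPU each
    inner-loop iteration would run on its own thread.
    Assumes len(xs) is a power of two.
    """
    a = list(xs)
    n = len(a)

    # Up-sweep (reduce): build partial sums up a binary tree.
    d = 1
    while d < n:
        for i in range(0, n, 2 * d):      # independent -> parallel
            a[i + 2 * d - 1] += a[i + d - 1]
        d *= 2

    # Down-sweep: push the partial sums back down the tree.
    a[n - 1] = 0
    d = n // 2
    while d >= 1:
        for i in range(0, n, 2 * d):      # independent -> parallel
            t = a[i + d - 1]
            a[i + d - 1] = a[i + 2 * d - 1]
            a[i + 2 * d - 1] += t
        d //= 2
    return a

print(blelloch_exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
# -> [0, 3, 4, 11, 11, 15, 16, 22]
```

Both phases do O(n) total work across O(log n) steps, which is why this variant beat the naive O(n log n) scan on the CM and still matters on GPUs.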
There's also been a lot of buzz lately on HN about APL. Many of Hillis' *Lisp ideas come from parallelizing array-processing primitives ("xectors" and "xappings"), ideas originating in APL, as he acknowledged in the paper describing the language:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108...
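To get a feel for the idea, here's a toy Python sketch of xappings: unordered index-to-value mappings whose elements conceptually live on separate processors, with the paper's alpha (elementwise apply) and beta (combine) operations. The names follow the Steele/Hillis paper loosely; the API here is mine, purely illustrative.

```python
from functools import reduce

# Toy model: a "xapping" is just a dict of index -> value, where
# each entry conceptually lives on its own processor, so operations
# apply to all entries "at once". Illustrative only, not *Lisp syntax.

def alpha(fn, *xappings):
    """Alpha notation: apply fn elementwise, over the indices
    common to all argument xappings."""
    shared = set(xappings[0]).intersection(*xappings[1:])
    return {i: fn(*(x[i] for x in xappings)) for i in shared}

def beta(fn, xapping):
    """Beta reduction: combine all values of a xapping with fn."""
    return reduce(fn, xapping.values())

prices = {"apple": 2, "pear": 3, "plum": 5}
counts = {"apple": 4, "pear": 1, "plum": 2}

totals = alpha(lambda p, c: p * c, prices, counts)
print(sorted(totals.items()))        # [('apple', 8), ('pear', 3), ('plum', 10)]
print(beta(lambda a, b: a + b, totals))  # -> 21
```

The alpha step is an elementwise map (one multiply per processor) and the beta step is a tree reduction, which is exactly the map/reduce pairing the book builds its parallel algorithms from.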
What's old is new... again.
Years ago I took a stab at implementing this language on CUDA, in a project I called hillisp [3]. Dynamic Parallelism didn't exist yet and my available hardware was pretty weak, but it was fun to learn more about CUDA, which was relatively early technology at the time.
[1] https://www.youtube.com/watch?v=Ua-swPZTeX4
[2] https://www.amazon.com/Connection-Machine-Press-Artificial-I...
[3] https://github.com/michelp/hillisp