Found in 1 comment on Hacker News
bra-ket · 2020-10-21 · Original thread
Dopamine rewards operate on a different time scale than the one required by these error correction models. I don't remember the exact paper and will need to look it up, but the difference in response times was orders of magnitude.

Edit: for an authoritative reference on biologically plausible learning, see anything by Edmund Rolls [1]. He explicitly stated in his recent book [2] that something like back-propagation, or any similar error correction mechanism, has no supporting evidence in the experimental data collected so far.

[1] https://www.oxcns.org/profile.html

[2] https://www.amazon.com/Cerebral-Cortex-Principles-Edmund-Rol...