One path we recommend for Java developers who are new to deep learning is to take the fast.ai class: http://course.fast.ai
From there, map what you learn to our Keras model import: https://deeplearning4j.org/model-import-keras
That will more or less get you up and running.
We also have my O'Reilly book out in early release: http://shop.oreilly.com/product/0636920035343.do
I used to teach at a data science bootcamp where many of the students got hired by big companies.
I've also been running a deep learning startup for the last few years and have hired quite a few people.
Many people on our team don't have PhDs but can still write backprop code, even for complex modules like Inception. A lot of my students didn't have PhDs either.
A few of us (me included) are self-taught. I've also coauthored the largest O'Reilly book on deep learning: http://shop.oreilly.com/product/0636920035343.do
One piece of advice I would offer is to build something that differentiates you from the rest. Many of these "medium thought pieces" you're talking about are actually very cool applications of deep learning. If you want to get hired for these kinds of roles, I would demonstrate that you understand how to build things with deep learning. The litmus test I would look for is "I trained a net from scratch and innovated in x way". Honestly, it's rare to find talent that does well at both software engineering and deep learning. I'm not convinced a PhD is a hard requirement.
I get that recruiters at these larger companies tend to look for the buzzwords and often can't tell the difference, so it's definitely harder going the traditional route.
Tech hiring also tends to be as much about networking as it is buzzword bingo, no matter what field you're in. If you can network a bit and build something cool that demonstrates an understanding of deep learning, I don't see the problem.
https://www.youtube.com/results?search_query=adam+gibson+dee...
http://www.slideshare.net/agibsonccc
http://shop.oreilly.com/product/0636920035343.do
I frequent the big data circles quite a bit. That is our main audience, not the DL research folks.
As far as my customers go, that's actually enough. You're right that it's still hard. I've done my fair share of outreach and speaking. Anyone who does their research will find ample evidence that we aren't just random folks off the street.
We built up that credibility over time. I'm also the creator of the DL4J framework itself, so in practice people see that we can build software.
Machine learning in production has 2 phases: training and inference (usage).
For training, we have Spark Docker images where you can run CUDA right from spark-submit.
For inference, we sit on top of DC/OS by Mesosphere and embed ConductR, the microservices technology from Lightbend (the company behind Scala), to scale out automatically on a Mesos-based cluster: http://www.slideshare.net/agibsonccc/deep-learning-in-produc...
Here is more on our enterprise distribution SKIL: http://www.slideshare.net/agibsonccc/skil-dl4j-in-the-wild-m...
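To make the training/inference split above concrete, here is a minimal sketch in plain Python. It is not DL4J, Spark, or SKIL code: the `train` and `predict` functions, the JSON model format, and the toy OR dataset are all made up for illustration. The only point is that training produces a serialized artifact, and inference only needs that artifact plus a cheap forward pass, which is why the two phases can run on very different infrastructure.

```python
import json
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=2000, lr=0.5):
    """Training phase: fit logistic regression with batch gradient descent.

    Returns the model serialized as JSON (a hypothetical format), standing in
    for whatever artifact a real training job would write out.
    """
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Prediction error for this sample drives the gradient.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return json.dumps({"weights": weights, "bias": bias})


def predict(serialized_model, x):
    """Inference phase: load the saved model and run a forward pass only."""
    model = json.loads(serialized_model)
    z = sum(w * xi for w, xi in zip(model["weights"], x)) + model["bias"]
    return sigmoid(z)


# Toy dataset (logical OR), used only to exercise both phases.
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 1, 1, 1]

model_blob = train(samples, labels)       # done once, offline
score = predict(model_blob, [1, 0])       # done per request, online
```

In a real deployment the serialized model would be written to shared storage by the training job and loaded by each inference service instance, so the serving side can scale out without touching the training cluster.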
If you're curious where the talent is, I cowrote the flagship O'Reilly book on deep learning: http://shop.oreilly.com/product/0636920035343.do
We also employ deep learning PhDs, with backgrounds ranging from deep learning research in health care to ex-NVIDIA and ex-Cloudera, among others.
While you are right that some feature engineering is needed, there's no reason DL can't be a part of your workflow.
https://www.slideshare.net/agibsonccc/anomaly-detection-and-...
https://www.slideshare.net/pacoid/humanintheloop-a-design-pa...
For more of the basics, my book on deep learning might help as well (minimal math compared with the standard textbook):
http://shop.oreilly.com/product/0636920035343.do