Reliable Machine Learning
by Cathy Chen, Niall Murphy, Kranti Parisa, D. Sculley, Todd Underwood
Description: Reliable Machine Learning provides practical guidance for implementing and maintaining machine learning systems across various organizational roles. It covers topics such as model monitoring, team management, and ensuring effective and accountable ML practices.
ISBN: 9781098106218
Found in 1 comment on Hacker News
hereonout2 · 2023-12-14 · Original thread
Things like this give a good overview of the problems being faced in productionising ML:

https://research.google/pubs/whats-your-ml-test-score-a-rubr...

Note they start to discuss things like unit testing, integration testing, processing pipelines, canary tests, rollbacks, etc. Sound familiar yet?

The same author has also written this book:

https://www.oreilly.com/library/view/reliable-machine-learni...

I don't see a software engineer's skills becoming redundant in this field, especially if you have a good level of experience in cloud infra and tooling. It seems more valuable than ever to me (e.g. I have worked with ML researchers who don't grasp HTTP, let alone know how to set up a fleet of servers to run a model they developed entirely in a Jupyter Notebook).

I have found it helpful to acquaint myself with the correct tools and terminology in order to speak the right language - there are specific tools lots of people use, such as Weights & Biases for "Experiment Tracking", and terms like "Model Repository", which is just what it sounds like. "Vector Databases" (Elasticsearch has had this feature for years), "Feature Stores" - these feel similar to Bigtable-type databases.

Reading up on a typical use case like "RAG - Retrieval Augmented Generation" is a good idea - alongside starting to think about how you'd actually build and deploy one.
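
To make the "how you'd actually build one" part concrete, here is a minimal sketch of just the retrieval half of RAG, under some loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory `DOCS` list stands in for a vector database, and every name here (`embed`, `cosine`, `retrieve`, `build_prompt`, `DOCS`) is hypothetical and made up for illustration. The generation step (sending the prompt to an LLM) is deliberately left out.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Augment the question with retrieved context before it goes to a model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical corpus; a real system would index far more text in a vector store.
DOCS = [
    "Canary releases send a small share of traffic to the new model first.",
    "Feature stores serve precomputed features to training and inference.",
    "Rollbacks restore the previous model version when metrics regress.",
]

if __name__ == "__main__":
    print(build_prompt("how do rollbacks work", DOCS, k=1))
```

The deployment questions start right where this sketch stops: where the index lives, how it gets refreshed, and how you monitor and roll back the retrieval and generation steps independently.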

Above all, having a decent background in cloud infra, engineering and how to optimise systems and code for production deployment at scale is very much in demand at the moment.

Being the person helping these teams of PhDs (many of whom have little industry experience) to productionise and deploy is where I am at right now - it feels like a fruitful place to be :)