-  Pattern Recognition and Machine Learning (Information Science and Statistics)
-  The Elements of Statistical Learning
-  Reinforcement Learning: An Introduction by Sutton and Barto
-  Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
-  Neural Network Methods for Natural Language Processing (Synthesis Lectures on Human Language Technologies) by Yoav Goldberg
Then some math tidbits:
-  Introduction to Linear Algebra by Strang
-  [PDF](http://users.isr.ist.utl.pt/~wurmd/Livros/school/Bishop%20-%...)
-  [amz](https://www.amazon.com/Reinforcement-Learning-Introduction-A...)
-  [site](https://www.deeplearningbook.org/)
-  [amz](https://www.amazon.com/Deep-Learning-Adaptive-Computation-Ma...)
-  [pdf](http://incompleteideas.net/book/bookdraft2017nov5.pdf)
-  [amz](https://www.amazon.com/Language-Processing-Synthesis-Lecture...)
-  [amz](https://www.amazon.com/Introduction-Linear-Algebra-Gilbert-S...)
For general machine learning, there are many, many books. A good intro is [2] and a more comprehensive, reference sort of book is [3]. Frankly, by this point, even the documentation and user guide of scikit-learn give a fairly good mathematical presentation of many algorithms. Another good reference book is [4].
Finally, I would also recommend supplementing some of that stuff with Bayesian analysis, which can address many of the same problems, or be intermixed with machine learning algorithms, but which is important for a lot of other reasons too (MCMC sampling, hierarchical regression, small data problems). For that I would recommend [5] and [6].
Stay away from bootcamps or books or lectures that seem overly branded with “data science.” This usually means more focus on data pipeline tooling, data cleaning, shallow details about a specific software package, and side tasks like wrapping something in a webservice.
That stuff is extremely easy to learn on the job and usually needs to be tailored differently for every project or employer, so it’s a relative waste of time unless it is the only way you can get a job.
[1]: <https://www.amazon.com/Deep-Learning-Adaptive-Computation-Ma...>
[2]: <https://www.amazon.com/Pattern-Classification-Pt-1-Richard-D...>
[3]: <https://www.amazon.com/Pattern-Recognition-Learning-Informat...>
[4]: <http://www.web.stanford.edu/~hastie/ElemStatLearn/>
[5]: <http://www.stat.columbia.edu/~gelman/book/>
[6]: <http://www.stat.columbia.edu/~gelman/arm/>