For general machine learning, there are many, many books. A good intro is [2] and a more comprehensive, reference sort of book is [3]. Frankly, by this point, even the documentation and user guide of scikit-learn give a fairly good mathematical presentation of many algorithms. Another good reference book is [4].
Finally, I would also recommend supplementing some of that stuff with Bayesian analysis, which can address many of the same problems, or be intermixed with machine learning algorithms, but which is important for a lot of other reasons too (MCMC sampling, hierarchical regression, small data problems). For that I would recommend [5] and [6].
Stay away from bootcamps or books or lectures that seem overly branded with “data science.” This usually means more focus on data pipeline tooling, data cleaning, shallow details about a specific software package, and side tasks like wrapping something in a web service.
That stuff is extremely easy to learn on the job and usually needs to be tailored differently for every project or employer, so it’s a relative waste of time unless it is the only way you can get a job.
[1]: < https://www.amazon.com/Deep-Learning-Adaptive-Computation-Ma... >
[2]: < https://www.amazon.com/Pattern-Classification-Pt-1-Richard-D... >
[3]: < https://www.amazon.com/Pattern-Recognition-Learning-Informat... >
[4]: < http://www.web.stanford.edu/~hastie/ElemStatLearn/ >
[5]: < http://www.stat.columbia.edu/~gelman/book/ >
[6]: < http://www.stat.columbia.edu/~gelman/arm/ >
Machine Learning: A Probabilistic Perspective, by Murphy.
Pattern Classification, by Duda et al.
The Elements of Statistical Learning, by Hastie et al. It is free from Stanford.
Mining of Massive Datasets, free from Stanford.
Bayesian Reasoning and Machine Learning, by Barber, freely available online.
Learning From Data, by Abu-Mostafa.
It comes with Caltech video lectures: http://work.caltech.edu/telecourse.html
Pattern Recognition and Machine Learning, by Bishop.
Information Theory, Inference, and Learning Algorithms, by MacKay, free.
Classification, Parameter Estimation and State Estimation, by van der Heijden.
Computer Vision: Models, Learning, and Inference, by Prince, available for free.
Probabilistic Graphical Models, by Koller. Has an accompanying course on Coursera.
It is very pragmatic, covering algorithms for many machine learning and artificial intelligence topics, from fitting functions for classification or regression to search processes. The authors have a strong industrial background in addition to their academic one.