For modeling I found Wooldridge's panel and cross-section data book very useful: https://www.amazon.com/Econometric-Analysis-Cross-Section-Pa...
Greene is a really useful reference text: https://www.amazon.com/Econometric-Analysis-8th-William-Gree...
For advanced stats theory, I recommend Casella and Berger: https://www.amazon.com/Statistical-Inference-George-Casella/...
Hope that helps!
The more closely a model is tailored to the problem at hand, the better it tends to perform. Supervised ML models make great starting points or baselines.
* Statistical Inference by Casella and Berger. This book has a very good reputation for building statistics from first principles. I won't link to them, but you can find full PDF scans online with a simple search. Amazon reviews: https://www.amazon.com/Statistical-Inference-Roger-Berger/dp...
* Statistics by Freedman, Pisani, and Purves has similarly strong reviews and can be easily found online. Amazon reviews: https://www.amazon.com/Statistics-Fourth-David-Freedman-eboo...
* Most of the textbooks for Berkeley's core data science curriculum are online. The curriculum is not purely statistics, but it 1) is taught in a modern style that makes use of computation and randomization and 2) uses tools that may be useful to learn:
1. https://inferentialthinking.com/chapters/intro.html (Data 8)
2. https://learningds.org/intro.html (Data 100)
3. http://prob140.org/textbook/content/README.html (Data 140)
4. https://data102.org/fa23/resources/#textbooks-from-previous-... (Data 102; this gets into machine learning and pure statistics)
The Berkeley curriculum is not the only one; there are tens, possibly hundreds, of online courses. The Berkeley curriculum is just 1) quite extensive and 2) the one I happened to read the most about when I was recently researching how data science is currently taught.
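To give a flavor of what "computation and randomization" can look like in practice, here is a minimal sketch of a two-sided permutation test for a difference in means. This is my own illustration, not taken from any of the books above; the function name `permutation_test`, its parameters, and the sample numbers are all made up for the example.

```python
import random


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (mean(a) - mean(b)) and the estimated
    p-value: the share of random relabelings whose absolute difference is
    at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n_a = len(a)
    observed = sum(a) / n_a - sum(b) / len(b)
    pooled = list(a) + list(b)
    n_b = len(pooled) - n_a

    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the pooled values, then treat the first n_a as "group a".
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point rounding in ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_resamples


# Hypothetical measurements, purely for illustration.
group_a = [12.1, 11.8, 13.0, 12.6, 12.9]
group_b = [10.2, 10.9, 11.1, 10.5, 11.4]

observed, p_value = permutation_test(group_a, group_b)
print(f"observed difference: {observed:.2f}, estimated p-value: {p_value:.4f}")
```

The idea is that if group labels did not matter, shuffling them should produce differences as large as the observed one fairly often. A small p-value suggests the labels do matter. No distributional formula is needed, which is roughly the appeal of this teaching style.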