Statistical Analysis with Missing Data (Wiley Series in Probability and Statistics)
by Roderick J. A. Little, Donald B. Rubin
ISBN: 0471183865
Found in 2 comments on Hacker News
jmalicki · 2014-12-19 · Original thread
RandomForest and other decision tree methods can actually handle missing data very well by treating an individual cell in the data matrix as missing, rather than discarding the entire row.

So the assertion that "All algorithms should operate only on data vectors and on frequency weights — they should have no knowledge of missing-ness." is false. There are many other fruitful ways to handle missing data, such as indicator variables and imputation. See http://www.amazon.com/Statistical-Analysis-Missing-Roderick-...
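
To make the indicator-variable and imputation idea concrete, here is a minimal sketch in plain Python. It is not taken from the book or the thread: the function name `impute_with_indicators`, the use of `None` to mark a missing cell, and the choice of mean imputation are all assumptions made for illustration. Each missing cell is filled with its column mean, and one 0/1 flag per column records where a value was missing, so no row is discarded.

```python
from statistics import mean


def impute_with_indicators(rows):
    """Mean-impute each column and append one 0/1 missingness flag per column.

    `rows` is a list of equal-length lists; None marks a missing cell
    (an assumed convention for this sketch).
    """
    n_cols = len(rows[0])

    # Column means are computed from the observed cells only.
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        col_means.append(mean(observed) if observed else 0.0)

    out = []
    for r in rows:
        filled = [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        flags = [1 if r[j] is None else 0 for j in range(n_cols)]
        out.append(filled + flags)
    return out


data = [[1.0, 10.0], [None, 20.0], [3.0, None]]
result = impute_with_indicators(data)
for row in result:
    print(row)
```

The flags let a downstream model learn that "this value was missing" can itself be informative, which is the opposite of hiding missingness from the algorithm. Single mean imputation is only the simplest option and tends to understate variance.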