Norvig expanded on this theme in a chapter in http://oreilly.com/catalog/9780596157111/ (not yet out). The draft I read applied the Google n-gram corpus to word segmentation, decryption, and a faster spelling corrector. Lovely and instructive code, as always.
Often, getting the right visual depends on first understanding what kind of data is available; this book drives that home.