Found in 3 comments on Hacker News
mindcrime · 2021-12-02 · Original thread
I don't even know if anybody has written a book specifically about search at "web scale" (no MongoDB jokes here, please). The closest things I know of would be something like:

https://www.amazon.com/Managing-Gigabytes-Compressing-Multim...

https://www.amazon.com/Information-Retrieval-Implementing-Ev...

https://www.amazon.com/Introduction-Information-Retrieval-Ch...

tedmiston · 2016-09-02 · Original thread
^ Great answer. So far, this is the only correct one in the thread.

I took Information Retrieval 101 in grad school and it was an interesting course. If you're curious to learn more, term frequency–inverse document frequency (tf–idf) is a good place to start. The underlying idea is surprisingly simple.

https://en.wikipedia.org/wiki/Tf–idf
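To make the "surprisingly simple" part concrete, here is a minimal sketch of one common tf-idf variant: term frequency normalised by document length, times log(N / df). The function name `tf_idf` and the toy documents are made up for illustration, and real systems usually add smoothing and other tweaks.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. This uses tf = count / doc length and
    idf = log(N / df), which is only one of several common weightings.
    """
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy corpus (illustrative only).
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
scores = tf_idf(docs)
```

A term that appears in every document (here "the") gets weight zero because log(N / N) = 0, while a term unique to one document (such as "mat") scores highest for that document.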

The same goes for the core of Google's (original) ranking algorithm, PageRank, which was inspired by academic citation analysis (the same family of ideas as the h-index).

https://en.wikipedia.org/wiki/PageRank
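The core of PageRank also fits in a few lines. Below is a rough sketch of the textbook power-iteration form, not Google's production system. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the tiny example graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1 - damping) / n + damping * dangling / n for p in pages
        }
        # Each page splits its current rank equally among its out-links.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Hypothetical four-page web: "c" is linked to by every other page.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

The ranks always sum to 1, and in this graph "c" ends up ranked highest because every other page links to it, while "d", which nothing links to, keeps only the baseline share.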

Also, the "standard" book that we used is quite readable: Introduction to Information Retrieval by Manning et al.

https://www.amazon.com/Introduction-Information-Retrieval-Ch...

ahi · 2011-07-09 · Original thread
I heartily recommend "Introduction to Information Retrieval": http://www.amazon.com/Introduction-Information-Retrieval-Chr...

Skim it once to collect vocabulary, then use it as a reference for IR algorithms.
