Found in 36 comments on Hacker News
bjourne · 2017-08-28 · Original thread
It depends on what "pursuing ML/AI" means. I've written a recommendation engine while barely understanding linear algebra and a spam filter without knowing Bayes' theorem. A programmer can work on ML systems without having a solid foundation in higher maths. However, if you want to develop your own solutions then you surely need the math.

I would recommend reading Toby Segaran's Programming Collective Intelligence:

jchung · 2015-01-08 · Original thread
For those interested in implementing other similar systems, the O'Reilly book on Collective Intelligence is exceptional:
mck- · 2014-04-09 · Original thread
Exactly right. I borrowed that term from Chapter 2 on Collective Intelligence [1]


ville · 2013-12-08 · Original thread
This looks nice. I've also heard many recommendations for the book Programming Collective Intelligence[1], which touches on the same subjects and also has examples in Python. Now I'm tempted to read both :)


kozlovsky · 2013-10-16 · Original thread
I cannot understand why the Delicious team hasn't implemented collaborative filtering [1]. I think this is the main benefit that the user can get from a social bookmarking site: the ability to see the "bookmarks of others that are similar to your bookmarks". The implementation is not too complex and is nicely covered in the book [2] with examples in Python. The book even has a special section, "Building a Link Recommender".

If I'm not mistaken, there were prototypes for doing collaborative filtering of the Delicious dataset written as standalone applications. But doing such filtering requires direct access to the database and cannot be done efficiently via HTTP. So, after the current team had bought Delicious from Yahoo, I thought that implementing collaborative filtering would be their main priority. Instead, they have concentrated their efforts on worsening and cluttering the UI experience, along with breaking various Delicious bookmarking plugins for popular browsers.

[1] [2]
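
A minimal sketch of the "bookmarks of others that are similar to your bookmarks" idea from the comment above, assuming each user's bookmarks are available as a set of URLs. The Jaccard similarity measure, the function names, and the sample data are all illustrative assumptions; this is not the book's code, nor anything Delicious is known to have run.

```python
from collections import defaultdict


def jaccard(a, b):
    """Similarity of two bookmark sets: shared links / all links."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_links(bookmarks, user, top_n=3):
    """Rank links the user has not saved, weighted by how similar
    the users who saved them are to this user."""
    mine = bookmarks[user]
    scores = defaultdict(float)
    for other, theirs in bookmarks.items():
        if other == user:
            continue
        sim = jaccard(mine, theirs)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        for link in theirs - mine:
            scores[link] += sim
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [link for link, _ in ranked[:top_n]]


# Hypothetical data: user names and URLs are made up for illustration.
bookmarks = {
    "alice": {"python.org", "numpy.org", "news.ycombinator.com"},
    "bob": {"python.org", "numpy.org", "scipy.org"},
    "carol": {"python.org", "knitting.example"},
    "dave": {"cooking.example"},
}

print(recommend_links(bookmarks, "alice"))
```

Comparing every pair of users this way needs the full bookmark table in one place, which is consistent with the point above that it cannot be done efficiently over HTTP.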

kops · 2013-08-30 · Original thread
Thanks, that seems to be the only reasonable explanation.

Comparing the prices of a book on oreilly [1] and [2] seems to suggest a print book has about a 50% margin (just a guess), while the ebook sells for almost 100% margin once it is produced. So knocking the price of the ebook down by 50% to drive sales would make sense. Thanks again,


[2] I assumed that these guys buy the book for 20 dollars and sell for 20 euros and make 25-30% profit. Great site btw with free worldwide shipping although delivery takes anywhere from 1 to 3 weeks.

ghc · 2013-08-05 · Original thread
I can't judge the Coursera course, but for anyone who is interested in this field and wants a gentle introduction, I highly recommend Programming Collective Intelligence. It covers many of the types of recommender systems that the Coursera course is likely to cover, and comes with a lot of nice Python code examples.

It's highly useful knowledge too. I ran across so many startups that needed recommender systems that I launched a company called ( to help companies without the expertise integrate recommendation systems and other types of algorithms into their projects.

Thanks @poof131 - I'd like to go deeper into algorithms / data manipulation / social network analysis (for my job), and also web programming using python (weekend reading).

I'm currently reading Python for Data Analysis but feel like I can read about how to use a library but it's hard to retain specific syntax use cases if I'm not using those libraries immediately / frequently.

One book I really like is Collective Intelligence, which has some good examples on social network analysis.

jclos · 2012-10-26 · Original thread
I would advise you to get started with these books, which are practice-oriented (rather than theory-oriented such as the Mitchell book): with the Weka toolkit and/or with the R language and/or with Python

As for the theory, someone made a nice review of 10 popular ML books here and is a nice book on inferential statistics.

zzzeek · 2012-07-30 · Original thread
This is a great use case for Bayesian filtering, and I bet you could even hook up an existing Bayes engine to this problem space, if not just write one up using common recipes. I've done great things with the one described in "Programming Collective Intelligence". Things like "googlebot", "certain IP range", "was observed clicking 100 links in < 1 minute", and "didn't have a referrer" all go into the hopper of observable "features", which are all aggregated to produce a corpus of bot/non-bot server hits. Run it over a gig of log files to train it and you'll have a decent system which you can tune to favor false negatives vs. false positives.

I'm sure Facebook engineers could come up with something way more sophisticated.
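
A minimal sketch of the feature-based Bayesian filtering described in this comment, assuming each server hit has already been reduced to a set of observed feature names. The class, the feature names, the `bot_bias` tuning knob, and the training data are hypothetical; this is a plain Bernoulli naive Bayes with add-one smoothing, not the classifier from the book.

```python
import math
from collections import defaultdict


class NaiveBayes:
    """Naive Bayes over sets of observed features, with add-one smoothing."""

    def __init__(self):
        self.label_counts = defaultdict(int)
        self.feature_counts = defaultdict(lambda: defaultdict(int))
        self.vocabulary = set()

    def train(self, features, label):
        self.label_counts[label] += 1
        for f in features:
            self.feature_counts[label][f] += 1
            self.vocabulary.add(f)

    def log_score(self, features, label):
        """log P(label) + sum of log P(feature present/absent | label).
        Assumes the label has been seen at least once in training."""
        total = sum(self.label_counts.values())
        n = self.label_counts[label]
        score = math.log(n / total)
        for f in self.vocabulary:
            p = (self.feature_counts[label][f] + 1) / (n + 2)
            score += math.log(p if f in features else 1 - p)
        return score

    def classify(self, features, bot_bias=0.0):
        """Return "bot" when its log score plus bot_bias beats "human".
        A negative bot_bias yields fewer false positives (humans flagged);
        a positive one yields fewer false negatives (bots missed)."""
        bot = self.log_score(features, "bot") + bot_bias
        human = self.log_score(features, "human")
        return "bot" if bot > human else "human"


# Hypothetical labelled hits; real training data would come from log files.
training = [
    ({"ua_googlebot", "no_referrer"}, "bot"),
    ({"fast_clicks", "no_referrer"}, "bot"),
    ({"fast_clicks", "known_ip_range"}, "bot"),
    ({"no_referrer"}, "human"),
    (set(), "human"),
    (set(), "human"),
]

clf = NaiveBayes()
for feats, label in training:
    clf.train(feats, label)

print(clf.classify({"fast_clicks", "no_referrer"}))
print(clf.classify(set()))
```

With this toy data the first hit is classified as a bot and the second as human. Shifting `bot_bias` is one simple way to trade false negatives against false positives, as the comment suggests.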

eloisius · 2012-05-04 · Original thread
I'm definitely grabbing Programming Collective Intelligence[1] and Machine Learning for Hackers[2]. Any recommendations based on those?



nazar · 2011-10-14 · Original thread
I am super surprised by your comment, because that's exactly the question I asked my friend yesterday. I am trying to implement a ranking system on my website, so I asked my more experienced friend about it and he recommended one book to me: Programming Collective Intelligence: Building Smart Web 2.0 Applications by Toby Segaran, O'Reilly

seancron · 2011-09-25 · Original thread
I'd also recommend Programming Collective Intelligence, which is filled with lots of Python code showing how different algorithms and techniques can be used.
ghc · 2011-09-22 · Original thread
I believe that people who are interested in the subject matter, rather than the Neo4j & Gremlin implementation, would be better served reading the excellent book Programming Collective Intelligence.
apsurd · 2011-08-26 · Original thread
Disclaimer: I'm a total n00b on this topic.

I'm trying to learn how to classify items as related within a dataset. I know does this (funded by yc, run by #wheels) so I had a look at their articles which are a helpful beginners intro.

In one of the articles #wheels recommends

so I'm going to pick that one up but to be clear I really have no idea if this book addresses graph theory specifically; at this point anything and everything is helpful to me.

natfriedman · 2011-05-05 · Original thread
Another option is "Programming Collective Intelligence," by Toby Segaran. I read through it recently on a long flight to Australia. It's one of the most straightforward AI books out there, presenting most of these algorithms in just a few pages with nice sample Python code and diagrams. A perfect intro/refresher, and it takes a web developer perspective on these techniques.

Since reading it I've noticed how many friends have it on their bookshelves.

Here's a link:

lookforipv6 · 2011-04-17 · Original thread
In "Programming Collective Intelligence" there are other examples if someone is interested.

user24 · 2011-01-10 · Original thread
This, I think, is the source code which accompanies this book:
gregschlom · 2010-12-03 · Original thread
There's an excellent chapter on automatic news article categorization in "Programming Collective Intelligence"

pjscott · 2010-11-29 · Original thread
If you're interested, there are a whole host of fun and useful machine learning techniques that are actually not as hard to understand and apply as they sound. The best introductory book that I know of is Programming Collective Intelligence, which is surprisingly clear, if a little vague on the theory:

Naive Bayesian classifiers are just one of the more popular types; others include Support Vector Machines (SVMs), decision trees (and their relatives, random forests), and a bunch more. If you'd like to play around with some, Weka is good open source software for this:

metamemetics · 2010-09-29 · Original thread
Not free, but I highly recommend "Programming Collective Intelligence" if you are looking for python examples applicable to web applications.
michaelhans · 2010-07-07 · Original thread

Depending on your project scope I really enjoyed 'Programming Collective Intelligence' by Toby Segaran. Some of the web API samples may be out of date at this point but I don't consider that a deal breaker if you can find it used cheap.

sketerpot · 2010-05-22 · Original thread
A good book for this sort of thing is Toby Segaran's excellent Programming Collective Intelligence. It walks you through this sort of fascinating thing with easy examples and clear explanations. It's sprinkled with simple Python code.

If you want a good introduction to Naive Bayesian classifiers, there was a pretty readable explanation in Artificial Intelligence: a Modern Approach. It's an expensive book, but I'm sure you can find a copy in any well-stocked university library.

masterbranch · 2010-05-19 · Original thread
Hi, For the algorithms our inspirational reading was "Programming Collective Intelligence"

As with all recommendation algorithms, the basis is the text, plus dictionaries of technologies, and of course a long time spent blacklisting words by hand.

We hadn't thought about SSL, but it could be a nice feature. It's already written down in our roadmap. I like it.

About give me some days and I'll put it. I can tell you when this is done through GitHub's message system.

We really appreciate and are considering every idea all of you are giving here. Thanks a lot.

gregschlom · 2010-05-04 · Original thread
I must cite Programming Collective Intelligence by Toby Segaran. Although not entirely focused on search engines, it's an awesome book for anyone who wants to get their hands on some of the most useful algorithms for web apps, without having to deal with the math.

I downloaded a torrent version, then bought the paperback version straight after.

I would add:

Programming Collective Intelligence (O'Reilly)

Practical Artificial Intelligence Programming in Java

kakooljay · 2009-10-06 · Original thread
There's tons of Python examples on Toby Segaran's blog:

I know you've read some Python books, but you might want to check out Segaran's books too. The examples in Programming Collective Intelligence are all Python too, I think.

mrduncan · 2009-04-18 · Original thread
My vote goes to Programming Collective Intelligence.
jlees · 2009-04-02 · Original thread
FYI, if you liked this talk you might also like 'Programming Collective Intelligence'. If you haven't read it already...

amix · 2009-02-02 · Original thread
I would recommend reading Programming Collective Intelligence. It features lots of coding examples and covers lots of topics (like recommendation systems, searching and ranking, document filtering, document grouping etc.)

It's way easier to start with it, than to learn statistics via pure theory.

Bjoern · 2009-01-06 · Original thread
The book he mentioned. It's worth a read if you are looking for a practical introduction to Machine Learning/Classification etc. (currently on my desk)

gtani · 2008-06-12 · Original thread
if you get the math in Norvig/Russell's AI text, and the 2 leading Natural Language processing texts (Jurafsky/Martin and Manning/Schuetze) you'll be golden.

Hot off the press, 3 weeks old

even hotter

data mining: there's a few good books, look at Amazon reviews (Han/Kamber, Witten/Frank, the one i have:

and a pretty accessible / non-dense intro, Programming Collective Intelligence

at · 2007-11-13 · Original thread
books about machine learning applications/tools:
- "Programming Collective Intelligence"
- "Data Mining: Practical Machine Learning Tools and Techniques"

books about machine learning background/theory:
- "machine learning"
- "learning and soft computing"

For a simple-to-use (Python-based) machine learning tool/API check out Orange:
