I would recommend reading Toby Segaran's Programming Collective Intelligence: http://shop.oreilly.com/product/9780596529321.do
If I'm not mistaken, there were prototypes for collaborative filtering of the Delicious dataset, written as standalone applications. But such filtering requires direct access to the database and cannot be done efficiently via HTTP. So, after the current team bought Delicious from Yahoo, I thought that implementing collaborative filtering would be their main priority. Instead, they have concentrated on worsening and cluttering the UI, along with breaking various Delicious bookmarking plugins for popular browsers.
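To make the point about database access concrete, here is a minimal sketch of user-based collaborative filtering over bookmark sets. The toy `bookmarks` data and the `jaccard` and `recommend` helpers are my own illustration, not anything Delicious actually ran. Note that it compares the target user against every other user, which is the kind of access pattern that is painful over HTTP.

```python
from collections import Counter

# Hypothetical toy data: user -> set of bookmarked URLs.
bookmarks = {
    "alice": {"a.com", "b.com", "c.com"},
    "bob": {"b.com", "c.com", "d.com"},
    "carol": {"x.com", "y.com"},
}


def jaccard(s, t):
    """Overlap of two sets: |s & t| / |s | t| (0.0 if both are empty)."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def recommend(user, data, top_n=3):
    """Score URLs the user has not saved by the similarity of users who saved them."""
    mine = data[user]
    scores = Counter()
    for other, theirs in data.items():
        if other == user:
            continue
        sim = jaccard(mine, theirs)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        for url in theirs - mine:
            scores[url] += sim
    return [url for url, _ in scores.most_common(top_n)]
```

Calling `recommend("alice", bookmarks)` suggests whatever similar users saved that alice has not. A real system would need a better similarity measure and some way to avoid the all-pairs scan.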
Comparing the prices of a book on oreilly and bookdepository.co.uk seems to suggest a print book has about a 50% margin (just a guess), while the ebook sells at almost 100% margin once it is produced. So knocking the price of the ebook down to 50% to drive sales would make sense.
http://www.bookdepository.co.uk/Programming-Collective-Intel... I assumed that these guys buy the book for 20 dollars and sell for 20 euros and make 25-30% profit. Great site btw with free worldwide shipping although delivery takes anywhere from 1 to 3 weeks.
It's highly useful knowledge too. I ran across so many startups that needed recommender systems that I launched a company called Algorithmic.ly (http://algorithmic.ly) to help companies without the expertise integrate recommendation systems and other types of algorithms into their projects.
I'm currently reading Python for Data Analysis, but I feel that while I can read about how to use a library, it's hard to retain specific syntax and use cases if I'm not using those libraries immediately or frequently.
One book I really like is Programming Collective Intelligence (http://shop.oreilly.com/product/9780596529321.do), which has some good examples of social network analysis.
As for the theory, someone made a nice review of 10 popular ML books here http://zinkov.com/posts/2012-10-04-ml-book-reviews/ and http://www.stat.cmu.edu/~larry/all-of-statistics/index.html is a nice book on inferential statistics.
I'm sure facebook engineers could come up with something way more sophisticated.
I'm trying to learn how to classify items as related within a dataset. I know http://directededge.com does this (funded by yc, run by #wheels), so I had a look at their articles, which are a helpful beginner's intro: http://directededge.com/tech.html
In one of the articles #wheels recommends http://oreilly.com/catalog/9780596529321
so I'm going to pick that one up, but to be clear, I really have no idea if this book addresses graph theory specifically; at this point anything and everything is helpful to me.
Since reading it I've noticed how many friends have it on their bookshelves.
Here's a link: http://oreilly.com/catalog/9780596529321
Naive Bayesian classifiers are just one of the more popular types; others include Support Vector Machines (SVMs), decision trees (and their relatives, random forests), and a bunch more. If you'd like to play around with some, Weka is good open source software for this:
Depending on your project scope I really enjoyed 'Programming Collective Intelligence' by Toby Segaran. Some of the web API samples may be out of date at this point but I don't consider that a deal breaker if you can find it used cheap.
If you want a good introduction to Naive Bayesian classifiers, there was a pretty readable explanation in Artificial Intelligence: a Modern Approach. It's an expensive book, but I'm sure you can find a copy in any well-stocked university library.
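If you'd rather see the idea in code first, here is a rough sketch of a Naive Bayes text classifier with add-one (Laplace) smoothing. It is my own toy version, not the one from either book, and the `NaiveBayes` class name and whitespace tokenization are assumptions made to keep it short.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return the label with the highest log posterior, or None if untrained."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            # log prior for the label
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                # add-one smoothing so unseen words do not zero out the product
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Usage is just a few `train(text, label)` calls followed by `classify(text)`. Working in log space avoids underflow when documents get long.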
As with all recommendation algorithms, the basis is the text, plus dictionaries of technologies, and of course a long time spent blacklisting words by hand.
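Roughly, the dictionary-plus-blacklist step could look like the sketch below. This is only an illustration under my own assumptions: the `TECH_TERMS` and `BLACKLIST` contents and the `extract_technologies` function are made up here and are not our actual code.

```python
import re

# Hypothetical dictionary of known technology names.
TECH_TERMS = {"python", "django", "redis", "go", "swift"}
# Hypothetical hand-maintained blacklist: dictionary entries that are too
# ambiguous in ordinary English text to be trusted as matches.
BLACKLIST = {"go", "swift"}


def extract_technologies(text, terms=TECH_TERMS, blacklist=BLACKLIST):
    """Return the sorted technology terms found in text, minus blacklisted ones."""
    # Keep '+' and '#' so names like "c++" or "c#" survive tokenization.
    tokens = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return sorted((tokens & terms) - blacklist)
```

The extracted terms can then be compared between profiles or projects to find related ones. Most of the real effort goes into growing the blacklist as false matches turn up.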
We hadn't thought about SSL, but it could be a nice feature. It's already written down in our roadmap. I like it.
As for http://www.foaf-project.org/, give me a few days and I'll add it. I can let you know when it's done through GitHub's message system.
We really appreciate and are considering every idea you are all giving here. Thanks a lot.
I downloaded a torrent version, then bought the paperback version straight after.
Programming Collective Intelligence (O'Reilly)
Practical Artificial Intelligence Programming in Java
I know you've read some Python books, but you might want to check out Segaran's books too. The examples in Programming Collective Intelligence [http://oreilly.com/catalog/9780596529321] are all in Python too, I think.
It's way easier to start with it than to learn statistics via pure theory.
Hot off the press, 3 weeks old
data mining: there are a few good books, look at Amazon reviews (Han/Kamber, Witten/Frank, the one I have:
and a pretty accessible / non-dense intro, Programming Collective Intelligence
books about machine learning background/theory:
- "machine learning"
- "learning and soft computing"
For a simple-to-use (Python-based) machine learning tool/API check out Orange: http://magix.fri.uni-lj.si/orange/