Manning: some books about search technologies
Disclaimer!
None of the following links are affiliate links, and I have no personal connection to Manning Publications.
Algorithms of the Intelligent Web
Topics of the book: search, data mining, classification, clustering, personalized recommendations, etc.
The emphasis is on the general principles and algorithms behind these processes.
The book has not yet been published (publication is scheduled for March 2009), but it is available for purchase through MEAP (Manning Early Access Program), which I took advantage of. I bought it for the chapters from the 3rd onward, but decided to read it from the beginning.
Collective Intelligence in Action
Very close to the first, but with more attention paid to tools: Lucene, Nutch, WEKA.
It should go to press on 17 October. Available, as most Manning books are, in PDF format. I could not decide which of the two to choose, but now I tend to think that I’ll buy this one, too.
Taming Text
Again, very close to the first two, but more narrowly specialized. The theme of the book is “how to cope with unstructured text.” So far, the site says, only half of the book is available through MEAP.
Hibernate Search in Action
Search once again, but now applied to specific technologies: Hibernate Core + Apache Lucene.
Lucene in Action, Second Edition
A new edition of the famous book on the famous search framework, Lucene. Originally a Java framework, Lucene has been ported to other languages and platforms. It has also given rise to other powerful and interesting projects: Solr, Hadoop (which began under the Lucene umbrella) and others.
Conclusion (more of a passing observation):
It seems Java has finally ceased to be perceived as “slow”, even in such performance-sensitive areas as search and the processing of large amounts of data. Or has a generation of programmers and book authors grown up in recent years who have completely forgotten C/C++?