Let's talk about search technology. I will be continuing this series of search-related blog posts over time, so please visit frequently for updates.
Full-text search is a big factor in the success of any website or application. It is all about speed, accuracy and context sensitivity. Building a proper search architecture is a modern art, and it is what Google does so well. I am not going to talk about yet another web search engine, explain how Google does search, or design a new search algorithm. This is about the search technology behind applications (web or desktop applications that need search functionality).
Let's jump to the point. The basic idea is that you need to search data in a data store. It may be a relational database, a graph database, file storage, a distributed file system, a SAN, a cluster, anywhere. We have already done this in several ways, but we are not always happy with the speed, accuracy and context of the search results we get back, and we always try to improve in these areas. Here I am going to share some recent trends in this kind of practice. You may have heard about Apache Lucene, Xapian, MySQL full-text search, or the list of open source search engines in Java. I will be talking about Apache Lucene, as I am interested in Java and it has also proven to be a very fast and full-featured search engine.
"Apache Lucene(TM) is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform." - from the Apache Lucene website.
Many websites have adopted Lucene as their primary search library; LinkedIn.com and Twitter.com are among them. So what makes Lucene such a popular search library? I tried to find the answer. Lucene, together with the projects built on top of it such as Apache Solr, offers:
- Advanced Full-Text Search Capabilities
- Standards-Based Open Interfaces - XML, JSON and HTTP
- Extensible Plugin Architecture
- Faceted Search and Filtering
- Advanced, Configurable Text Analysis
- Rich Document Parsing and Indexing (PDF, Word, HTML, etc.) using Apache Tika
...and many more.
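To make the basic idea of full-text search a bit more concrete, here is a minimal sketch of an inverted index in plain Java: a map from each term to the documents that contain it. This is only a toy illustration of the general technique, not Lucene's API or implementation. The class name `TinyIndex`, its methods and the sample documents are all made up for this example, and the "analysis" step is just lower-casing and splitting on non-alphanumeric characters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each term to the ids of the documents containing it.
// Hypothetical illustration only; this is not how Lucene is actually implemented.
class TinyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Very naive text analysis: lower-case, then split on anything
    // that is not an ASCII letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stores the document and returns its id (its position in insertion order).
    int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // Returns the sorted ids of documents containing every query term (AND semantics).
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> ids = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    String get(int id) {
        return documents.get(id);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("Apache Lucene is a search library written in Java");
        index.add("MySQL offers full text search");
        index.add("Xapian is another search library");
        // Print every document that contains both "search" and "library".
        for (int id : index.search("search library")) {
            System.out.println(id + ": " + index.get(id));
        }
    }
}
```

The point of the structure is that a query only touches the entries for its own terms instead of scanning every document, which is where much of the speed of a full-text engine comes from. A real library like Lucene adds far more on top, such as relevance ranking and configurable analysis.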
OK, starting with the next post I will go into much more detail on Apache Lucene, with some examples.