Sunday, January 22, 2017

Lucene - Overview

Lucene is a simple yet powerful Java-based search library. It can be used to add search capability to any application. Lucene is an open-source project. It is a scalable, high-performance library for indexing and searching virtually any kind of text. The Lucene library provides the core operations required by any search application: indexing and searching.

How does a search application work?

Any search application performs some or all of the following operations.

Step | Title | Description
1 | Acquire Raw Content | The first step of any search application is to collect the target content on which searches are to be conducted.
2 | Build the document | The next step is to build documents from the raw content in a form that the search application can understand and interpret easily.
3 | Analyze the document | Before indexing starts, the document is analyzed to determine which parts of the text are candidates for indexing. This process is called analyzing the document.
4 | Indexing the document | Once documents are built and analyzed, the next step is to index them so that they can be retrieved based on certain keys instead of their entire contents. The indexing process is similar to the index at the end of a book, where common words are listed with their page numbers so that they can be found quickly without searching the whole book.
5 | User Interface for Search | Once the index database is ready, the application can perform searches. To let a user search, the application must provide a user interface where the user can enter text and start the search process.
6 | Build Query | Once a user submits a search request, the application should prepare a Query object from that text, which can be used to query the index database for the relevant details.
7 | Search Query | Using the Query object, the index database is then checked to get the relevant details and the matching documents.
8 | Render Results | Once the results are received, the application should decide how to show them to the user through the user interface, for example how much information to show at first glance.
Apart from these basic operations, a search application can also provide an administration user interface that lets administrators control the level of search based on user profiles. Analytics of search results is another important and advanced aspect of any search application.
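
To make steps 2 to 7 more concrete, here is a minimal sketch of the same pipeline in plain Java. It is a toy illustration and does not use Lucene's API: the class name TinySearchIndex, its methods, and the stop-word list are all assumptions made up for this example. It builds a "book index" style map from each word to the documents that contain it, then answers a query by intersecting those entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy illustration only: plain Java, NOT Lucene's API. All names are hypothetical.
class TinySearchIndex {
    // Assumed stop-word list; words in it are never indexed.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "is", "of", "to", "and"));

    // Step 2: a "document" is just the stored text; its list position is its id.
    private final List<String> documents = new ArrayList<>();
    // Step 4: term -> ids of the documents containing it (like the index of a book).
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    // Step 3: lower-case the text, split on non-alphanumerics, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Steps 2-4: store the document, analyze it, and record each term in the index.
    int addDocument(String rawContent) {
        int docId = documents.size();
        documents.add(rawContent);
        for (String term : analyze(rawContent)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Steps 6-7: analyze the user's text the same way, then return the ids of
    // documents that contain ALL query terms, in ascending order.
    List<Integer> search(String userText) {
        Set<Integer> result = null;
        for (String term : analyze(userText)) {
            Set<Integer> ids = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    String getDocument(int docId) {
        return documents.get(docId);
    }

    public static void main(String[] args) {
        TinySearchIndex index = new TinySearchIndex();
        index.addDocument("Lucene is a Java search library");
        index.addDocument("Indexing makes search fast");
        index.addDocument("Java is a programming language");
        // Step 8 (rendering) is reduced to printing the matching documents.
        for (int docId : index.search("java search")) {
            System.out.println(docId + ": " + index.getDocument(docId));
        }
    }
}
```

Running main prints "0: Lucene is a Java search library", because only the first document contains both query words. A real search library does far more than this (relevance scoring, richer analysis, on-disk index storage), so treat the sketch as a way to picture the steps, not as a description of how Lucene is implemented.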

Lucene's role in a search application

Lucene plays a role in steps 2 through 7 mentioned above and provides classes for the required operations. In a nutshell, Lucene works as the heart of a search application and provides the vital operations for indexing and searching. Acquiring content and displaying results are left to the application. Let's start with a first simple search application using the Lucene search library in the next chapter.
