Lucene is a simple yet powerful Java-based search library. It can be
used in any application to add search capability. Lucene is an
open-source project. It is a scalable, high-performance library used to
index and search virtually any kind of text. The Lucene library provides
the core operations required by any search application:
indexing and searching.
Apart from these basic operations, a search application can also
provide an administration user interface that lets administrators
control the level of search based on user profiles.
Analytics of search results is another important and advanced aspect of
any search application.
How a Search Application Works
Any search application performs some or all of the following operations.

Step | Title | Description |
---|---|---|
1 | Acquire Raw Content | The first step of any search application is to collect the target content on which searches are to be conducted. |
2 | Build the document | The next step is to build documents from the raw content in a form that the search application can understand and interpret easily. |
3 | Analyze the document | Before indexing starts, the document is analyzed to determine which parts of the text are candidates to be indexed. This process is called analyzing the document. |
4 | Indexing the document | Once documents are built and analyzed, the next step is to index them so that they can be retrieved based on certain keys instead of their whole contents. Indexing is similar to the index at the end of a book, where common words are listed with their page numbers so that they can be found quickly without searching the complete book. |
5 | User Interface for Search | Once a database of indexes is ready, the application can perform searches. To let a user make a search, the application must provide a user interface where the user can enter text and start the search process. |
6 | Build Query | Once the user makes a request to search for some text, the application should prepare a Query object from that text, which can be used to query the index database for the relevant details. |
7 | Search Query | Using the Query object, the index database is checked to get the relevant details and the matching documents. |
8 | Render Results | Once the results are received, the application should decide how to show them to the user through the user interface, for example how much information to show at first glance. |
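
Steps 2 to 7 in the table can be illustrated with a toy in-memory inverted index. This is only a minimal sketch in plain Java: the class `TinySearch` and its methods `analyze`, `add`, `search`, and `get` are hypothetical names invented for this illustration and are not part of Lucene's API, and the analysis step (lowercasing and splitting on non-alphanumeric characters) is an assumed simplification of what a real analyzer does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index for illustration only; this is NOT Lucene's API.
class TinySearch {
    // Step 2: a "document" here is simply its text, identified by list position.
    private final List<String> documents = new ArrayList<>();
    // Step 4: the index maps each term to the ids of documents containing it.
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Step 3: analyze text into lowercase terms, splitting on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Steps 2 and 4: store the document and index each of its terms.
    int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String term : analyze(text)) {
            index.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // Steps 6 and 7: turn user text into query terms (all must match)
    // and look them up in the index instead of scanning every document.
    List<Integer> search(String userText) {
        Set<Integer> hits = null;
        for (String term : analyze(userText)) {
            Set<Integer> postings = index.getOrDefault(term, new TreeSet<>());
            if (hits == null) {
                hits = new TreeSet<>(postings);
            } else {
                hits.retainAll(postings);
            }
        }
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    String get(int id) {
        return documents.get(id);
    }

    public static void main(String[] args) {
        TinySearch engine = new TinySearch();
        engine.add("Lucene is a search library");
        engine.add("Eclipse is an IDE");
        engine.add("A search application indexes documents");
        // Step 8 (rendering) stays with the application: print matching documents.
        for (int id : engine.search("search")) {
            System.out.println(id + ": " + engine.get(id));
        }
    }
}
```

As in the book-index analogy, a lookup touches only the entries for the query terms, so the documents themselves are read only when results are rendered.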
Lucene's role in a search application
Lucene plays a role in steps 2 to 7 above and provides classes for the required operations. In a nutshell, Lucene works as the heart of a search application and provides the vital operations pertaining to indexing and searching. Acquiring content and displaying the results are left to the application. Let's start with a first simple search application using the Lucene search library in the next chapter.

Lucene - Environment Setup
Environment Setup
This tutorial will guide you on how to prepare a development environment to start your work with Lucene. It will also teach you how to set up JDK, Tomcat and Eclipse on your machine before you set up Lucene:

Step 1 - Setup Java Development Kit (JDK)
You can download the latest version of the JDK from Oracle's Java site: Java SE Downloads. You will find instructions for installing the JDK in the downloaded files; follow them to install and configure the setup. Finally, set the PATH and JAVA_HOME environment variables to refer to the directory that contains java and javac, typically java_install_dir/bin and java_install_dir respectively.

If you are running Windows and installed the JDK in C:\jdk1.6.0_15, you would have to put the following lines in your C:\autoexec.bat file.

    set PATH=C:\jdk1.6.0_15\bin;%PATH%
    set JAVA_HOME=C:\jdk1.6.0_15

Alternatively, on Windows NT/2000/XP, you could also right-click on My Computer, select Properties, then Advanced, then Environment Variables. Then, you would update the PATH value and press the OK button.

On Unix (Solaris, Linux, etc.), if the JDK is installed in /usr/local/jdk1.6.0_15 and you use the C shell, you would put the following into your .cshrc file.

    setenv PATH /usr/local/jdk1.6.0_15/bin:$PATH
    setenv JAVA_HOME /usr/local/jdk1.6.0_15

Alternatively, if you use an Integrated Development Environment (IDE) like Borland JBuilder, Eclipse, IntelliJ IDEA, or Sun ONE Studio, compile and run a simple program to confirm that the IDE knows where you installed Java; otherwise, set it up as described in the IDE's documentation.
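
A simple program for this check might look like the sketch below. The class name `JavaEnvCheck` and its `describe` method are hypothetical names chosen for this example. The program only reads standard system properties and the JAVA_HOME environment variable, so what it prints depends entirely on your machine.

```java
// Reports which Java runtime is executing this program.
// JavaEnvCheck is a hypothetical name used only for this illustration.
class JavaEnvCheck {
    // Builds a short summary of the running Java runtime.
    static String describe() {
        String version = System.getProperty("java.version");
        String home = System.getProperty("java.home");
        // JAVA_HOME is null when the environment variable is not set.
        String javaHomeEnv = System.getenv("JAVA_HOME");
        return "java.version=" + version + System.lineSeparator()
                + "java.home=" + home + System.lineSeparator()
                + "JAVA_HOME=" + (javaHomeEnv == null ? "(not set)" : javaHomeEnv);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the program compiles and runs, the tool chain can find Java. Comparing the reported java.home with JAVA_HOME shows whether both refer to the installation you intended; on some JDK versions java.home points to a jre subdirectory of the JDK.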
Step 2 - Setup Eclipse IDE
All the examples in this tutorial have been written using the Eclipse IDE, so I suggest you have the latest version of Eclipse installed on your machine.

To install the Eclipse IDE, download the latest Eclipse binaries from http://www.eclipse.org/downloads/. Once you have downloaded the installation, unpack the binary distribution into a convenient location, for example C:\eclipse on Windows or /usr/local/eclipse on Linux/Unix, and finally set the PATH variable appropriately.

Eclipse can be started by executing the following command on a Windows machine, or you can simply double-click on eclipse.exe:

    %C:\eclipse\eclipse.exe

Eclipse can be started by executing the following command on a Unix (Solaris, Linux, etc.) machine:

    $/usr/local/eclipse/eclipse

After a successful startup, if everything is fine, the Eclipse main window should be displayed.