Sunday, January 22, 2017

PDFBox - Overview

The Portable Document Format (PDF) is a file format that presents data in a manner independent of application software, hardware, and operating systems.
Each PDF file holds a description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it.

PDFBox - Environment

Installing PDFBox

Following are the steps to download Apache PDFBox −
Step 1 − Open the homepage of Apache PDFBox by clicking on the following link − https://pdfbox.apache.org/

PDFBox - Creating a PDF Document

Let us now understand how to create a PDF document using the PDFBox library.

Creating an Empty PDF Document

You can create an empty PDF document by instantiating the PDDocument class. You can then save the document in your desired location using the save() method.
Following are the steps to create an empty PDF document.

PDFBox - Adding Pages

In the previous chapter, we have seen how to create a PDF document. After creating a PDF document, you need to add pages to it. Let us now understand how to add pages in a PDF document.

PDFBox - Loading a Document

In the previous examples, you have seen how to create a new document and add pages to it. This chapter teaches you how to load a PDF document that already exists in your system, and perform some operations on it.

PDFBox - Removing Pages

Let us now learn how to remove pages from a PDF document.

Removing Pages from an Existing Document

You can remove a page from an existing PDF document using the removePage() method of the PDDocument class.

PDFBox - Document Properties

Like other files, a PDF document also has document properties. These properties are key-value pairs. Each property gives particular information about the document.
Following are the properties of a PDF document −

PDFBox - Adding Text

In the previous chapter, we discussed how to add pages to a PDF document. In this chapter, we will discuss how to add text to an existing PDF document.

PDFBox - Adding Multiple Lines

In the example provided in the previous chapter, we discussed how to add text to a page in a PDF. With that program, however, you can add only text that fits on a single line. If you try to add more content, the text that exceeds the line space is not displayed.
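One way to prepare long text for a page is to split it into lines before writing them one at a time. The following is a minimal sketch in plain Java, not PDFBox code. The LineWrapper class and its wrap method are hypothetical names, and the sketch assumes line width can be measured in characters, whereas a real page measures width using font metrics.

```java
import java.util.ArrayList;
import java.util.List;

final class LineWrapper {
    // Splits text into lines of at most maxChars characters, breaking at spaces.
    // A single word longer than maxChars is kept whole on its own line.
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            // Start a new line when adding this word would exceed the limit.
            if (current.length() > 0 && current.length() + 1 + word.length() > maxChars) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "This text is too long to fit on a single line of the page";
        for (String line : wrap(text, 20)) {
            System.out.println(line);
        }
    }
}
```

Each returned line can then be written to the page separately, moving to a new line between them.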

PDFBox - Reading Text

In the previous chapter, we have seen how to add text to an existing PDF document. In this chapter, we will discuss how to read text from an existing PDF document.

PDFBox - Inserting Image

In the previous chapter, we have seen how to extract text from an existing PDF document. In this chapter, we will discuss how to insert an image into a PDF document.

PDFBox - Encrypting a PDF Document

In the previous chapter, we have seen how to insert an image in a PDF document. In this chapter, we will discuss how to encrypt a PDF document.

PDFBox - JavaScript in PDF Document

In the previous chapter, we have seen how to encrypt a PDF document. In this chapter, we will discuss how to add JavaScript to a PDF document.

PDFBox - Splitting a PDF Document

In the previous chapter, we have seen how to add JavaScript to a PDF document. Let us now learn how to split a given PDF document into multiple documents.

PDFBox - Merging Multiple PDF Documents

In the previous chapter, we have seen how to split a given PDF document into multiple documents. Let us now learn how to merge multiple PDF documents as a single document.

PDFBox - Extracting Image

In the previous chapter, we have seen how to merge multiple PDF documents. In this chapter, we will understand how to extract an image from a page of a PDF document.

PDFBox - Adding Rectangles

This chapter teaches you how to create colored boxes on a page of a PDF document.

Creating Boxes in a PDF Document

You can add rectangular boxes in a PDF page using the addRect() method of the PDPageContentStream class.

PDFBox - Quick Guide

The Portable Document Format (PDF) is a file format that presents data in a manner independent of application software, hardware, and operating systems.
Each PDF file holds a description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it.
There are several libraries available to create and manipulate PDF documents through programs, such as −
  • Adobe PDF Library − This library provides an API in languages such as C++, .NET, and Java. Using it, we can edit, view, print, and extract text from PDF documents.
  • Formatting Objects Processor − An open-source print formatter driven by XSL Formatting Objects and an output-independent formatter. The primary output target is PDF.
  • iText − This library provides an API in languages such as Java, C#, and other .NET languages. Using it, we can create and manipulate PDF, RTF, and HTML documents.
  • JasperReports − This is a Java reporting tool that generates reports in formats including PDF, Microsoft Excel, RTF, ODT, comma-separated values, and XML.

PDFBox - Useful Resources

The following resources contain additional information on PDFBox. Please use them to get more in-depth knowledge on this topic.

Useful Links on PDFBox

Discuss PDFBox

Apache PDFBox is an open-source Java library that supports the development and conversion of PDF documents. In this tutorial, we will learn how to use PDFBox to develop Java programs that can create, convert, and manipulate PDF documents.

Maven - Overview

What is Maven?

Maven is a project management and comprehension tool. It provides developers with a complete build lifecycle framework. A development team can automate the project's build infrastructure in almost no time, as Maven uses a standard directory layout and a default build lifecycle.

Maven - Environment Setup

Maven is a Java-based tool, so the very first requirement is to have a JDK installed on your machine.

System Requirement

JDK − 1.5 or above.
Memory − No minimum requirement.

Maven - POM

POM stands for Project Object Model. It is the fundamental unit of work in Maven. It is an XML file that always resides in the base directory of the project as pom.xml.
The POM contains information about the project and the various configuration details used by Maven to build the project(s).

Maven - Build Life Cycle

What is a Build Lifecycle?

A build lifecycle is a well-defined sequence of phases that determines the order in which goals are executed. Here, a phase represents a stage in the lifecycle.
As an example, a typical Maven build lifecycle consists of the following sequence of phases:
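The ordering rule can be modeled in a few lines of plain Java. This is only a sketch and does not call Maven. The enum lists the main phases of Maven's default lifecycle (validate, compile, test, package, verify, install, deploy), while the LifecycleSketch class and the phasesToRun method are hypothetical names. It shows that requesting one phase implies running every phase before it.

```java
import java.util.ArrayList;
import java.util.List;

final class LifecycleSketch {
    // Main phases of Maven's default lifecycle, in execution order.
    enum Phase { VALIDATE, COMPILE, TEST, PACKAGE, VERIFY, INSTALL, DEPLOY }

    // Requesting a phase runs every earlier phase first, then the phase itself.
    static List<Phase> phasesToRun(Phase requested) {
        List<Phase> result = new ArrayList<>();
        for (Phase phase : Phase.values()) {
            result.add(phase);
            if (phase == requested) {
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [VALIDATE, COMPILE, TEST, PACKAGE]
        System.out.println(phasesToRun(Phase.PACKAGE));
    }
}
```

This is why a command such as mvn package also compiles and tests the code before packaging it.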

Maven - Build Profiles

What is a Build Profile?

A build profile is a set of configuration values that can be used to set or override the default values of a Maven build. Using a build profile, you can customize the build for different environments, such as production versus development.
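The "set or override" behavior can be pictured as merging two maps. The sketch below is plain Java and is not how Maven is implemented. The ProfileSketch class, the apply method, and the property names and values are all hypothetical and used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class ProfileSketch {
    // Returns a copy of the defaults in which every profile value is set or overridden.
    static Map<String, String> apply(Map<String, String> defaults, Map<String, String> profile) {
        Map<String, String> effective = new HashMap<>(defaults);
        effective.putAll(profile);
        return effective;
    }

    public static void main(String[] args) {
        // Hypothetical default values used when no profile is active.
        Map<String, String> defaults = new HashMap<>();
        defaults.put("db.url", "jdbc:h2:mem:dev");
        defaults.put("log.level", "DEBUG");

        // Hypothetical production profile that overrides one value.
        Map<String, String> production = new HashMap<>();
        production.put("db.url", "jdbc:postgresql://prod-host/app");

        Map<String, String> effective = apply(defaults, production);
        System.out.println(effective.get("db.url"));    // overridden by the profile
        System.out.println(effective.get("log.level")); // inherited from the defaults
    }
}
```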

Maven - Repositories

What is a Maven Repository?

In Maven terminology, a repository is a place, i.e. a directory, where all the project jars, library jars, plugins, and any other project-specific artifacts are stored and can be used by Maven easily.
Maven repositories are of three types:

Maven - Plug-ins

What are Maven Plugins?

Maven is actually a plugin execution framework where every task is done by plugins. Maven plugins are generally used to:

Maven - Creating Project

Maven uses archetype plugins to create projects. To create a simple Java application, we'll use the maven-archetype-quickstart plugin. In the example below, we'll create a Maven-based Java application project in the C:\MVN folder.

Maven - Build & Test Project

What we learnt in the Project Creation chapter is how to create a Java application using Maven. Now we'll see how to build and test the application.
Go to the C:\MVN directory where you created your Java application. Open the consumerBanking folder. You will see the pom.xml file with the following contents.

Maven - External Dependencies

As you now know, Maven manages dependencies using the concept of Maven repositories. But what happens if a dependency is not available in the central repository or in any of the remote repositories? Maven handles this scenario using the concept of an external dependency.

Maven - Project Documents

This tutorial will teach you how to create the documentation of the application in one go. To start, go to the C:\MVN directory where you created your Java consumerBanking application. Open the consumerBanking folder and execute the following mvn command.
C:\MVN>mvn site

Maven - Project Templates

Maven provides users with a very large list of project templates (614 in number) using the concept of an archetype. Maven helps users quickly start a new Java project using the following command
mvn archetype:generate

Maven - Snapshots

A large software application generally consists of multiple modules, and it is a common scenario for multiple teams to work on different modules of the same application. For example, consider a team working on the front end of the application as the app-ui project (app-ui.jar:1.0) that uses the data-service project (data-service.jar:1.0).

Maven - Build Automation

Build automation defines the scenario where the build process of dependent project(s) starts once the project build completes successfully, in order to ensure that the dependent project(s) is/are stable.

Maven - Manage Dependencies

One of the core features of Maven is dependency management. Managing dependencies becomes a difficult task once we have to deal with multi-module projects (consisting of hundreds of modules/sub-projects). Maven provides a high degree of control to manage such scenarios.

Maven - Deployment Automation

In project development, a deployment process normally consists of the following steps
  • Check in the code from all projects in progress into SVN or another source code repository and tag it.
  • Download the complete source code from SVN.

Maven - Web Application

This tutorial will teach you how to manage a web-based project using Maven. Here you will learn how to create, build, deploy, and run a web application:

Maven - Eclipse IDE

Eclipse provides an excellent plugin, m2eclipse, which seamlessly integrates Maven and Eclipse.
Some of the features of m2eclipse are listed below

Maven - NetBeans

NetBeans 6.7 and newer have built-in support for Maven. For previous versions, the Maven plugin is available in the Plugin Manager. We're using NetBeans 6.9 in this example.
Some of the features of NetBeans are listed below

Maven - IntelliJ IDEA IDE Integration

IntelliJ IDEA has built-in support for Maven. We're using IntelliJ IDEA Community Edition 11.1 in this example.
Some of the features of IntelliJ IDEA are listed below

Maven Questions and Answers

Maven Questions and Answers has been designed with a special intention of helping students and professionals preparing for various Certification Exams and Job Interviews. This section provides a useful collection of sample Interview Questions and Multiple Choice Questions (MCQs) and their answers with appropriate explanations.

Maven - Quick Guide

Maven - Overview

What is Maven?

Maven is a project management and comprehension tool. It provides developers with a complete build lifecycle framework. A development team can automate the project's build infrastructure in almost no time, as Maven uses a standard directory layout and a default build lifecycle.

Maven - Useful Resources

The following resources contain additional information on Maven. Please use them to get more in-depth knowledge on this topic.

Discuss Maven

Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.
This tutorial will teach you how to use Maven in your day-to-day project development using Java or any other programming language.

Lucene - Overview

Lucene is a simple yet powerful Java-based search library. It can be used in any application to add search capability. Lucene is an open-source project. It is a scalable, high-performance library used to index and search virtually any kind of text. The Lucene library provides the core operations required by any search application: indexing and searching.
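To make the two core operations concrete, here is a minimal sketch of an inverted index in plain Java. It does not use Lucene's API, and a real Lucene index is far more elaborate. The TinyIndex class and its add and search methods are hypothetical names chosen for this illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TinyIndex {
    // Maps each term to the ids of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Indexing: split the text into lowercase terms and record the document id under each.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Searching: look the term up directly instead of scanning every document.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add(1, "Lucene is a search library");
        index.add(2, "Maven is a build tool");
        System.out.println(index.search("search"));
        System.out.println(index.search("is"));
    }
}
```

The work done once at indexing time is what makes each later search a cheap lookup.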

Lucene - Environment Setup

Environment Setup

This tutorial will guide you on how to prepare a development environment to start your work with Lucene. It will also teach you how to set up JDK, Tomcat, and Eclipse on your machine before you set up Lucene:

Lucene - First Application

Let us start actual programming with the Lucene framework. Before you start writing your first example using Lucene, make sure that you have set up your Lucene environment properly, as explained in the Lucene - Environment Setup tutorial. We also assume that you have a little working knowledge of the Eclipse IDE.

Lucene - Indexing Classes

The indexing process is one of the core functionalities provided by Lucene. The following diagram illustrates the indexing process and the use of classes. IndexWriter is the most important and core component of the indexing process.

Lucene - Searching Classes

The searching process is again one of the core functionalities provided by Lucene. Its flow is similar to that of the indexing process. A basic Lucene search can be made using the following classes, which can also be termed the foundation classes for all search-related operations.

Lucene - Indexing Process

The indexing process is one of the core functionalities provided by Lucene. The following diagram illustrates the indexing process and the use of classes. IndexWriter is the most important and core component of the indexing process.

Lucene - Indexing Operations

In this chapter, we'll discuss the four major operations of indexing. These operations are useful at various times and are used throughout a search application.

Lucene - Search Operation

The searching process is one of the core functionalities provided by Lucene. The following diagram illustrates the searching process and the use of classes. IndexSearcher is the most important and core component of the searching process.

Lucene - Query Programming

As we saw in the previous chapter, Lucene - Search Operation, Lucene uses IndexSearcher to make searches, and it takes a Query object created by QueryParser as input. In this chapter, we discuss various types of Query objects and ways to create them programmatically. Creating different types of Query objects gives control over the kind of search to be made.
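The idea of composing queries in code, rather than parsing them from a string, can be sketched in plain Java. This is an illustration only and does not use Lucene's Query classes. The QuerySketch class and its term and prefix methods are hypothetical, and they loosely mirror the roles that term, prefix, and boolean queries play in Lucene.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

final class QuerySketch {
    // A query is modeled as a test against the set of terms in one document.
    static Predicate<Set<String>> term(String wanted) {
        return terms -> terms.contains(wanted);
    }

    // Matches a document when any of its terms starts with the given prefix.
    static Predicate<Set<String>> prefix(String start) {
        return terms -> terms.stream().anyMatch(t -> t.startsWith(start));
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("lucene", "search", "library"));

        // Boolean combination: must contain "lucene", must have a term starting
        // with "sea", and must not contain "maven".
        Predicate<Set<String>> query =
                term("lucene").and(prefix("sea")).and(term("maven").negate());

        System.out.println(query.test(doc)); // prints true
    }
}
```

Building the query from small parts is what gives the caller control over the kind of search performed.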

Lucene - Analysis

As we saw in an earlier chapter, Lucene - Indexing Process, Lucene uses IndexWriter, which analyzes the Document(s) using the Analyzer and then creates, opens, or edits indexes as required. In this chapter, we discuss various types of Analyzer objects and other relevant objects used during the analysis process. Understanding the analysis process and how analyzers work will give you great insight into how Lucene indexes documents.
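As a rough picture of what analysis does to text before it is indexed, the following plain-Java sketch splits text into tokens, lowercases them, and drops common words. It is not Lucene's Analyzer API. The SimpleAnalysis class, the analyze method, and the stop-word list are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class SimpleAnalysis {
    // Hypothetical stop-word list for this sketch only.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "is", "of"));

    // Tokenizes on anything that is not a letter or digit, lowercases each token,
    // and removes stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{Nd}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [quick, brown, fox, friend, dog]
        System.out.println(analyze("The Quick Brown Fox is a friend of the Dog!"));
    }
}
```

The tokens that survive analysis are the terms that end up in the index, which is why the choice of analyzer affects what a search can find.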

Lucene - Sorting

In this chapter, we will look at the order in which Lucene returns search results by default, and at how that order can be changed as required.
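The difference between the default relevance ordering and a custom ordering can be sketched with ordinary Java comparators. This is a plain-Java model, not Lucene's sorting API. The SortSketch and Hit classes and the score values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortSketch {
    // Hypothetical search hit: a title field plus a relevance score.
    static final class Hit {
        final String title;
        final double score;

        Hit(String title, double score) {
            this.title = title;
            this.score = score;
        }
    }

    // Relevance ordering: highest score first.
    static List<Hit> byRelevance(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return sorted;
    }

    // Custom ordering: alphabetical by the title field, ignoring the score.
    static List<Hit> byTitle(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparing((Hit h) -> h.title));
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("beta", 0.2), new Hit("alpha", 0.9), new Hit("gamma", 0.5));
        for (Hit h : byRelevance(hits)) {
            System.out.println(h.title + " " + h.score);
        }
    }
}
```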

Lucene - Quick Guide

Lucene is a simple yet powerful Java-based search library. It can be used in any application to add search capability. Lucene is an open-source project. It is a scalable, high-performance library used to index and search virtually any kind of text. The Lucene library provides the core operations required by any search application: indexing and searching.

Lucene - Useful Resources

The following resources contain additional information on Lucene. Please use them to get more in-depth knowledge on this topic.

Discuss Lucene

Lucene is an open-source, Java-based search library. It is a very popular and fast search library used in Java-based applications to add document search capability in a very simple and efficient way.

log4j - Overview

log4j is a reliable, fast and flexible logging framework (APIs) written in Java, which is distributed under the Apache Software License.
log4j has been ported to the C, C++, C#, Perl, Python, Ruby, and Eiffel languages.

log4j - Installation

The log4j API package is distributed under the Apache Software License, a full-fledged open source license certified by the Open Source Initiative.
The latest log4j version, including full source code, class files, and documentation, can be found at http://logging.apache.org/log4j/.

log4j - Architecture

The log4j API follows a layered architecture where each layer provides different objects to perform different tasks. This layered architecture makes the design flexible and easy to extend in the future.
There are two types of objects available with the log4j framework.

log4j - Configuration

The previous chapter explained the core components of log4j. This chapter explains how you can configure the core components using a configuration file. Configuring log4j involves assigning the Level, defining Appender, and specifying Layout objects in a configuration file.

log4j - Sample Program

We have seen how to create a configuration file. This chapter describes how to generate debug messages and log them in a simple text file.
Following is a simple configuration file created for our example. Let us review it once again:

log4j - Logging Methods

The Logger class provides a variety of methods to handle logging activities. It does not allow us to instantiate a new Logger instance, but it provides two static methods for obtaining a Logger object −
  • public static Logger getRootLogger();
  • public static Logger getLogger(String name);
The first of the two methods returns the application instance's root logger, which does not have a name.
Any other named Logger object instance is obtained through the second method by passing the name of the logger. The name can be any string, usually a class or a package name, as we used in the last chapter and as shown below −
static Logger log = Logger.getLogger(log4jExample.class.getName());

Logging Methods

Once we obtain an instance of a named logger, we can use several methods of the logger to log messages. The Logger class has the following methods for printing the logging information.
# Method − Description
1 public void debug(Object message) − Prints messages with the level Level.DEBUG.
2 public void error(Object message) − Prints messages with the level Level.ERROR.
3 public void fatal(Object message) − Prints messages with the level Level.FATAL.
4 public void info(Object message) − Prints messages with the level Level.INFO.
5 public void warn(Object message) − Prints messages with the level Level.WARN.
6 public void trace(Object message) − Prints messages with the level Level.TRACE.
All the levels are defined in the org.apache.log4j.Level class and any of the above mentioned methods can be called as follows −
import org.apache.log4j.Logger;

public class LogClass {
   // One logger per class, named after the class.
   private static final Logger log = Logger.getLogger(LogClass.class);

   public static void main(String[] args) {
      // Log one message at each level, from least to most severe.
      log.trace("Trace Message!");
      log.debug("Debug Message!");
      log.info("Info Message!");
      log.warn("Warn Message!");
      log.error("Error Message!");
      log.fatal("Fatal Message!");
   }
}
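When you compile and run the LogClass program with the logger level set to DEBUG, it generates the following result. The trace message does not appear because TRACE ranks below DEBUG −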
Debug Message!
Info Message!
Warn Message!
Error Message!
Fatal Message!
Debug messages make more sense when they are used in combination with levels. We cover levels in the next chapter, after which you will have a good understanding of how to use these methods with different levels of debugging.

log4j - Logging Levels

The org.apache.log4j.Level class defines the following levels. You can also define your custom levels by sub-classing the Level class.
Level Description
ALL All levels including custom levels.
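How a level acts as a threshold can be modeled in plain Java. This sketch does not use the log4j classes. The LevelSketch class and the isEnabled method are hypothetical, while the enum follows the ordering of the log4j levels from ALL up to OFF.

```java
final class LevelSketch {
    // Same ordering as the log4j levels, from least to most severe.
    enum Level { ALL, TRACE, DEBUG, INFO, WARN, ERROR, FATAL, OFF }

    // A message is logged only when its level is at or above the logger's threshold.
    static boolean isEnabled(Level message, Level threshold) {
        return message.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        Level threshold = Level.DEBUG;
        Level[] messageLevels = {
            Level.TRACE, Level.DEBUG, Level.INFO, Level.WARN, Level.ERROR, Level.FATAL
        };
        for (Level level : messageLevels) {
            System.out.println(level + " -> " + isEnabled(level, threshold));
        }
    }
}
```

With the threshold at DEBUG, only TRACE is reported as disabled, which is why a trace message is dropped while debug and more severe messages are printed.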

log4j - Log Formatting

Apache log4j provides various Layout objects, each of which can format logging data according to various layouts. It is also possible to create a Layout object that formats logging data in an application-specific way.
All Layout objects receive a LoggingEvent object from the Appender objects.
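The role of a layout, turning one logging event into one formatted line, can be sketched in plain Java. This is not the log4j Layout API. The LayoutSketch and Event classes, the field names, and the output format are all assumptions made for illustration.

```java
final class LayoutSketch {
    // Hypothetical stand-in for the event an appender hands to a layout.
    static final class Event {
        final String level;
        final String loggerName;
        final String message;

        Event(String level, String loggerName, String message) {
            this.level = level;
            this.loggerName = loggerName;
            this.message = message;
        }
    }

    // Formats one event as "LEVEL [logger] message", padding the level to 5 characters.
    static String format(Event event) {
        return String.format("%-5s [%s] %s", event.level, event.loggerName, event.message);
    }

    public static void main(String[] args) {
        System.out.println(format(new Event("INFO", "com.example.App", "Started")));
        System.out.println(format(new Event("ERROR", "com.example.App", "Failed")));
    }
}
```

Swapping in a different format method changes how every message looks without touching the code that logs it, which is the separation a layout provides.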

log4j - Logging in Files

To write your logging information into a file, you would have to use org.apache.log4j.FileAppender.

FileAppender Configuration

FileAppender has the following configurable parameters:

log4j - Logging in Database

The log4j API provides the org.apache.log4j.jdbc.JDBCAppender object, which can put logging information in a specified database.

log4j Questions and Answers

log4j Questions and Answers has been designed with a special intention of helping students and professionals preparing for various Certification Exams and Job Interviews. This section provides a useful collection of sample Interview Questions and Multiple Choice Questions (MCQs) and their answers with appropriate explanations.

log4j - Quick Guide

log4j is a reliable, fast and flexible logging framework (APIs) written in Java, which is distributed under the Apache Software License.
log4j has been ported to the C, C++, C#, Perl, Python, Ruby, and Eiffel languages.

log4j - Useful Resources

The following resources contain additional information on log4j. Please use them to get more in-depth knowledge on this topic.

Discuss Log4J

log4j is a reliable, fast and flexible logging framework (APIs) written in Java, which is distributed under the Apache Software License. It is a popular logging package and has been ported to the C, C++, C#, Perl, Python, Ruby, and Eiffel languages.