Thursday, January 26, 2017

AWK - Quick Guide

AWK - Overview

AWK is an interpreted programming language. It is a powerful language designed specifically for text processing. Its name is derived from the family names of its authors − Alfred Aho, Peter Weinberger, and Brian Kernighan.

The version of AWK that GNU/Linux distributes is written and maintained by the Free Software Foundation (FSF); it is often referred to as GNU AWK.

Types of AWK

Following are the variants of AWK −
  • AWK − The original AWK from AT&T Bell Laboratories.
  • NAWK − A newer and improved version of AWK from AT&T Bell Laboratories.
  • GAWK − GNU AWK. Most GNU/Linux distributions ship GAWK or offer it as a package, although some install a different implementation such as mawk by default. It is compatible with AWK and NAWK.

Typical Uses of AWK

A myriad of tasks can be done with AWK. Listed below are just a few of them −
  • Text processing,
  • Producing formatted text reports,
  • Performing arithmetic operations,
  • Performing string operations, and many more.
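As a rough illustration of these uses, the sketch below turns a few input lines into a small formatted report, adding two numbers per line (arithmetic) and upper-casing the name (a string operation). The sample records and the shell variable name report are made up for this example.

```shell
# Hypothetical sample data: a name followed by two marks on each line.
report=$(printf 'asha 80 90\nravi 70 65\n' |
  awk '{ printf "%-6s %3d\n", toupper($1), $2 + $3 }')
printf '%s\n' "$report"
# ASHA   170
# RAVI   135
```

Here printf aligns the columns, toupper() handles the string conversion, and $2 + $3 is evaluated for every input line.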

AWK - Environment

This chapter describes how to set up the AWK environment on your GNU/Linux system.

Installation Using Package Manager

Generally, AWK is available by default on most GNU/Linux distributions. You can use the which command to check whether it is present on your system. If you don't have AWK, install it on Debian-based GNU/Linux using the Advanced Package Tool (APT) package manager as follows −
[jerry]$ sudo apt-get update
[jerry]$ sudo apt-get install gawk
Similarly, to install AWK on RPM-based GNU/Linux, use the Yellowdog Updater, Modified (yum) package manager as follows −
[root]# yum install gawk
After installation, ensure that AWK is accessible from the command line.
[jerry]$ which awk
On executing the above code, you get the following result −
/usr/bin/awk
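The same check can be scripted. The following is a minimal sketch that uses the POSIX shell built-in command -v in place of which, and then runs a trivial AWK program to confirm that the interpreter actually works. The variable names awk_path and awk_check, and the messages, are chosen only for this example.

```shell
if awk_path=$(command -v awk); then
  echo "awk found at: $awk_path"
  # Run a one-line program to confirm the interpreter starts correctly.
  awk_check=$(awk 'BEGIN { print "awk is working" }')
  echo "$awk_check"
else
  echo "awk not found; install it with your package manager" >&2
fi
```

The path that is printed depends on your system, so it may differ from the one shown above.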

Installation from Source Code

As GNU AWK is a part of the GNU project, its source code is available for free download. We have already seen how to install AWK using a package manager. Let us now understand how to install AWK from its source code.
The following steps apply to most GNU software and to many other freely available programs as well. Here are the installation steps −
Step 1 − Download the source code from a trusted source. The command-line utility wget serves this purpose.
[jerry]$ wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.1.tar.xz
Step 2 − Decompress and extract the downloaded source code.
[jerry]$ tar xvf gawk-4.1.1.tar.xz
Step 3 − Change into the extracted directory and run configure.
[jerry]$ cd gawk-4.1.1
[jerry]$ ./configure
Step 4 − Upon successful completion, configure generates a Makefile. To compile the source code, run the make command.
[jerry]$ make
Step 5 − You can run the test suite to ensure the build is clean. This is an optional step.
[jerry]$ make check
Step 6 − Finally, install AWK. Make sure you have super-user privileges.
[jerry]$ sudo make install
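That is it! You have successfully compiled and installed AWK. Verify it by executing the which command as follows −
[jerry]$ which awk
On executing this code, you get a result similar to the following. With the default configure prefix, the binary is installed under /usr/local, so the path may be /usr/local/bin/awk instead −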
/usr/bin/awk

AWK - Workflow

To become an expert AWK programmer, you need to know its internals. AWK follows a simple workflow − Read, Execute, and Repeat. The following diagram depicts the workflow of AWK −
[Figure: AWK Workflow]

Read

AWK reads a line from the input stream (file, pipe, or stdin) and stores it in memory.
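The three input sources mentioned above can be tried side by side. This is only a sketch: the temporary file, its contents, and the variable names are assumptions made for the example, and it relies on the mktemp utility being available.

```shell
# Create a small throwaway input file (contents are illustrative).
tmpfile=$(mktemp)
printf 'one\ntwo\n' > "$tmpfile"

from_file=$(awk '{ print }' "$tmpfile")        # input named as a file argument
from_pipe=$(cat "$tmpfile" | awk '{ print }')  # input arriving through a pipe
from_stdin=$(awk '{ print }' < "$tmpfile")     # input redirected to stdin

rm -f "$tmpfile"
printf '%s\n' "$from_file"
# one
# two
```

All three invocations print the same two lines, because AWK processes each source one line at a time in the same way.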

Execute

All AWK commands are applied sequentially to the input. By default, AWK executes the commands on every line. We can restrict this by providing patterns.
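To make the effect of a pattern concrete, the sketch below runs the same action twice on made-up data: once without a pattern, and once with the pattern $2 > 5 in front of it. The fruit names, the counts, and the variable names are all hypothetical.

```shell
input='apple 3
banana 12
cherry 7'

# No pattern: the action runs on every line.
all_lines=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern $2 > 5: the action runs only on lines whose second field exceeds 5.
filtered=$(printf '%s\n' "$input" | awk '$2 > 5 { print $1 }')

printf '%s\n' "$all_lines"
# apple
# banana
# cherry
printf '%s\n' "$filtered"
# banana
# cherry
```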

Repeat

This process repeats until the end of the input is reached.
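The repetition can be observed with the built-in variable NR, which holds the number of lines read so far. The sketch below is an illustration under assumptions: the input words are made up, and it uses an END block, which is not described in this part of the guide and which runs once after the input is exhausted.

```shell
trace=$(printf 'alpha\nbeta\ngamma\n' |
  awk '{ print "cycle " NR ": " $0 }
       END { print "input ended after " NR " lines" }')
printf '%s\n' "$trace"
# cycle 1: alpha
# cycle 2: beta
# cycle 3: gamma
# input ended after 3 lines
```

Each input line triggers one read and execute cycle, and the loop stops on its own when no lines remain.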

Program Structure

Let us now understand the program structure of AWK.

BEGIN block

The syntax of the BEGIN block is as follows −
Syntax
BEGIN {awk-commands}
The BEGIN block gets executed at program start-up. It executes only once. This is a good place to initialize variables. BEGIN is an AWK keyword and hence it must be in upper-case. Please note that this block is optional.
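A small sketch of variable initialization in a BEGIN block follows. The starting value 100, the input numbers, and the variable names total, running, and greeting are assumptions chosen for the example.

```shell
# BEGIN runs once, before any input line is read, so it can set a starting value.
running=$(printf '10\n20\n30\n' |
  awk 'BEGIN { total = 100; print "start: " total }
       { total += $1; print total }')
printf '%s\n' "$running"
# start: 100
# 110
# 130
# 160

# A program that consists only of a BEGIN block needs no input at all.
greeting=$(awk 'BEGIN { print "hello from BEGIN" }')
printf '%s\n' "$greeting"
# hello from BEGIN
```

The first line of output appears before any input is processed, which shows that BEGIN ran exactly once at start-up.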
