An Awk Primer/Standard Functions

Awk includes a number of predefined functions. The simplest is "length()", which returns the length of its parameter. If no parameter is specified, it returns the length of the current input line in characters. For example:

   {print length, $0}

-- prints each input line, preceded by its length. Given a string parameter, "length()" returns the number of characters in the string. Given a numeric parameter, "length()" first converts the number to a string using the default numeric format, and returns the length of that string; this is the same text that "print" would produce by default for the same number.
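
As a quick check of the three cases above, the following sketch calls "length()" with no parameter, with a string, and with numbers. The input line and the values are made up for illustration:

```shell
# No parameter: length of the current input line.
echo "hello world" | awk '{ print length, $0 }'      # 11 hello world

# String and numeric parameters; a number is converted to a string first.
awk 'BEGIN { print length("unforgettable") }'        # 13
awk 'BEGIN { print length(12345) }'                  # 5
awk 'BEGIN { print length(1/4) }'                    # 4, the length of "0.25"
```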


  • There are several predefined arithmetic functions:
   sqrt()     Square root.
   log()      Base-e log.
   exp()      Power of e.
   int()      Integer part of argument.
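
A minimal sketch exercising each of these functions, with arguments chosen only for illustration. Note that "int()" truncates toward zero rather than rounding:

```shell
awk 'BEGIN {
    print sqrt(16)       # 4
    print exp(0)         # 1
    print log(exp(1))    # natural log of e: 1
    print int(3.9)       # 3
    print int(-3.9)      # -3
}'
```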

The "exp()" function can also be used to compute powers of numbers other than e. Awk provides "^" as an exponentiation operator, so consider the expression:

   2^x

-- then if:

   2 = e^k

-- where "k" is the log to the base e of 2:

   k = log(2)

-- then:

   2^x = (e^k)^x = (e^log(2))^x = e^(x * log(2))

So Awk could compute the 20th power of 2 with:

   BEGIN {log_two = log(2); print exp(log_two * 20)}
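
The same identity works for any positive base. The sketch below wraps it in a user-defined function; the name "power" is hypothetical, not a built-in. Because the result comes from floating-point arithmetic, it may not be exactly integral, and a plain "print" could then show it in the default numeric format, so the example uses "printf" to control the formatting:

```shell
awk '
# power(b, x): hypothetical helper, valid for b > 0 only, since log() needs a positive argument.
function power(b, x) { return exp(x * log(b)) }
BEGIN {
    printf "%.0f\n", power(2, 20)    # 1048576, rounded to the nearest integer
    printf "%.4f\n", power(9, 0.5)   # 3.0000, same as sqrt(9)
}'
```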

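Sine ("sin()") and cosine ("cos()") are also supported by most versions of Awk; they are part of POSIX Awk, along with "atan2()", and take angles in radians. Only some very old versions lack them.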
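
Where "sin()", "cos()", and "atan2()" are available (they are specified by POSIX Awk), angles are given in radians. A small sketch, assuming such a version of Awk:

```shell
awk 'BEGIN {
    pi = atan2(0, -1)                 # a common idiom for obtaining pi
    printf "%.4f\n", pi               # 3.1416
    printf "%.4f\n", sin(pi / 2)      # 1.0000
    printf "%.4f\n", cos(pi)          # -1.0000
}'
```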

  • Awk, not surprisingly, includes a set of string-processing operations:
   substr()   As mentioned, extracts a substring from a string.
   split()    Splits a string into its elements and stores them in an array.
   index()    Finds the starting point of a substring within a string.

The "substr()" function has the syntax:

   substr(<string>,<start of substring>,<max length of substring>)

For example, to extract and print the word "get" from "unforgettable":

   BEGIN {print substr("unforgettable",6,3)}

Please be aware that the first character of the string is numbered "1", not "0". To extract a substring of at most ten characters, starting from position 6 of the first field variable, we use:

   substr($1,6,10)
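
The following sketch shows how the length parameter behaves as a maximum. It also uses the two-parameter form of "substr()", which is not covered above but is standard in Awk: omitting the length returns everything from the start position to the end of the string.

```shell
awk 'BEGIN {
    s = "unforgettable"
    print substr(s, 6, 3)     # get
    print substr(s, 6)        # gettable (length omitted: rest of the string)
    print substr(s, 11, 10)   # ble (fewer than 10 characters remain)
}'
```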

The "split()" function has the syntax:

   split(<string>,<array>,[<field separator>])

This function takes a string with n fields and stores the fields into array[1], array[2], ... , array[n]. If the optional field separator is not specified, the value of FS (normally "white space", the space and tab characters) is used. For example, suppose we have a field of the form:


   joe:frank:harry:bill:bob:sil

We could use "split()" to break it up and print the names as follows:

   my_string = "joe:frank:harry:bill:bob:sil";
   split(my_string,names,":");
   print names[1];
   print names[2];
   ...
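
Rather than printing each element by hand, a loop can walk the array. This sketch relies on the fact that "split()" returns the number of elements it stored; the variable names "n" and "i" are chosen only for illustration:

```shell
awk 'BEGIN {
    my_string = "joe:frank:harry:bill:bob:sil"
    n = split(my_string, names, ":")    # n is the number of elements stored (6 here)
    for (i = 1; i <= n; i++)
        print i, names[i]               # prints "1 joe" through "6 sil"
}'
```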

The "index()" function has the syntax:

   index(<target string>,<search string>)

-- and returns the position at which the search string begins in the target string (remember, the initial position is "1"). For example:

   index("gorbachev","bach")         returns:  4
   index("superficial","super")      returns:  1
   index("sunfire","fireball")       returns:  0
   index("aardvark","z")             returns:  0
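
Since a return value of 0 means "not found", "index()" is often tested before its result is passed to "substr()". A minimal sketch that splits a made-up "key=value" string at the first "=":

```shell
awk 'BEGIN {
    entry = "user=joe"                     # made-up sample value
    pos = index(entry, "=")                # 5
    if (pos > 0) {
        print substr(entry, 1, pos - 1)    # user
        print substr(entry, pos + 1)       # joe
    }
    print index("sunfire", "fireball")     # 0: not found
}'
```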
