Guide to Unix/Explanations/awk

Program Structure

awk programs consist of a sequence of one or more pattern-action statements:

pattern   { action }
pattern   { action }
 :
 :

awk reads its input one line at a time and, for each line, runs the action of every pattern that the line matches.
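As a minimal sketch of this structure, the program below has two pattern-action statements. The three-line input and the message text are made up for illustration. A line that matches both patterns triggers both actions, in the order the statements appear.

```shell
# Feed three sample lines to a program with two pattern-action statements.
out=$(printf '1 one\n2 two\n3 three\n' | awk '
    $1 >= 2       { print "big:", $2 }
    $2 == "three" { print "three is on line", NR }
')
printf '%s\n' "$out"
```

The first input line matches neither pattern and produces no output, the second matches only the first pattern, and the third matches both.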

Running AWK

Here we call awk from a shell script, awk1.sh:

#!/bin/bash

# awk1.sh: print every line of the file named by the first argument
awk '
       { print }
' "$1"

There is no pattern, so every line fed into awk matches and the action runs for each one. As a result, every line of the file is printed to the screen, and awk1.sh behaves much like cat.

To demonstrate, create the file numeric.dat with the contents:

1 one   i
2 two   ii
3 three iii
4 four  iv
5 five  v
6 six   vi
7 seven vii
8 eight viii
9 nine  ix
10 ten  x

Run awk1.sh on numeric.dat (don't forget to make the script executable):

./awk1.sh numeric.dat
1 one   i
2 two   ii
3 three iii
4 four  iv
5 five  v
6 six   vi
7 seven vii
8 eight viii
9 nine  ix
10 ten  x

(Notice the ./ prefix: it tells the shell to run the script from the current directory, which is usually not on the PATH.)
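A wrapper script is not required. As a rough sketch, the same program can be given directly on the command line, or stored in a file and loaded with the -f option. The file names sample.dat and print.awk are hypothetical, and the example assumes the common mktemp utility is available to create a scratch directory.

```shell
# Work in a throwaway directory so no existing files are overwritten.
dir=$(mktemp -d)
printf '1 one   i\n2 two   ii\n3 three iii\n' > "$dir/sample.dat"

# 1. Program text given inline as the first argument.
inline=$(awk '{ print }' "$dir/sample.dat")

# 2. The same program stored in a file and loaded with -f.
printf '{ print }\n' > "$dir/print.awk"
fromfile=$(awk -f "$dir/print.awk" "$dir/sample.dat")

printf '%s\n' "$inline"
rm -r "$dir"
```

Both invocations print the sample file unchanged. Keeping the program in its own file tends to be easier to maintain once it grows beyond a line or two.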

Expressions

If the first field is equal to 1, print the entire line:

#!/bin/sh
# awk1.sh
# Inside the quotes, $1 is awk's first field; outside, "$1" is the script's first argument.
awk '
    $1 == 1 { print $0 }
' "$1"

Results in:

1 one   i

If the second field is equal to "two" then print the entire line:

$2 == "two" { print $0 }

Results in:

2 two   ii

If the first field is greater than 5, print the third field:

$1 > 5 { print $3 }

Results in:

vi
vii
viii
ix
x
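Patterns are not limited to direct comparisons of a field against a constant; they can contain arithmetic as well. The following sketch, using a made-up four-line input, selects the lines whose first field is an even number and prints the first two fields.

```shell
# Select lines whose first field is even, using the modulo operator in the pattern.
out=$(printf '1 one\n2 two\n3 three\n4 four\n' | awk '
    $1 % 2 == 0 { print $1, $2 }
')
printf '%s\n' "$out"
```

The comma in the print statement separates the two fields with a single space in the output.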

Regular Expressions

Print the input line if the pattern "ix" is matched anywhere in the line:

/ix/ { print $0 }

Results in:

6 six   vi
9 nine  ix

Print the input line if the pattern "ix" is matched in the third field:

$3 ~ /ix/ { print $0 }

Results in:

9 nine  ix

Print the input lines that do not contain the pattern "x":

$0 !~ /x/ { print }

Results in:

1 one   i
2 two   ii
3 three iii
4 four  iv
5 five  v
7 seven vii
8 eight viii
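The patterns above match a substring anywhere in the text being tested. Anchors and repetition operators narrow the match. As an illustrative sketch on the first five lines of the sample data, the pattern below matches only when the whole third field consists of one or more "i" characters.

```shell
# ^ and $ anchor the match to the whole field; + means "one or more".
out=$(printf '1 one   i\n2 two   ii\n3 three iii\n4 four  iv\n5 five  v\n' | awk '
    $3 ~ /^i+$/ { print $1, $3 }
')
printf '%s\n' "$out"
```

The line with "iv" is skipped because the anchored pattern does not allow the trailing "v".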

Compound expressions

Print lines where the third field matches the pattern "x" OR the first field is less than or equal to 3.

$3 ~ /x/ || $1 <= 3 { print $0 }

Results in:

1 one   i
2 two   ii
3 three iii
9 nine  ix
10 ten  x

Print lines where the third field matches the pattern "vi" AND the second field begins with the letter "s".

$3 ~ /vi/ && $2 ~ /^s/ { print $0 }

Results in:

6 six   vi
7 seven vii
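Conditions can also be negated with ! and grouped with parentheses. The sketch below, run on the first six lines of the sample data, prints lines whose first field is not less than or equal to 3 and whose second field begins with "f".

```shell
# The parentheses matter: without them, ! would apply to $1 alone, not to the comparison.
out=$(printf '1 one   i\n2 two   ii\n3 three iii\n4 four  iv\n5 five  v\n6 six   vi\n' | awk '
    !($1 <= 3) && $2 ~ /^f/ { print $0 }
')
printf '%s\n' "$out"
```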

Ranges

Print every line from the first line whose second field equals "three" through the next line whose third field equals "vii", inclusive:

$2 == "three", $3 == "vii" { print $0 }

Results in:

3 three iii
4 four  iv
5 five  v
6 six   vi
7 seven vii
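Two further range behaviours are worth sketching on a made-up six-line input. The endpoints can be any patterns, including tests on the built-in line counter NR, and if the end pattern never matches, the range stays open until the input ends. The end value "nope" is chosen here only because it does not occur in the data.

```shell
data='1 one\n2 two\n3 three\n4 four\n5 five\n6 six\n'

# Range by line number: lines 2 through 4.
by_number=$(printf "$data" | awk 'NR == 2, NR == 4 { print $2 }')

# The end pattern never matches, so the range runs to the end of the input.
open_ended=$(printf "$data" | awk '$2 == "five", $2 == "nope" { print $2 }')

printf '%s\n' "$by_number" "$open_ended"
```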

BEGIN and END

BEGIN is a special pattern whose action runs once, before the first input line is read. Similarly, the END action runs once, after the last input line has been processed.

BEGIN { print "start at 3..." }
$2 == "three", $2 ~ /^e/ { print $1 }
END { print "...and end at eight" }

Results in:

start at 3...
3
4
5
6
7
8
...and end at eight
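A common use of these two patterns is to initialise a value in BEGIN, update it for every line, and report it in END. The following is a minimal sketch on a made-up four-line input. The variable name total is arbitrary, and the BEGIN assignment is included only for clarity, since awk variables start out empty and count as 0 in arithmetic.

```shell
# Sum the first field of every line and report the result after the last line.
out=$(printf '1 one\n2 two\n3 three\n4 four\n' | awk '
    BEGIN { total = 0 }
          { total += $1 }
    END   { print NR " lines, sum of first field = " total }
')
printf '%s\n' "$out"
```

The middle statement has no pattern, so it runs for every input line. In the END action, NR still holds the number of the last line read.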
