A Quick Introduction to Unix/Searching Text Files



Searching the contents of a file

Simple searching using less

Using less, you can search through a text file for a keyword (pattern). For example, to search through science.txt for the word science, type

% less science.txt

This writes the first screenful of the file to the screen. While still in less, type a forward slash [/] followed by the word to search for

/science

As you can see, less finds and highlights the keyword. Type [n] to search for the next occurrence of the word.

This is useful for a quick check of a file's contents (is this the essay where I talked about left recursion?), but it is interactive: you have to open the file and step through the matches by hand. Often we want to do a bit more, such as printing or counting the matching lines. That is where the well-known Unix utility grep comes in.

grep

grep is one of many standard Unix utilities. It searches files for specified words or patterns. First clear the screen (type clear at the prompt), then type

% grep science science.txt

As you can see, grep has printed out each line containing the word science. The search is case sensitive, however. If we type

% grep Science science.txt

we see that it distinguishes between Science and science. To ignore upper/lower case distinctions, use the -i option, i.e. type

% grep -i science science.txt

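To search for a phrase or pattern (i.e. a string of characters with a space in it), you must enclose it in quotes so that the shell passes it to grep as a single argument. Single quotes are the safest choice, because the shell leaves their contents untouched. For example, to search for current domain, type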

% grep -i 'current domain' science.txt
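The searches above can be tried on a small stand-in file. The following is a minimal sketch: the file sample.txt and its four lines are made up for illustration and are not the contents of science.txt.

```shell
# Hypothetical stand-in for science.txt, written to a scratch directory.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'Science is organised knowledge.' \
  'Modern science relies on experiment.' \
  'The Current Domain of physics is wide.' \
  'Art is not the same thing.' > "$sample"

# Case sensitive: only the line with lowercase "science" matches.
lower=$(grep science "$sample")

# -i ignores case, so "Science" and "science" both match.
either=$(grep -i science "$sample")

# The quotes keep the space inside one pattern argument; without them,
# grep would treat "domain" as the name of a file to search.
phrase=$(grep -i 'current domain' "$sample")

printf '%s\n' "$lower" "$either" "$phrase"

# When finished, the scratch directory can be removed with: rm -r "$tmpdir"
```

Storing each result in a variable is only a convenience for this sketch; at the prompt you would normally run the grep commands directly.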

Some of the other options for grep are:

grep option   effect
-v            display those lines that do NOT match
-n            precede each matching line with the line number
-c            print only the total count of matched lines

Try some of them and see the different results. Don't forget, you can use more than one option at a time. For example, to print the number of lines that contain neither science nor Science, type

% grep -ivc science science.txt
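To see how these options combine, here is a short sketch on a made-up five-line file (sample.txt and its contents are assumptions for illustration, not the real science.txt).

```shell
# Hypothetical five-line stand-in file in a scratch directory.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'Science is organised knowledge.' \
  'Modern science relies on experiment.' \
  'Art is not the same thing.' \
  'Music is different again.' \
  'History is a third subject.' > "$sample"

# -i and -n together: matching lines, each prefixed with "line number:".
numbered=$(grep -in science "$sample")

# -i and -c together: how many lines match, ignoring case.
matching=$(grep -ic science "$sample")

# -i, -v and -c together: how many lines do NOT match, ignoring case.
missing=$(grep -ivc science "$sample")

printf '%s\n' "$numbered"
echo "lines with a match: $matching, lines without: $missing"
```

The two counts always add up to the number of lines in the file, which is a quick way to check that -v is doing what you expect.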

grep is one of the most powerful Unix utilities, and there are extensions such as egrep as well. A good knowledge of grep can make you a very productive Unix user. This is, however, a quick introduction, so this is all we are going to cover.

wc (word count)

A handy little utility is the wc command, short for word count. To do a word count on science.txt, type

% wc -w science.txt

To find out how many lines the file has, type

% wc -l science.txt
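As a small sketch of both counts, the following uses a made-up two-line file (sample.txt is an assumption for illustration). It also shows that when wc reads from standard input it prints only the number, without the file name.

```shell
# Hypothetical two-line file: five words in total.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'one two three' 'four five' > "$sample"

wc -w "$sample"   # word count, followed by the file name
wc -l "$sample"   # line count, followed by the file name

# Redirecting the file into wc leaves out the file name,
# which is convenient when the number is stored in a variable.
words=$(wc -w < "$sample")
lines=$(wc -l < "$sample")
echo "words: $words, lines: $lines"
```

Some versions of wc pad the number with leading spaces, so the exact spacing of the output may differ between systems.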