A Quick Introduction to Unix/Searching Text Files
Searching the contents of a file
Simple searching using less
Using less, you can search through a text file for a keyword (pattern). For example, to search through science.txt for the word science, type
% less science.txt
This writes the first screenful of the file to the screen. While still in less, you can type a forward slash / followed by the word to search for
/science
As you can see, less finds and highlights the keyword. Type n to search for the next occurrence of the word.
This is useful but relatively limited and inflexible. It is not hard to imagine simple situations where you might want to quickly check the contents of a file (is this the essay where I talked about left recursion?). But we often want to do just a bit more. That is where the very famous Unix utility grep comes in.
grep
grep is one of many standard Unix utilities. It searches files for specified words or patterns. First clear the screen (type clear at the prompt), then type
% grep science science.txt
As you can see, grep has printed out each line containing the word science. The search is case sensitive, however. If we type
% grep Science science.txt
we see that it distinguishes between Science and science. To ignore upper/lower case distinctions, use the -i option, i.e. type
% grep -i science science.txt
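The difference between the two searches can be sketched on a small stand-in file. The file name sample.txt and its three lines are assumptions made up for this illustration; they are not the contents of science.txt.

```shell
# Build a tiny stand-in file (hypothetical contents, not the tutorial's science.txt)
dir=$(mktemp -d)
printf '%s\n' \
  'Science is a way of thinking.' \
  'Modern science relies on experiment.' \
  'Art is something else.' > "$dir/sample.txt"

# Case-sensitive search: only the line with lowercase "science" matches
sensitive=$(grep science "$dir/sample.txt")

# Case-insensitive search: the lines with "Science" and "science" both match
insensitive=$(grep -i science "$dir/sample.txt")

echo "$sensitive"
echo "---"
echo "$insensitive"

# Remove the temporary directory again
rm -rf "$dir"
```

On this sample the first search prints one line and the second prints two.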
To search for a phrase or pattern (i.e. a string of characters with a space in it), enclose it in single quotes so that the shell passes it to grep as a single argument. For example, to search for current domain, type
% grep -i 'current domain' science.txt
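As a rough illustration of why the quotes matter, the sketch below searches a made-up two-line file (sample.txt and its contents are assumptions for this example only). Both lines contain the words current and domain, but only one contains them as a phrase.

```shell
# Build a tiny stand-in file (hypothetical contents, not the tutorial's science.txt)
dir=$(mktemp -d)
printf '%s\n' \
  'The Current Domain of inquiry is physics.' \
  'The current price of a domain name varies.' > "$dir/sample.txt"

# Quoted: the whole phrase "current domain" is one pattern, matched ignoring case
phrase_hits=$(grep -i 'current domain' "$dir/sample.txt")
echo "$phrase_hits"

# Without the quotes, grep would receive "current" as the pattern and would
# try to open "domain" as a file name, which is not what we want.

# Remove the temporary directory again
rm -rf "$dir"
```

Only the first line is printed, because the second line has other words between current and domain.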
Some of the other options for grep are:
| grep option | effect |
|---|---|
| -v | display those lines that do NOT match |
| -n | precede each matching line with the line number |
| -c | print only the total count of matched lines |
Try some of them and see the different results. Don't forget, you can use more than one option at a time. For example, to count the lines that contain neither science nor Science, type
% grep -ivc science science.txt
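To see how these options combine, here is a minimal sketch on a four-line stand-in file. The file sample.txt and its contents are assumptions invented for the example, so the counts apply to this sample only and not to science.txt.

```shell
# Build a tiny stand-in file (hypothetical contents, not the tutorial's science.txt)
dir=$(mktemp -d)
printf '%s\n' \
  'Science is a way of thinking.' \
  'Art is something else.' \
  'Modern science relies on experiment.' \
  'Music is different again.' > "$dir/sample.txt"

# -i with -n: ignore case and prefix each matching line with its line number
numbered=$(grep -in science "$dir/sample.txt")

# -i with -c: print only the number of matching lines
matching=$(grep -ic science "$dir/sample.txt")

# -i, -v and -c together: count the lines that do NOT match
non_matching=$(grep -ivc science "$dir/sample.txt")

echo "$numbered"
echo "matching lines: $matching"
echo "non-matching lines: $non_matching"

# Remove the temporary directory again
rm -rf "$dir"
```

Here lines 1 and 3 match, so both counts come out as 2.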
grep is one of the most powerful Unix utilities. There are extensions such as egrep as well. A good knowledge of the power of grep can make you a very productive Unix user. This is, however, a quick introduction, and so this is all we are going to cover.
wc (word count)
A handy little utility is the wc command, short for word count. To do a word count on science.txt, type
% wc -w science.txt
To find out how many lines the file has, type
% wc -l science.txt
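The two wc commands can be tried on a small stand-in file, as in the sketch below. The file sample.txt and its two lines are assumptions made up for this example. Reading the file from standard input, as done here for the variables, is one way to get the bare number without the file name; tr strips the padding spaces that some systems add.

```shell
# Build a tiny stand-in file (hypothetical contents, not the tutorial's science.txt)
dir=$(mktemp -d)
printf '%s\n' \
  'Science is a way of thinking.' \
  'Modern science relies on experiment.' > "$dir/sample.txt"

# Given a file name, wc prints the count followed by that name
wc -w "$dir/sample.txt"

# Reading from standard input leaves only the number
words=$(wc -w < "$dir/sample.txt" | tr -d ' ')
lines=$(wc -l < "$dir/sample.txt" | tr -d ' ')

echo "words: $words, lines: $lines"   # prints "words: 11, lines: 2"

# Remove the temporary directory again
rm -rf "$dir"
```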