Guide to Unix/Commands/File Analysing

From Wikibooks, open books for an open world

file

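file examines the contents of a file and displays its type. To get the MIME type instead, use the -i option. On some systems, such as macOS, the equivalent option is -I.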


$ file Unix.txt
Unix.txt: ASCII text
$ file -i Unix.txt
Unix.txt: text/plain; charset=us-ascii


wc

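wc reports the number of lines, words and bytes in a file. The -l, -w and -c options select lines, words and bytes respectively. The -m option counts characters, which can differ from the byte count when the file uses a multibyte encoding.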


$ wc hello.txt
2       6      29 hello.txt
$ wc -l hello.txt
2 hello.txt
$ wc -w hello.txt
6 hello.txt
$ wc -c hello.txt
29 hello.txt
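When wc is used in a script, the file name in its output usually gets in the way. The following is a minimal sketch, assuming a POSIX shell and that mktemp is available; sample.txt is a hypothetical file created only for the illustration, not the hello.txt shown above.

```shell
# Sample file created only for this illustration.
tmpdir=$(mktemp -d)
printf 'Hello, world\nGuide to UNIX\n' > "$tmpdir/sample.txt"

# Reading from standard input keeps the file name out of the output.
# Some implementations pad the number with spaces, so strip whitespace.
lines=$(wc -l < "$tmpdir/sample.txt" | tr -d '[:space:]')
echo "lines: $lines"

# -c counts bytes, not characters: the accented letter written here
# as \303\251 occupies two bytes in UTF-8.
bytes=$(printf 'caf\303\251\n' | wc -c | tr -d '[:space:]')
echo "bytes: $bytes"

rm -rf "$tmpdir"
```

The second count is 6 although the text has only five characters including the newline. With -m the result depends on the current locale.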


cksum

cksum prints the CRC checksum and the size in bytes of each file given.

Checksums can be used to protect against accidental modifications to files: if the checksum has not changed, then the file is probably undamaged. The default CRC checksum is not cryptographic.

Cryptographic checksums protect against both accidental and malicious modifications. Use them to verify that no trojan has been inserted into a file. Both "md5" and "sha1" are known to be vulnerable to collision attacks, so a newer algorithm such as "sha256" is preferred where available.


$ cksum /etc/passwd
3052342160 2119 /etc/passwd

Some "cksum" implementations provide other algorithms, such as "md5" and "sha1":

$ cksum -a sha1 /etc/passwd
SHA1 (/etc/passwd) = 816d937ca4cdb4dee92d5002610fae63b639d224

Some "cksum" implementations let you take checksums of strings specified as arguments:

$ cksum -s 'Guide to UNIX'
2195826759 13 Guide to UNIX
$ cksum -a sha1 -s 'Guide to UNIX'
SHA1 ("Guide to UNIX") = 0e9c1779e61c7fdb473d2e55eb878a82c37eecea
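The idea of detecting accidental modification can be sketched as follows: record the checksum, then compare it with a fresh one later. This is only an illustration using the default CRC. It assumes mktemp is available, and sample.txt is a hypothetical file.

```shell
# Hypothetical sample file for the illustration.
tmpdir=$(mktemp -d)
printf 'Guide to UNIX\n' > "$tmpdir/sample.txt"

# Record the checksum; reading from standard input omits the file name.
before=$(cksum < "$tmpdir/sample.txt")

# Simulate an accidental modification.
printf 'extra line\n' >> "$tmpdir/sample.txt"

# Compare the stored checksum with a fresh one.
after=$(cksum < "$tmpdir/sample.txt")
if [ "$before" = "$after" ]; then
    status=unchanged
else
    status=changed
fi
echo "$status"

rm -rf "$tmpdir"
```

Because the CRC is not cryptographic, a matching result only suggests that no accidental damage occurred.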


grep

grep outputs the lines that match a regular expression. Options change this behavior: for example, -v outputs the lines that do not match, -c outputs only the number of matching lines, and -i ignores case. See the Grep Wikibook.
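As a rough illustration of how options change what grep outputs, the sketch below runs three variants against a small file. It assumes mktemp is available, and fruits.txt is a hypothetical file created for the example.

```shell
# Hypothetical sample file for the illustration.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/fruits.txt"

# Lines that match the regular expression "an".
matching=$(grep 'an' "$tmpdir/fruits.txt")

# -v inverts the selection: lines that do NOT match.
not_matching=$(grep -v 'an' "$tmpdir/fruits.txt")

# -c prints only the number of matching lines.
count=$(grep -c 'e' "$tmpdir/fruits.txt")

echo "$matching"
echo "$not_matching"
echo "$count"

rm -rf "$tmpdir"
```

Here the first command selects only banana, the second selects apple and cherry, and the third counts the two lines that contain an "e".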


diff

diff compares the contents of two files line by line and outputs the differences. See also diff3.
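A minimal sketch of the default output format, assuming mktemp is available; old.txt and new.txt are hypothetical files that differ only in their second line.

```shell
# Two hypothetical files that differ in line 2.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/old.txt"
printf 'one\n2\nthree\n'   > "$tmpdir/new.txt"

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going under "set -e".
changes=$(diff "$tmpdir/old.txt" "$tmpdir/new.txt") || true
echo "$changes"

rm -rf "$tmpdir"
```

The output reads "2c2", followed by the old line prefixed with "<", a "---" separator, and the new line prefixed with ">". The "2c2" means that line 2 of the first file was changed into line 2 of the second.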


diff3

diff3 compares the contents of three files line by line and outputs the differences. See also diff.
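The following sketch shows a three-way comparison in which only the third file differs. It assumes mktemp and diff3 are installed; the file names are hypothetical, and the exact layout of the remaining output lines may vary between implementations.

```shell
# Three hypothetical files; only yours.txt differs, in line 2.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/mine.txt"
printf 'one\ntwo\nthree\n' > "$tmpdir/older.txt"
printf 'one\n2\nthree\n'   > "$tmpdir/yours.txt"

# "|| true" guards against a non-zero exit status under "set -e".
report=$(diff3 "$tmpdir/mine.txt" "$tmpdir/older.txt" "$tmpdir/yours.txt") || true

# The first line of each hunk says which file is the odd one out.
first_line=$(printf '%s\n' "$report" | head -n 1)
echo "$first_line"

rm -rf "$tmpdir"
```

The hunk header "====3" indicates that the third file is the one that differs from the other two.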


cmp

cmp compares two files byte by byte and outputs the byte number and the line number of the first difference, if any. It outputs nothing if the files are byte-for-byte identical. Differences beyond the first one are not reported unless the -l option is used.
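In scripts, cmp is often used only for its exit status. The sketch below assumes mktemp is available and uses two hypothetical files that differ in a single byte; the -s option, which suppresses all output, is part of standard cmp.

```shell
# Two hypothetical files that differ in the sixth byte.
tmpdir=$(mktemp -d)
printf 'abc\ndef\n' > "$tmpdir/a.bin"
printf 'abc\ndXf\n' > "$tmpdir/b.bin"

# -s prints nothing; the exit status alone says whether the files match.
if cmp -s "$tmpdir/a.bin" "$tmpdir/b.bin"; then
    same=yes
else
    same=no
fi
echo "identical: $same"

# Without -s, cmp reports where the first difference occurs.
# "|| true" guards against the non-zero exit status under "set -e".
report=$(cmp "$tmpdir/a.bin" "$tmpdir/b.bin") || true
echo "$report"

rm -rf "$tmpdir"
```

The report names both files and ends with the position of the first difference, here byte 6 on line 2. Depending on the implementation, the position is labelled "byte" or "char".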


strings

strings outputs the sequences of printable characters found in files, which is useful for inspecting binary files.