Cryptography/Hashes
A digest, sometimes called a hash, is the result of applying a hash function (a specific mathematical function or algorithm) to some arbitrary input value. The hash value it produces depends only on the given input.
Information security often includes situations where a user wants to transform one block of information into another block of information in such a way that the original block cannot be recreated. The process must also be deterministic: every time the same input block is processed, it produces the same output block.
Such processes behave similarly to a hash function and so are typically called cryptographic hashes. These hashes serve the authentication and integrity goals of cryptography. The defining property of a cryptographic hash is that the hash function is one-way: given a hash value, it is not feasible to find a message that produces that hash value. That is, there is no useful inverse hash function.
This property can be formally expanded to provide the following properties of a secure hash:
 Preimage resistant: Given a hash value H, it should be hard to find a message M such that H = hash(M).
 Second preimage resistant: Given an input m1, it should be hard to find another input m2 (not equal to m1) such that hash(m1) = hash(m2).
 Collision resistant: It should be hard to find two different messages m1 and m2 such that hash(m1) = hash(m2). Because of the birthday paradox, a collision among n-bit hash values is expected after roughly 2^(n/2) attempts, whereas a brute-force preimage search takes roughly 2^n. This means the hash function must have a larger image than is required for preimage resistance.
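The effect of the birthday paradox can be seen on a deliberately weakened hash. The following is a minimal sketch, assuming a toy hash made from the first 3 bytes (24 bits) of SHA-256; the names truncated_hash and find_collision are hypothetical and chosen only for this illustration. With 24-bit outputs a collision typically appears after a few thousand attempts, far fewer than the 2^24 possible outputs.

```python
import hashlib
from itertools import count

def truncated_hash(message: bytes, nbytes: int = 3) -> bytes:
    """Toy hash for illustration only: the first nbytes of SHA-256."""
    return hashlib.sha256(message).digest()[:nbytes]

def find_collision(nbytes: int = 3):
    """Hash the messages b"0", b"1", b"2", ... until two of them collide.

    Returns the two colliding messages and the number of messages hashed.
    """
    seen = {}  # maps each toy hash value to the message that produced it
    for i in count():
        message = str(i).encode()
        h = truncated_hash(message, nbytes)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message

m1, m2, tries = find_collision()
print(m1, m2, tries)
print(truncated_hash(m1) == truncated_hash(m2))  # the two messages collide
```

The same search against the full 256-bit output would need on the order of 2^128 attempts, which is why real digests are so much longer than the toy one used here.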
A hash function is the implementation of an algorithm that, given some data as input, will generate a short result called a digest.
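For example, if our hash function is X and our input is 'wiki', then X('wiki') = a5g78, where a5g78 stands for some hash value.
Qualities of a good hash function are:
1. Produces a fixed-length output for variable-length input
2. Has an output space so large that searching it exhaustively is infeasible, which supports the next point
3. Makes collisions infeasible to find (i.e. it is impractical to find two different pieces of input that give the same output). Collisions must exist, because there are more possible inputs than outputs, but nobody should be able to find one.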
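As a concrete illustration of the example above, the following sketch uses SHA-256 from Python's standard hashlib module in place of the unnamed function X. The helper name digest_hex is an assumption made for this example.

```python
import hashlib

def digest_hex(message: bytes) -> str:
    """Return the SHA-256 digest of message as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()

d1 = digest_hex(b"wiki")
d2 = digest_hex(b"wiki")                    # same input again
d3 = digest_hex(b"wikj")                    # input changed by one character
long_digest = digest_hex(b"wiki" * 10_000)  # much longer input

print(d1)
print(d1 == d2)                    # deterministic: same input, same digest
print(d1 == d3)                    # a small change gives a different digest
print(len(d1), len(long_digest))   # the digest length does not depend on the input length
```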
Applications of hash functions
Non-cryptographic hash functions have many applications,^{[1]} but in this book we focus on applications that require cryptographic hash functions.
A typical use of a cryptographic hash would be as follows: Alice poses a tough math problem to Bob and claims she has solved it. Bob would like to try it himself, but would also like to be sure that Alice is not bluffing. Therefore, Alice writes down her solution, appends a random nonce, computes the hash of the result, and tells Bob the hash value (whilst keeping the solution and the nonce secret). This way, when Bob comes up with the solution himself a few days later, Alice can verify his solution and still prove that she had the solution earlier: she reveals her solution and nonce, and Bob checks that they hash to the value she gave him.
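The exchange between Alice and Bob can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the hash and a 32-byte random nonce; the function names commit and verify and the sample solution are hypothetical.

```python
import hashlib
import hmac
import secrets

NONCE_BYTES = 32  # assumed nonce length for this sketch

def commit(solution: bytes):
    """Alice's step: hash the solution with a random nonce appended.

    Returns (commitment, nonce). Alice sends only the commitment to Bob
    and keeps the solution and the nonce secret until the reveal.
    """
    nonce = secrets.token_bytes(NONCE_BYTES)
    commitment = hashlib.sha256(solution + nonce).digest()
    return commitment, nonce

def verify(commitment: bytes, solution: bytes, nonce: bytes) -> bool:
    """Bob's step after the reveal: recompute the hash and compare."""
    expected = hashlib.sha256(solution + nonce).digest()
    return hmac.compare_digest(commitment, expected)

# Alice commits to her (hypothetical) solution.
commitment, nonce = commit(b"x = 42")

# Later, Alice reveals the solution and the nonce, and Bob checks them.
print(verify(commitment, b"x = 42", nonce))   # the genuine solution is accepted
print(verify(commitment, b"x = 43", nonce))   # a different solution is rejected
```

The nonce matters when the set of plausible solutions is small: without it, Bob could hash each candidate solution and compare the results against the commitment.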
In actual practice, Alice and Bob will often be computer programs, and the secret would be something less easily spoofed than a claimed puzzle solution. The above application is called a commitment scheme. Another important application of secure hashes is verification of message integrity. For example, whether any changes have been made to a message (or a file) can be determined by comparing message digests calculated before and after transmission (or any other event). Tripwire is a system that uses this property as a defense against malware and malfeasance. A message digest can also serve as a means of reliably identifying a file.
A related application is password verification. Passwords are usually not stored in cleartext, for obvious reasons, but instead in digest form. We discuss password handling (in particular, why hashing the password once is inadequate) in more detail in a later chapter, Password handling.
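The before-and-after comparison of digests can be sketched as follows, assuming SHA-256 and a temporary file standing in for the transmitted message. The helper name file_digest and the file name message.txt are hypothetical.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "message.txt")
    with open(path, "wb") as f:
        f.write(b"original contents")

    before = file_digest(path)              # digest recorded before the event
    unchanged = file_digest(path) == before

    with open(path, "ab") as f:             # simulate tampering or corruption
        f.write(b"!")
    modified = file_digest(path) != before

print(unchanged)  # digests match while the file is untouched
print(modified)   # digests differ once the file has changed
```

This comparison only detects changes if the recorded digest itself is stored or transmitted somewhere the attacker cannot alter it.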
A hash function is a key part of message authentication (HMAC).
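As a brief illustration, Python's standard hmac module builds a message authentication code on top of a hash function. The sketch below assumes SHA-256 as the underlying hash and a shared secret key generated for the example; the names make_tag and check_tag are hypothetical.

```python
import hashlib
import hmac
import secrets

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag for message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = secrets.token_bytes(32)  # shared secret, assumed to be known only to sender and receiver
tag = make_tag(key, b"transfer 10 coins")

print(check_tag(key, b"transfer 10 coins", tag))    # the authentic message is accepted
print(check_tag(key, b"transfer 99 coins", tag))    # an altered message is rejected
```

Unlike a plain digest, the tag cannot be recomputed by someone who alters the message but does not know the key.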
Most distributed version control systems (DVCSs) use cryptographic hashes.^{[2]}
For both security and performance reasons, most digital signature algorithms specify that only the digest of the message be "signed", not the entire message. Hash functions can also be used in the generation of pseudorandom bits.
SHA-1, MD5, and RIPEMD-160 were among the most commonly used message digest algorithms as of 2004. In August 2004, researchers found weaknesses in a number of hash functions, including MD5, SHA-0 and RIPEMD. This has called into question the long-term security of later algorithms which are derived from these hash functions, in particular SHA-1 (a strengthened version of SHA-0) and RIPEMD-128 and RIPEMD-160 (strengthened versions of RIPEMD). Neither SHA-0 nor RIPEMD is widely used, since they were replaced by their strengthened versions.
Other common cryptographic hashes include SHA-2 and Tiger.
Later we will discuss the "birthday attack" and other techniques people use for Breaking Hash Algorithms.
Hash speed
There are two contradictory requirements for cryptographic hash speed:
 When using hashes for password verification, people prefer hash functions that take a long time to run. If/when a password verification database (the /etc/passwd file, the /etc/shadow file, etc.) is accidentally leaked, they want to force a brute-force attacker to take a long time to test each guess.^{[3]}
Some popular hash functions in this category are

 scrypt
 bcrypt
 PBKDF2
 When using hashes for file verification, people prefer hash functions that run very fast. They want a corrupted file to be detected as soon as possible (and queued for retransmission, quarantined, etc.).
Some popular hash functions in this category are

 SHA-256
 SHA-3
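The contrast between the two categories can be sketched by timing one function from each list. This is only a rough illustration, assuming PBKDF2 with HMAC-SHA-256 and an iteration count of 200,000 picked for the example rather than taken from any recommendation; the helper names are hypothetical, and the measured times will vary from machine to machine.

```python
import hashlib
import os
import time

ITERATIONS = 200_000  # assumed work factor for this sketch only

def fast_hash(data: bytes) -> bytes:
    """A single SHA-256 computation, as used for file verification."""
    return hashlib.sha256(data).digest()

def slow_hash(password: bytes, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA-256, deliberately repeated ITERATIONS times."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)

def timed(fn, *args) -> float:
    """Return the wall-clock seconds taken by one call to fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start

password = b"correct horse battery staple"
salt = os.urandom(16)  # random per-password salt

fast_seconds = timed(fast_hash, password)
slow_seconds = timed(slow_hash, password, salt)

print(f"SHA-256: {fast_seconds:.6f} s")
print(f"PBKDF2:  {slow_seconds:.6f} s")
```

The extra cost is barely noticeable for one legitimate login, but an attacker testing many guesses against a leaked database has to pay it for every guess.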
Further reading
 See Algorithm Implementation/Hashing for more about non-cryptographic hash functions and their applications.
 See Data Structures/Hash Tables for the most common application of non-cryptographic hash functions.
 [1] Applications of non-cryptographic hash functions are described in Data Structures/Hash Tables and Algorithm Implementation/Hashing.
 [2] Eric Sink. "Version Control by Example". Chapter 12: "Git: Cryptographic Hashes".
 [3] "Speed Hashing"