Cryptography/RSA

RSA is an asymmetric algorithm for public key cryptography, widely used in electronic commerce. The algorithm was described in 1977 by Ron Rivest, Adi Shamir and Len Adleman; the letters RSA are the initials of their surnames.

Clifford Cocks, a British mathematician working for GCHQ, described an equivalent system in an internal document in 1973. His discovery, however, was not revealed until 1997 due to its top-secret classification.

The security of the RSA system relies on the difficulty of factoring very large integers. New fast algorithms in this field could render RSA insecure, but this is generally considered unlikely.

The algorithm was patented by MIT in 1983 in the United States of America. The patent expired 21 September 2000. Since the algorithm had been published prior to the patent application, it could not be patented in other countries.

Operation

Key Generation

Suppose a user Alice wishes to allow Bob to send her a private message over an insecure transmission medium. She takes the following steps to generate a public key and a private key:

1. Choose two large prime numbers p ≠ q randomly and independently of each other. Compute N = p q.
2. Choose an integer 1 < e < N which is coprime to (p-1)(q-1).
3. Compute d such that d e ≡ 1 (mod (p-1)(q-1)).
4. Destroy all records of p and q.

(Steps 2 and 3 can be performed with the extended Euclidean algorithm; see modular arithmetic. Equivalently, either e or d may be found by solving the Diophantine equation $ed-k\phi (N)=1$ for some integer k, where $\phi (N)=(p-1)(q-1)$.)

N and e are the public key, and N and d are the private key. Note that only d is a secret as N is known to the public. Alice transmits the public key to Bob, and keeps the private key secret.
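
The key generation steps can be sketched in Python with deliberately tiny numbers. The primes p = 61 and q = 53, the exponent e = 17, and the helper names are assumptions chosen for illustration; real keys use randomly chosen primes that are hundreds of digits long.

```python
from math import gcd

def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def generate_keypair(p, q, e):
    """Toy RSA key generation from two given primes and a chosen e."""
    N = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)(q-1)")
    # Solve e*d + k*phi == 1 for d, then reduce d into the range [0, phi).
    _, d, _ = extended_gcd(e, phi)
    d %= phi
    return (N, e), (N, d)

public_key, private_key = generate_keypair(61, 53, 17)
print(public_key, private_key)  # (3233, 17) (3233, 2753)
```

Here 17 · 2753 = 46801 = 15 · 3120 + 1, so d e ≡ 1 (mod (p-1)(q-1)) as required.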

You can generate and examine a real RSA keypair using OpenSSL and some Unix utilities (see Cryptography/Generate a keypair using OpenSSL).

Encrypting messages

Suppose Bob wishes to send a message m to Alice. He knows N and e, which Alice has announced. He turns m into a number n < N, using some previously agreed-upon reversible protocol. For example, each character in a plaintext message could be converted to its ASCII code, and the codes concatenated into a single number. If necessary, he can break m into pieces and encrypt each piece separately. He then computes the ciphertext c:

$c\equiv n^{e}{\pmod {N}}$

This can be done quickly using the method of exponentiation by squaring. Bob then transmits c to Alice.
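
As a minimal sketch of exponentiation by squaring, the function below processes the exponent one bit at a time. The toy public key (N = 3233, e = 17) and the message number n = 65 are assumptions for illustration, and `mod_exp` is a hypothetical helper name; Python's built-in `pow(n, e, N)` computes the same value.

```python
def mod_exp(base, exponent, modulus):
    """Exponentiation by squaring: return base**exponent % modulus."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                    # lowest remaining exponent bit is set
            result = (result * base) % modulus
        base = (base * base) % modulus      # square for the next bit
        exponent >>= 1
    return result

N, e = 3233, 17   # toy public key, assumed for illustration
n = 65            # message already encoded as a number n < N
c = mod_exp(n, e, N)
print(c)          # 2790
```

The loop runs once per bit of e, so the cost grows with the length of the exponent rather than with its value.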

Decrypting messages

Alice receives c from Bob, and knows her private key d. She can recover n from c by the following procedure:

$c^{d}\equiv n{\pmod {N}}$

Alice can then extract n, since n < N. Given n, she can recover the original message m.
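
The round trip can be sketched as follows, assuming the toy key pair N = 3233, e = 17, d = 2753 and an encoding of one ASCII code per piece so that every n is smaller than N. The function names are hypothetical, and encrypting single characters like this is only an illustration: it offers no real secrecy.

```python
N, e, d = 3233, 17, 2753   # toy key pair, assumed for illustration

def encrypt_text(text, N, e):
    # One ASCII code per piece, so every piece n satisfies n < N.
    return [pow(ord(ch), e, N) for ch in text]

def decrypt_text(pieces, N, d):
    # n = c^d mod N recovers each code; the codes map back to characters.
    return "".join(chr(pow(c, d, N)) for c in pieces)

ciphertext = encrypt_text("HELLO", N, e)
recovered = decrypt_text(ciphertext, N, d)
print(recovered)  # HELLO
```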

The decryption procedure works because

$c^{d}\equiv n^{e\cdot d}{\pmod {N}}$

and, since ed ≡ 1 (mod (p-1)(q-1)), also ed ≡ 1 (mod p-1) and ed ≡ 1 (mod q-1). If n is not divisible by p, Fermat's little theorem gives n^{p-1} ≡ 1 (mod p); if n is divisible by p, both sides of the following congruence are 0. Either way,

$n^{e\cdot d}\equiv n{\pmod {p}}$

and, by the same argument,

$n^{e\cdot d}\equiv n{\pmod {q}}$

which implies (as p and q are different prime numbers)

$n^{e\cdot d}\equiv n{\pmod {pq}}$
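
For a small modulus the final congruence can simply be checked for every possible n, including the values divisible by p or q. The numbers below (p = 61, q = 53, e = 17, d = 2753) are assumptions for illustration.

```python
p, q, e, d = 61, 53, 17, 2753   # toy parameters, assumed for illustration
N = p * q

# e*d leaves remainder 1 modulo p-1 and modulo q-1.
exponent_ok = (e * d) % (p - 1) == 1 and (e * d) % (q - 1) == 1

# Encrypting and then decrypting returns every n in 0..N-1 unchanged.
roundtrip_ok = all(pow(pow(n, e, N), d, N) == n for n in range(N))

print(exponent_ok, roundtrip_ok)  # True True
```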

Signing Messages

RSA can also be used to sign a message. Suppose Alice wishes to send a signed message to Bob. She produces a hash value of the message, raises it to the power d modulo N (the same operation she uses to decrypt), and attaches the result as a "signature" to the message. Anyone can reverse this step with her public key, but only someone who knows d can produce it. When Bob receives the signed message, he raises the signature to the power e modulo N and compares the resulting hash value with the message's actual hash value. If the two agree, he knows that the author of the message was in possession of Alice's secret key, and that the message has not been tampered with since.
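
A rough sketch of hash-then-sign, assuming the toy key pair N = 3233, e = 17, d = 2753. Reducing the SHA-256 digest modulo N is a simplification made only so the value fits this tiny modulus; practical schemes embed the full digest in a padded block. The function names are hypothetical.

```python
import hashlib

N, e, d = 3233, 17, 2753   # toy key pair, assumed for illustration

def digest_as_number(message: bytes, N: int) -> int:
    # Simplification: squeeze the hash into the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes, N: int, d: int) -> int:
    # Alice applies her private exponent to the hash value.
    return pow(digest_as_number(message, N), d, N)

def verify(message: bytes, signature: int, N: int, e: int) -> bool:
    # Bob applies the public exponent and compares with his own hash.
    return pow(signature, e, N) == digest_as_number(message, N)

message = b"pay Bob 10"
signature = sign(message, N, d)
print(verify(message, signature, N, e))             # True
print(verify(message, (signature + 1) % N, N, e))   # False: altered signature
```

With a modulus this small, two different messages can easily share the same reduced hash, which is one reason the sketch is not usable as a real signature scheme.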

Security

Suppose Eve, an eavesdropper, intercepts the public key N and e, and the ciphertext c. However, she is unable to directly obtain d, which Alice keeps secret. The most obvious way for Eve to deduce n from c is to factor N into p and q, in order to compute (p-1)(q-1) which allows the determination of d from e. No polynomial-time method for factoring large integers on a classical computer has yet been found, but it has not been proven that none exists. See integer factorization for a discussion of this problem.
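
To illustrate why the size of N matters, here is a sketch of Eve's attack against a toy modulus. Trial division is used only because N = 3233 is tiny; the intercepted values and the function names are assumptions for illustration.

```python
from math import isqrt

def factor_semiprime(N):
    """Find p, q with p*q == N by trial division (feasible only for tiny N)."""
    if N % 2 == 0:
        return 2, N // 2
    for candidate in range(3, isqrt(N) + 1, 2):
        if N % candidate == 0:
            return candidate, N // candidate
    raise ValueError("no nontrivial factor found")

def recover_private_exponent(N, e):
    p, q = factor_semiprime(N)
    # Modular inverse of e modulo (p-1)(q-1); pow(e, -1, m) needs Python 3.8+.
    return pow(e, -1, (p - 1) * (q - 1))

N, e, c = 3233, 17, 2790          # what Eve intercepted (toy values)
d = recover_private_exponent(N, e)
n = pow(c, d, N)
print(d, n)  # 2753 65
```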

It has not been proven that factoring N is the only way of deducing n from c, but no easier method has been discovered (at least to public knowledge).

Therefore, it is generally presumed that Eve is defeated in practice if N is sufficiently large.

If N is 256 bits or shorter, it can be factored in a few hours on a personal computer, using software already freely available. A 512-bit N was factored in 1999 using several hundred computers. Keys of 1024 bits, once the recommended minimum, are no longer considered adequate; it is currently recommended that N be at least 2048 bits long.

In 1994, Peter Shor showed that a quantum computer could in principle perform the factorization in polynomial time. If (or when) quantum computers become a practical technology, Shor's algorithm will make RSA and related algorithms obsolete.

Should an efficient classical factorization algorithm be discovered or a practical quantum computer constructed, using still larger key lengths would provide a stopgap measure. However, any such security break in RSA would be retroactive. An eavesdropper who had recorded a public key and any ciphertext produced with it (easily found by recording traffic to that public key's owner) could simply wait until such a breakthrough and then decipher that ciphertext into the plaintext message. Therefore, it is inherently unsafe to exchange long-term secrets with RSA or any cipher with similar vulnerabilities.

Practical considerations

Key generation

Finding the large primes p and q is usually done by testing random numbers of the right size with probabilistic primality tests which quickly eliminate most non-primes. If such a test finds a "probable prime", a deterministic test should then be used to verify that the number is indeed prime.
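
One common probabilistic test is Miller-Rabin. The sketch below covers only the probabilistic stage (it does not include a deterministic proof of primality); the function names, the number of rounds and the list of small primes are assumptions for illustration.

```python
import secrets

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_probable_prime(n, rounds=40):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for sp in SMALL_PRIMES:
        if n % sp == 0:
            return n == sp
    # Write n - 1 as 2**s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)      # random base in [2, n-2]
        x = pow(a, t, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                      # a is a witness: n is composite
    return True

def random_probable_prime(bits):
    """Test random odd numbers of the requested size until one passes."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

print(is_probable_prime(2**61 - 1), is_probable_prime(3233))  # True False
```

Setting the top bit guarantees the requested size, and setting the lowest bit skips even candidates.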

p and q should not be 'too close', lest the Fermat factorization for N be successful. Furthermore, if either p-1 or q-1 has only small prime factors, N can be factored quickly and these values of p or q should therefore be discarded as well.
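
A short sketch of Fermat factorization shows why close primes are dangerous: the search starts at the square root of N and succeeds almost immediately when p and q are near each other. The example primes 10007 and 10009 and the step limit are assumptions for illustration.

```python
from math import isqrt

def fermat_factor(N, max_steps=1_000_000):
    """Look for a, b with N == a*a - b*b == (a - b)*(a + b); N must be odd."""
    a = isqrt(N)
    if a * a < N:
        a += 1
    for _ in range(max_steps):
        b_squared = a * a - N
        b = isqrt(b_squared)
        if b * b == b_squared:
            return a - b, a + b
        a += 1
    return None   # gave up: the factors are too far apart for this budget

print(fermat_factor(10007 * 10009))  # (10007, 10009)
```

The number of steps grows with the distance between the two primes, which is why widely separated p and q resist this method.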

One should not employ a prime search method which gives any information whatsoever about the primes to the attacker. In particular, a good random number generator for the start value needs to be employed. Note that the requirement here is both 'random' and 'unpredictable'. These are not the same criteria; a number may have been chosen by a random process (i.e., no pattern in the results), but if it is predictable in any manner (or even partially predictable), the method used will result in loss of security. For example, the random number table published by the RAND Corporation in the 1950s might very well be truly random, but it has been published and thus can serve an attacker as well. If the attacker can guess half of the digits of p or q, they can quickly compute the other half (shown by Coppersmith in 1997).
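
The difference between 'random-looking' and 'unpredictable' can be illustrated with Python's standard library. In this sketch the seed value 1234 and the function name are assumptions; the point is only that a seeded general-purpose generator is reproducible by anyone who learns or guesses the seed, whereas the `secrets` module draws from the operating system's cryptographic generator.

```python
import random
import secrets

def predictable_candidate(seed, bits):
    # Statistically random-looking output, fully determined by the seed.
    return random.Random(seed).getrandbits(bits) | 1

alice_candidate = predictable_candidate(1234, 64)
eve_guess = predictable_candidate(1234, 64)   # Eve guesses the same seed
print(alice_candidate == eve_guess)           # True

# Start value drawn from the OS cryptographic generator instead.
unpredictable_candidate = secrets.randbits(64) | 1
```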

It is important that the secret key d be large enough. Wiener showed in 1990 that if p is between q and 2q (which is quite typical) and $d<N^{1/4}/3$, then d can be computed efficiently from N and e. The encryption exponent e = 2 cannot be used at all: p-1 and q-1 are both even, so 2 is never coprime to (p-1)(q-1).
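
Wiener's method can be sketched with continued fractions: when d is small, the fraction k/d (where ed - k(p-1)(q-1) = 1) appears among the convergents of e/N. The example key N = 90581 = 239 · 379, e = 17993 with d = 5 and the function names are assumptions for illustration.

```python
from math import isqrt

def convergents(num, den):
    """Yield the continued-fraction convergents (h, k) of num/den."""
    h_prev, h = 0, 1
    k_prev, k = 1, 0
    while den:
        a, remainder = divmod(num, den)
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        yield h, k
        num, den = den, remainder

def wiener_attack(N, e):
    """Return a small private exponent d if a convergent reveals it, else None."""
    for k, d in convergents(e, N):
        if k == 0 or (e * d - 1) % k != 0:
            continue
        phi = (e * d - 1) // k            # candidate for (p-1)(q-1)
        s = N - phi + 1                   # would equal p + q
        discriminant = s * s - 4 * N      # would equal (p - q)**2
        if discriminant >= 0:
            root = isqrt(discriminant)
            if root * root == discriminant and (s + root) % 2 == 0:
                return d
    return None

print(wiener_attack(90581, 17993))  # 5
```

A candidate is accepted only if the implied p + q and p - q are consistent with an integer factorization of N.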

Speed

RSA is much slower than DES and other symmetric cryptosystems. In practice, Bob typically encrypts a secret message with a symmetric algorithm, encrypts the (comparatively short) symmetric key with RSA, and transmits both the RSA-encrypted symmetric key and the symmetrically-encrypted message to Alice.

This procedure raises additional security issues. For instance, it is of utmost importance to use a strong random number generator for the symmetric key, because otherwise Eve could bypass RSA by guessing the symmetric key.
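
A minimal sketch of this hybrid arrangement is given below. Everything in it is an assumption for illustration: the RSA key is built from two Mersenne primes, and the "symmetric cipher" is a stand-in that XORs the data with a SHA-256 counter keystream, used only because the Python standard library has no DES or AES. The session key comes from the `secrets` module.

```python
import hashlib
import secrets

# Toy RSA key (assumed for illustration; not a safe way to choose primes).
p, q = 2**61 - 1, 2**89 - 1
N, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))    # Python 3.8+

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in symmetric cipher: XOR data with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))

def hybrid_encrypt(message: bytes, N: int, e: int):
    session_key = secrets.token_bytes(16)               # unpredictable key
    wrapped_key = pow(int.from_bytes(session_key, "big"), e, N)
    return wrapped_key, keystream_xor(session_key, message)

def hybrid_decrypt(wrapped_key: int, body: bytes, N: int, d: int) -> bytes:
    session_key = pow(wrapped_key, d, N).to_bytes(16, "big")
    return keystream_xor(session_key, body)

wrapped_key, body = hybrid_encrypt(b"attack at dawn", N, e)
print(hybrid_decrypt(wrapped_key, body, N, d))  # b'attack at dawn'
```

Only the 16-byte session key passes through the slow RSA operation; the message itself is handled by the fast symmetric step.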

Key distribution

As with all ciphers, it is important how RSA public keys are distributed. Key distribution must be secured against a man-in-the-middle attack. Suppose Eve has some way to give Bob arbitrary keys and make him believe they belong to Alice. Suppose further that Eve can intercept transmissions between Alice and Bob. Eve sends Bob her own public key, which Bob believes to be Alice's. Eve can then intercept any ciphertext sent by Bob, decrypt it with her own secret key, keep a copy of the message, encrypt the message with Alice's public key, and send the new ciphertext to Alice. In principle, neither Alice nor Bob would be able to detect Eve's presence. Defenses against such attacks are often based on digital certificates or other components of a public key infrastructure.
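
The attack can be walked through with toy numbers. In this sketch Alice's key is built from the primes 61 and 53 and Eve's from 89 and 97; all values and names are assumptions for illustration.

```python
def make_key(p, q, e):
    """Return (N, e, d) for a toy RSA key built from the primes p and q."""
    return p * q, e, pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+

alice_N, alice_e, alice_d = make_key(61, 53, 17)
eve_N, eve_e, eve_d = make_key(89, 97, 5)

# Eve hands Bob her own public key; Bob believes it is Alice's.
key_bob_trusts = (eve_N, eve_e)

n = 65                                            # Bob's message as a number
c_from_bob = pow(n, key_bob_trusts[1], key_bob_trusts[0])

stolen = pow(c_from_bob, eve_d, eve_N)            # Eve reads the message
c_to_alice = pow(stolen, alice_e, alice_N)        # and re-encrypts for Alice

received = pow(c_to_alice, alice_d, alice_N)      # Alice notices nothing
print(stolen, received)  # 65 65
```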

Timing attacks

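In 1995, Kocher described an unexpected new attack on RSA: if the attacker Eve knows Alice's hardware in sufficient detail and is able to measure the decryption times for several known ciphertexts, she can deduce the decryption key d quickly. To thwart this attack, the decryption code can be made to run in constant time. Another countermeasure, known as RSA blinding, multiplies the ciphertext by a random value before decrypting and removes that value afterwards, so that the measured time is uncorrelated with the ciphertext Eve supplied.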
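
RSA blinding can be sketched as follows: the ciphertext is multiplied by r^e for a fresh random r before the private-key operation, and the factor r is divided out afterwards, so the exponentiation never runs on a value the attacker chose. The toy key and the function name are assumptions for illustration.

```python
import secrets
from math import gcd

N, e, d = 3233, 17, 2753   # toy key pair, assumed for illustration

def blinded_decrypt(c, N, e, d):
    # Pick a random r that is invertible modulo N.
    while True:
        r = 2 + secrets.randbelow(N - 2)          # r in [2, N-1]
        if gcd(r, N) == 1:
            break
    blinded = (c * pow(r, e, N)) % N              # equals (n*r)^e mod N
    blinded_plain = pow(blinded, d, N)            # equals n*r mod N
    return (blinded_plain * pow(r, -1, N)) % N    # strip r (Python 3.8+)

print(blinded_decrypt(2790, N, e, d))  # 65
```

The result is the same as plain decryption, but the value fed to the private exponentiation changes randomly on every call.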

Adaptive Chosen Ciphertext Attacks

In 1998, Daniel Bleichenbacher described the first practical adaptive chosen ciphertext attack against RSA-encrypted messages using the PKCS #1 v1.5 redundancy function (a redundancy function adds structure to an RSA-encrypted message, so it is possible to determine whether a decrypted message is valid). Due to flaws in the PKCS #1 scheme, Bleichenbacher was able to mount a practical attack against RSA implementations of the Secure Sockets Layer protocol, and potentially reveal session keys. As a result of this work, cryptographers now recommend the use of provably secure redundancy checks such as Optimal Asymmetric Encryption Padding, and RSA Laboratories has released new versions of PKCS #1 that are not vulnerable to these attacks.
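
To make the idea of a redundancy function concrete, here is a sketch of the PKCS #1 v1.5 encryption block layout (00 02, at least eight nonzero random bytes, 00, then the message) and the validity check a receiver performs. The function names are hypothetical and k stands for the modulus length in bytes. The attack exploits a receiver that reveals whether this check passed.

```python
import secrets

def pad_v15(message: bytes, k: int) -> bytes:
    """Build a k-byte block: 00 02 | nonzero random padding | 00 | message."""
    padding_len = k - 3 - len(message)
    if padding_len < 8:
        raise ValueError("message too long for this modulus size")
    padding = bytes(1 + secrets.randbelow(255) for _ in range(padding_len))
    return b"\x00\x02" + padding + b"\x00" + message

def unpad_v15(block: bytes) -> bytes:
    """Check the structure and return the message, or raise ValueError."""
    if len(block) < 11 or block[:2] != b"\x00\x02":
        raise ValueError("invalid padding")
    separator = block.find(b"\x00", 2)
    if separator < 10:          # missing separator or padding under 8 bytes
        raise ValueError("invalid padding")
    return block[separator + 1:]

block = pad_v15(b"session key", 32)
print(unpad_v15(block))  # b'session key'
```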