The basics of public-key cryptography

Cryptography is all about enabling the secure transmission of a message between two parties. This means making sure that

Public key cryptography effectively solves both problems.

By message, incidentally, we mean not only verbal messages. For our purposes, a message will be anything that can be stored on a computer –– this may include the darkest of your confessions or just a cute picture of your cat.

The basic idea behind public-key cryptography

Let us say that two people, named Alice (A) and Bob (B), want to communicate securely. Alice is going to send a message to Bob through an insecure medium (it could be email, postal mail, telephone lines...) and she wants it to be signed and encrypted. For reasons that I will leave to your imagination, they cannot meet in person and they can only communicate though insecure media. Is there any way Alice can transmit the message to Bob securely? Of course there is.

In public-key cryptography, each of the two communicating parties has both a private key and a public key. As the names suggest, public keys are public and can be shared with anyone, but private keys are kept in secret by their owner and are never disclosed. Then, using some simple mathematics that we will later discuss, private keys are used to decrypt and sign messages, and public keys are used to encrypt messages and verify signatures.

In this way, if Alice wants to send an encrypted and signed message to Bob, she needs to:

  1. Encrypt it using Bob's public key.
  2. Sign it using her private key.

Then, once Bob has received the encrypted message, he needs to:

  1. Decrypt it using his private key.
  2. Verify the signature using Alice's private key to check that the message has not been corrupted.

In the end, Alice and Bob have been able to communicate securely without sharing any keys that, if stolen, would break the encryption. That is all thanks to the fact that public keys can be used to encrypt messages and verify signatures, and only for that. If door lock systems were analogous to public key cryptography, you would have a public key that could be used to close you home's door (and that would thus be accessible to anyone, just in case you leave your door open and some gentle neighbour wants to close it for you) and a private key that would open your door and that you would keep for yourself.

The RSA algorithm. Prime numbers save the world

I know, the idea behind public-key cryptography is marvellous and wonderful. But, understandably, you may now be wondering how it can possibly work. We will now explore the industry-standard implementation of public-key cryptography: the RSA algorithm (named after Rivest, Shamir and Adleman).

You might want to have a look at your arithmetic and algebra notes.

Key generation. The first thing we should do is generating a pair of keys (a public and a private key). The procedure for doing so is the following.

  1. We choose two random very large prime numbers \(p\) and \(q\). The larger, the better. With these numbers, we compute \(n = pq\). So far, so good.

  2. We now compute \(\varphi(n)\), where \(\varphi\) is Euler's totient function. If we had been given \(n\) and we didn't know its prime factorization (\(n = pq\)), it would be impossible –– at least with today's technology –– to compute \(\varphi(n)\) in any reasonable amount of time for sufficiently large \(p\) and \(q\). That is the essence of this algorithm. Nonetheless, as we know that \(n = pq\), we can use the well-known formula \[ \varphi(n) = (p-1)(q-1).\]

  3. Now, we choose an encryption number \(e\). This will be any natural number smaller than \(\varphi(n)\) and coprime with it. In other words, it will be the smallest positive representative of any element of the multiplicative group modulo \(\varphi(n)\).

  4. Finally, we compute the decryption number \(d\), which will be the inverse of \(e\) modulo \(\varphi(n)\). This is, the smallest natural such that \[ ed \equiv 1 \pmod{\varphi(n)}.\] This can be done efficiently using Euclid's algorithm.

The public key will be \((n, e)\) and the private key will be \(d\). The numbers \(p\), \(q\) and \(\varphi(n)\) must be kept secret or destroyed, because, as we have seen, they easily lead to the private key.

Encryption and decryption. Let us assume that we have a message that we have codified as a natural number \(m\) smaller than \(n\). The encrypted message, obtained using the public key of the intended recipient, will be \[ c = m^e \bmod n.\] This codified (encrypted) message can now only be deciphered with the private key associated to the public key it was encrypted with. How? Using this simple formula: \[ c^d = m^{ed} = m^{\kappa\phi(n) + 1} \equiv m \pmod{n}.\] Notice that we have used Euler's theorem, which states that, given any integer \(a\) coprime with another integer \(x\), we have \[ a^{\varphi(x)} \equiv 1 \pmod{x}.\] The theorem is an immediate consequence of Lagrange's theorem for groups.

If you look closely, you will realize that there are two particular situations in which the method can fail: \(m = p\) and \(m = q\). Had you noticed that? In this scenario, Euler's theorem doesn't guarantee that \[ m^{\phi(n) + 1} \equiv m \pmod{n}.\] Nonetheless, this isn't really a big deal. As we will be choosing really large \(p\) and \(q\), it is extremely unlikely that we will encounter this corner case. Moreover, even if we ended up in this odd scenario, we could always detect it in the encryption process and switch to an alternative encoding method that would yield a different \(m\).

Signing and verification. Now that we know how to encrypt and decrypt messages, let us tackle the remaining question of how to sign them and how to verify signatures. This involves no new mathematical machinery. Let us say that, as before, we have a message \(m\). How do we sign it? Easy. The signature will be \(s = m^d \bmod n\). How do we verify the signature? Checking that \(s^d \bmod n = m\). It is the same idea that we used before.

On a side note, there is no need to sign the whole message. If you want to save some space and make the signing and verification process more efficient, you can use a hash of the message instead.

Using GNU Privacy Guard (GPG)

The GNU Privacy Guard software is an open-source, free implementation of public-key cryptography. If you use a Linux distribution, it probably came pre-installed with it. If you use macOS, you can get the latest version through Homebrew. I will assume that you have already installed it and that you are running version 2.2.

Key generation. The first thing we will do is generating a key pair so that we can sign messages and receive encrypted messages. This is done with the command

gpg --full-generate-key

You can also use the alternative flag gpg --generate-key if you don't want to be prompted with options and would rather use the defaults.

Firstly, you will be asked to choose the encryption and signing algorithm that you want to use (we will pick RSA for both) and the size of the key. A longer key is more secure, so we will ask for a size of 4096 bits (the largest available).

Please select what kind of key you want:
(1) RSA and RSA (default)
(2) DSA and Elgamal
(3) DSA (sign only)
(4) RSA (sign only)
(14) Existing key from card
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096
Requested keysize is 4096 bits

Your private key may, at some point, be compromised. Somebody may steal it or you may simply lose it. That is why it is a good idea to set an expiration date for the key. This way, if someone uses it after the expiration date, a message will pop up letting them know that the key has expired and is no longer safe to use. Now GPG will ask you how long you want your key to be valid. I will set it to 1 year.

Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 1y

Then, you will be asked to provide your real name and email address. These will be used to identify your key pair and will be visible to anyone in possession of your public key. Lastly, you will have to provide a passphrase for your key. This passphrase will be necessary to sign and decrypt any messages, and it is meant to serve as an additional layer of protection should your private key ever be stolen, so you better choose a complex password.

Running gpg --list-keys will show you a list of all the public keys gpg knows of. The command gpg --list-secret-keys will display the list of known private keys. If you've done everything correctly, these commands should list you newly generated public and private keys.

Encryption and decryption. We will now learn how to encrypt a file to send it to someone (let him be Bob). Firstly, we need to obtain Bob's public key. We could ask him for it directly or look for it on his website or on a public key server (such as keys.openpgp.org). Once we have downloaded the key file and stored it as, for instance, bobkey, we can import it using gpg --import bobkey. You can use my public key for this example if you can't find yourself a Bob with a public key. If we now run gpg --list-keys, we should be able to see it: it will be identified with the name and email that he provided in the key generation process.

Now we are ready to encrypt our secret file. Let it be, for instance, cutest-cat-ever.jpeg. The command we need to run to encrypt it is the following.

gpg --output encrypted-cat.gpg --encrypt --recipient bob@example.org cutest-cat-ever.jpeg

This will generate a file called encrypted-cat.gpg that will be the result of encrypting cutest-cat-ever.jpeg with Bob's key. Note that the argument of --recipient, that we have set as bob@example.org, should be replaced by the real name or email associated to the public key we are using to encrypt the file.

That's it for encryption. Now how do we decrypt a file for which we have a private key for decryption? Easy. If, as in our previous example, the file is encrypted-cat.gpg, we just need to run

gpg --output cutest-cat-ever.jpeg --decrypt encrypted-cat.gpg

If we have the appropriate private key –– even if we have several keys, GPG will find the right one, –– this will generate the decrypted file cutest-cat-ever.jpeg. We have assumed that the file has a .jpeg extension, but, of course, it could be anything.

Signing and verification. There are three ways to sign a file. The first of them, the standard method, transforms the document in a signed file which can be used to verify the signature and retrieve the original contents of the file. To sign a file this way, use this command.

gpg --output signed-cat.sig --sign cutest-cat-ever.jpeg

Of course, signed-cat.sig is the name of the output file and cutest-cat-ever.jpeg that of the file that needs to be signed. The signature can be verified and the document can be recovered with the following command.

gpg --output cutest-cat-ever.gpg --decrypt signed-cat.sig

If you just want to verify the signature but are not interested in retrieving the message, you can run this:

gpg --verify signed-cat.sig

There is no need to specify which user signed the document: GPG will try with all the public keys it knows of and will use the right one.

This way of signing files has one major inconvenience: if someone who receives the signed file doesn't have the corresponding public key, they will be unable to see the contents of the file. One solution to this problem is ussing a clearsign signature. This will produce a file output of the form

-----BEGIN PGP SIGNED MESSAGE-----
< ORIGINAL MESSAGE >
-----BEGIN PGP SIGNATURE-----
< SIGNATURE >
-----END PGP SIGNATURE-----

To clearsign a document, you need to do the same as to sign it but using the flag --clearsign instead of --sign. The verification of the signature works the same way.

There is one last signing method. In the two approaches we have considered, the receiver always needs to perform some particular operation to retrieve the original message: decrypting a signed copy or editing a clearsigned one to remove the signature. A useful alternative would thus be storing the signature in a file that is fully independent of the message and then using GPG on the original file and the signature file to verify it. This independent signature is known as a detached signature and is generated using the --detach-sign flag. Then, if the original file is cutest-cat-ever.jpeg and the detached signature is signature.sig, its verification can be done using this command: gpg --verify signature.sig cutest-cat-ever.jpeg

Key management. If you want to export a public key associated to a particular email email@example.com to a file pubkey, you may do so with this command:

gpg --output pubkey --export email@example.com

If you want the export to be in human-readable ASCII, you can use the flag --armour:

gpg --output pubkey --armour --export email@example.com

Not providing an output will lead to the key's being printed on the terminal. If you want to export a secret key associated to a certain address, the procedure is analogous with the exception of using the --export-secret-key flag instead of --export-key. You can use this to keep a back-up of your key pair on an external drive kept in a secure place.

What else?

If you want to have a deeper look at the RSA algorithm, you can read this paper. Of course, Wikipedia is also your friend!

If you want to learn more about the use of GPG, you should check out its wonderful documentation. Some things that we haven't covered about GPG and that you can read about on your on are