The Developer’s Guide to Cryptography: The Basics

It is rare that a developer will finish his or her career without seeing, hearing about, or using cryptography. In today’s connected world, some form of cryptography is almost certainly a requirement for most applications.

While you may need to use cryptography, how much do you actually understand it? If someone tells you that they store their passwords using MD5, do you know what that means?

Do you know why AES-256 is higher grade encryption than AES-128? Or why DES is not considered secure? These are the kinds of questions you may need to answer as you grow in your career.

Do you need this amount of detail? At the very least, you may need to just know what algorithm to use. On the other hand, working at a smaller company or startup may afford you the chance to design a new system or application from scratch. In that case, knowing some details could definitely come in handy.

As a developer, you are responsible for protecting your users’ data and private information. Don’t take that responsibility lightly.

What is cryptography?

Let’s start at the beginning. Just what does cryptography mean? Cryptography aims to change a message such that someone who intercepts it will not be able to read it. The only way to read the message is to have the correct key and algorithm.

Cryptography has been around for quite some time. You may have heard of the Caesar Cipher. Julius Caesar used this cipher (we’ll talk about what cipher means in a minute) to encrypt military messages.

Now that we have powerful computers, cryptography has changed. Storing sensitive data in a digital format means we have to protect it for longer and with more powerful methods. Understanding these methods will help developers to protect the information in their care.

But first, let’s take a look at the basic building blocks of cryptography.

Cryptographic primitives

In order to understand cryptography, you must first understand the following basic terms. They will be used to explain more advanced concepts later on.

Plaintext

Plaintext is the actual message that you want to protect. In the physical world, it could be military communications written on paper. In the digital world, it is likely to be sensitive information like a social security number, credit card number, or private health information.

Ciphertext

Ciphertext is the term used to describe the message after it has been scrambled. Ciphertext that is intercepted by a third party should be unreadable by that third party. Ciphertext should also not leak information about the plaintext.

Cipher

Cipher is a synonym for the algorithm used to turn the plaintext into ciphertext. There are several types of ciphers, each using a slightly different way of transforming plaintext into ciphertext. We’ll discuss the different types of ciphers later.

Encrypt

Encryption is the process of turning the plaintext into ciphertext. Another term to describe this process is encipher. We can say that a cipher encrypts plaintext into ciphertext.

Decrypt

Decryption reverses the process of encryption, transforming ciphertext back into the original plaintext message. Another term used is decipher.

Key

The key is a piece of information, usually a number, that is used along with the cipher to encrypt or decrypt the message. The key is the “secret sauce” of an algorithm and often needs to be kept secret in order to keep the information secure.

Key space

The key space is the total number of possible keys that could be used. In the digital world, the key space is determined by how many bits are allocated to the key. For instance, DES uses 56-bit keys. That gives a key space of 2^56, or roughly 72 quadrillion possible keys. Some algorithms can accommodate multiple key sizes, which in turn changes the key space.
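To make those numbers concrete, here is a small sketch that computes the key space for a few key sizes. The guessing rate is a made-up figure chosen purely for illustration; it does not describe any real attacker or hardware.

```python
def key_space(bits: int) -> int:
    """Number of distinct keys that fit in `bits` bits."""
    return 2 ** bits


# Hypothetical attacker speed, chosen only to make the numbers concrete.
GUESSES_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

for name, bits in [("DES", 56), ("AES-128", 128), ("AES-256", 256)]:
    keys = key_space(bits)
    years = keys / GUESSES_PER_SECOND / SECONDS_PER_YEAR
    print(f"{name}: {keys:.2e} keys, about {years:.2e} years to try them all")
```

Every extra bit doubles the key space. At the assumed rate, all 56-bit keys can be tried in a little over two years, while a 128-bit key space would take on the order of 10^22 years.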

Types of ciphers

Ciphers come in different flavors. Some of the advanced encryption algorithms in use today use several of these ciphers in combination to encrypt data.

You can think of a cipher as a function that takes the key and plaintext as arguments and returns the ciphertext as output.

fn(key, plaintext) => ciphertext

Several rounds of these function calls can be used along with the basic cipher types to create strong encryption algorithms.

Substitution ciphers

The most basic cipher is the substitution cipher. In this type of cipher, each letter of the plaintext is substituted by some letter of ciphertext according to an algorithm.

There are two types: single-alphabet and multi-alphabet ciphers. In a single-alphabet cipher, a given letter of plaintext is always substituted by the same letter of ciphertext. For example, a is always substituted by f.

A multi-alphabet cipher can have multiple possible ciphertext letters for a given letter of plaintext. For instance, the letter a could be replaced by f in one position and by m in another.

It would help to look at an example. The Caesar cipher is an example of a single-alphabet substitution cipher. To perform the encryption, a given letter is shifted a certain number of letters. The number of letters to shift is the key for this cipher.

So if the key is 3, we shift every letter of the plaintext 3 letters to the right, wrapping around at the end of the alphabet. So a becomes d, t becomes w, x becomes a, and so on.

With this setup, the message “Meet tomorrow at noon” becomes “Phhw wrpruurz dw qrrq”.
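The shift described above fits in a few lines of code. This is a minimal sketch; the function names are my own, and it only shifts the letters a to z, leaving spaces and punctuation alone.

```python
def caesar_shift(text: str, key: int) -> str:
    """Shift each ASCII letter by `key` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            # Spaces, digits, and punctuation pass through unchanged.
            result.append(ch)
    return "".join(result)


def caesar_encrypt(plaintext: str, key: int) -> str:
    return caesar_shift(plaintext, key)


def caesar_decrypt(ciphertext: str, key: int) -> str:
    # Decryption is just a shift in the opposite direction.
    return caesar_shift(ciphertext, -key)


print(caesar_encrypt("Meet tomorrow at noon", 3))  # Phhw wrpruurz dw qrrq
print(caesar_decrypt("Phhw wrpruurz dw qrrq", 3))  # Meet tomorrow at noon
```

Note that the key space here is tiny: there are only 25 useful shifts, so an attacker can simply try them all.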

An example of a multi-alphabet substitution cipher is the Vigenère cipher, which uses the letters of a keyword to pick a different shift for each position in the message. Using multiple alphabets makes it much more difficult to figure out the plaintext given the ciphertext.
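Here is a rough sketch of the Vigenère idea. Each letter of the keyword selects a shift (a is 0, b is 1, and so on), and the keyword repeats for as long as the message lasts. The function name and the `decrypt` flag are my own choices, and the sketch assumes the keyword contains only the letters a to z.

```python
def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Shift each letter by the amount given by the matching keyword letter."""
    shifts = [ord(k) - ord("a") for k in keyword.lower()]
    result = []
    position = 0  # counts letters only, so spaces do not consume keyword letters
    for ch in text:
        if ch.isascii() and ch.isalpha():
            shift = shifts[position % len(shifts)]
            if decrypt:
                shift = -shift
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
            position += 1
        else:
            result.append(ch)
    return "".join(result)


print(vigenere("ATTACKATDAWN", "lemon"))  # LXFOPVEFRNHR
```

In that output the four A's of the plaintext come out as L, O, E, and N, so the same plaintext letter no longer maps to a single ciphertext letter.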

Notice how in the single-alphabet cipher you see the same letters popping up in the ciphertext. If you know that each ciphertext letter matches one and only one plaintext letter, you can use this knowledge to attack the cipher. Commonly used letters, such as e or t, can be identified in the ciphertext by how often they appear and then used to predict the plaintext message.
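You can see this leak by counting letters in the Caesar example from earlier. The sketch below is only an illustration of the idea, not a full attack; the helper name is hypothetical.

```python
from collections import Counter


def letter_counts(text: str) -> Counter:
    """Count how often each letter appears, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


plain_counts = letter_counts("Meet tomorrow at noon")
cipher_counts = letter_counts("Phhw wrpruurz dw qrrq")

print(plain_counts.most_common(3))
print(cipher_counts.most_common(3))

# The letters are relabeled, but the pattern of counts is identical.
print(sorted(plain_counts.values()) == sorted(cipher_counts.values()))  # True
```

The most frequent plaintext letter, o, appears five times, and so does its substitute r in the ciphertext. On a message this short, the most common letter is not e, which is why frequency analysis works best with plenty of ciphertext to study.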

Transposition ciphers

Transposition ciphers work in a slightly different way. Instead of substituting one letter for another, you transpose, or mix around, the letters in a repeatable and reversible way.

An example will help this concept become more concrete. Let’s look at a geometric shape cipher.

First we have a message:

[Image: the plaintext message]

Then we rearrange the letters in a geometric shape, such as a rectangle.

[Image: the message written out in a rectangle]

Next, we trace a path through the shape to create the ciphertext. For this example, let’s go down and up each column of text.

[Image: arrows tracing a path down and up the columns of the rectangle]

Finally, we write out the ciphertext by following the arrows.

[Image: the resulting ciphertext]
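The same steps can be sketched in code. The message and the rectangle width below are my own picks for illustration, not the ones from the original images, and the function names are hypothetical. The sketch pads the message with x so the rectangle is always full.

```python
def route_encrypt(message: str, width: int, pad: str = "x") -> str:
    """Write the message in rows of `width`, then read columns down, up, down, ..."""
    if len(message) % width:
        message += pad * (width - len(message) % width)
    rows = [message[i:i + width] for i in range(0, len(message), width)]
    output = []
    for col in range(width):
        column = [row[col] for row in rows]
        if col % 2 == 1:
            column.reverse()  # odd-numbered columns are read bottom to top
        output.extend(column)
    return "".join(output)


def route_decrypt(ciphertext: str, width: int) -> str:
    """Rebuild the columns, undo the reversals, then read the rows."""
    height = len(ciphertext) // width
    columns = []
    for col in range(width):
        column = list(ciphertext[col * height:(col + 1) * height])
        if col % 2 == 1:
            column.reverse()
        columns.append(column)
    return "".join(columns[col][row] for row in range(height) for col in range(width))


print(route_encrypt("meettomorrowatnoon", 6))  # mmatoeernorttoonwo
print(route_decrypt("mmatoeernorttoonwo", 6))  # meettomorrowatnoon
```

Here the key is the shape of the rectangle plus the path taken through it. Every letter of the plaintext is still present in the ciphertext; only the order has changed.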

Block ciphers

Let’s now discuss two techniques used to process the plaintext as it is being transformed into ciphertext: block ciphers and stream ciphers.

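Block ciphers take the plaintext input into the cipher and process it as fixed-size blocks of data. The block size is a property of the cipher and does not have to match the size of the key. DES, for example, divides the plaintext into 64-bit “chunks” of data even though its key is 56 bits, and AES uses 128-bit blocks with 128-, 192-, or 256-bit keys.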

Block ciphers have different modes of operation. One mode, called Electronic Codebook (ECB), encrypts each block of text separately and then concatenates the output of each block to create the ciphertext.

One downside of ECB mode is that the same plaintext block is encrypted into the same ciphertext block. This reduces the randomness provided by the algorithm and can preserve patterns that make it easier to break.

A better mode of operation is Cipher Block Chaining (CBC). CBC hides these patterns by XORing each block of plaintext with the previous block of ciphertext before encrypting it. This technique makes each block of ciphertext depend on every plaintext block that came before it.

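The first block has no previous ciphertext block, so a random, block-sized value called an initialization vector (IV) is XORed with it instead. Using a fresh IV for each message keeps the ciphertext unique even when the same plaintext is encrypted twice. Wikipedia has a great diagram of this process: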

[Image: Wikipedia’s diagram of CBC mode encryption]
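To compare the two modes side by side, here is a sketch that runs both over the same plaintext. Python’s standard library has no real block cipher, so the code uses a toy 8-byte cipher I made up for this example (a 4-round Feistel network built on SHA-256). It is not secure and only stands in for something like AES. All function names are hypothetical.

```python
import hashlib
import secrets

BLOCK_SIZE = 8  # bytes; a toy value chosen for this sketch


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_fn(half: bytes, key: bytes, round_no: int) -> bytes:
    # Made-up round function: hash the key, round number, and half block.
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:len(half)]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy Feistel cipher. NOT secure; a stand-in for a real block cipher."""
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, xor_bytes(left, round_fn(right, key, i))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = xor_bytes(right, round_fn(left, key, i)), left
    return left + right


def split_blocks(data: bytes) -> list:
    if len(data) % BLOCK_SIZE:
        raise ValueError("this sketch skips padding; use whole blocks")
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Every block is encrypted on its own.
    return b"".join(toy_encrypt_block(key, b) for b in split_blocks(plaintext))


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    previous, out = iv, []
    for block in split_blocks(plaintext):
        # XOR with the previous ciphertext block (or the IV), then encrypt.
        previous = toy_encrypt_block(key, xor_bytes(block, previous))
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    previous, out = iv, []
    for block in split_blocks(ciphertext):
        out.append(xor_bytes(toy_decrypt_block(key, block), previous))
        previous = block
    return b"".join(out)


key = b"demo key"
iv = secrets.token_bytes(BLOCK_SIZE)
plaintext = b"SAMEDATA" * 2 + b"NEWBLOCK"  # first two blocks are identical

ecb = split_blocks(ecb_encrypt(key, plaintext))
cbc = split_blocks(cbc_encrypt(key, iv, plaintext))
print("ECB repeats the block:", ecb[0] == ecb[1])
print("CBC repeats the block:", cbc[0] == cbc[1])
```

In ECB the two identical plaintext blocks produce identical ciphertext blocks, which is exactly the pattern leak described above. In CBC the chaining makes them differ, and the receiver needs the same IV to decrypt.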

Stream ciphers

The alternative to a block cipher is a stream cipher. Stream ciphers do not break up the plaintext into blocks before encrypting it.

You can think of a stream cipher as an assembly line. Each plaintext digit is sent through the assembly line and comes out on the other end encrypted. The bits are processed as a stream, hence the name.

The tricky part of stream ciphers is that the keystream, the sequence of key digits that gets combined with the plaintext, has to match the size of the input. Sharing a truly random key that long is not practical for a large message. The trick often used is to start with a short random key and use it to generate a pseudo-random keystream that can be applied to all of the digits.

A problem appears when a stream cipher is not implemented correctly. If the keystream is predictable, or if the same keystream is reused for more than one message, the effectiveness of the cipher is significantly weakened.
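Here is a rough sketch of the keystream idea. The keystream generator below is a toy I put together from SHA-256 and a counter for illustration; it is not a vetted stream cipher, and the function names are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy generator: stretch a short key into `length` pseudo-random bytes."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def stream_xor(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream. The same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


key = b"short random key"
first = b"meet tomorrow at noon"
second = b"stay home until night"  # same length as `first`

c1 = stream_xor(key, first)
c2 = stream_xor(key, second)
print(stream_xor(key, c1))  # b'meet tomorrow at noon'

# Reusing the keystream: XORing the two ciphertexts cancels the key entirely.
leak = bytes(a ^ b for a, b in zip(c1, c2))
print(leak == bytes(a ^ b for a, b in zip(first, second)))  # True
```

The last check shows why reuse is dangerous. An attacker who captures both ciphertexts learns the XOR of the two plaintexts without ever touching the key.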

The wide world of encryption

Encryption is certainly a large and expansive subject. The terms discussed here will give you a start but they are just the beginning.

Now you should be able to understand the basic building blocks of cryptographic algorithms. Do some more research if you need more details before moving on to the rest of the series on cryptography.

The next few posts will move into a higher-level space, describing specific algorithms using the terms outlined here. Up first will be symmetric encryption.

Enjoy the journey and protect your data with a better knowledge of cryptography.
