The Caesar Shift Cipher assumes your message is all capital letters, and replaces each letter in the plaintext with a new letter to produce the ciphertext. The replacement scheme is based on a secret key that Alice and Bob have agreed upon ahead of time: a number in the range 1-25 called the shift value. The replacement scheme is simple: if the shift value is s, the kth letter in the alphabet is replaced by letter k+s in the alphabet, circling back around to the front of the alphabet if necessary. So with a shift value of 3, the letter B (the 2nd letter in the alphabet) is replaced with the letter E (letter number 2+3 = 5 in the alphabet). You can use the little applet below to help encrypt a message once you've chosen a shift value.
Decrypting means subtracting rather than adding the shift value, although you might notice that a shift value of 26-s actually reverses a shift of s. Let's follow this process through from start to finish:
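The encrypt and decrypt rules above can be sketched in a few lines of Python. This is only an illustration, not the applet's code: the function names are made up for this example, and letters are numbered from 0 (A = 0) rather than from 1 so that the wrap-around is a simple remainder mod 26.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar_shift(text: str, shift: int) -> str:
    """Move each capital letter `shift` places along the alphabet, wrapping past Z."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            k = ALPHABET.index(ch)                 # 0-based position of the letter
            out.append(ALPHABET[(k + shift) % 26])
        else:
            out.append(ch)                         # spaces and punctuation pass through
    return "".join(out)


def caesar_encrypt(plaintext: str, s: int) -> str:
    """Encrypt by adding the shift value s."""
    return caesar_shift(plaintext, s)


def caesar_decrypt(ciphertext: str, s: int) -> str:
    """Decrypt by subtracting the shift value s."""
    return caesar_shift(ciphertext, -s)
```

With a shift value of 11, `caesar_encrypt("MEET ME AT NOON", 11)` produces `XPPE XP LE YZZY`, and `caesar_decrypt` with the same shift value brings the plaintext back. Encrypting the ciphertext again with 26-11 = 15 also recovers the plaintext, which is the 26-s observation from above.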
Although very simple and, as we'll see, not very secure, the Caesar Cipher is a good example. It has the basic properties of any cryptosystem: two communicating parties Alice/Bob, nefarious eavesdropper Eve, plaintext/ciphertext, encryption/decryption. Moreover, it's representative of one of the two basic classes of cryptosystem, symmetric encryption (also called secret-key), where there is a secret key, shared by both Alice and Bob, that is used to encrypt and decrypt the message.
Sound like an important job? It is, and you might be the one doing it at your command as a junior officer. Every Naval or Marine unit that uses secure communications has at least two EKMS managers, and it is common practice to have a junior officer act as one of them.
Let's suppose you are Eve, and you've intercepted the message (ciphertext) XPPE XP LE YZZY. There are more P's than anything else, so you might guess (correctly in this case) that a P in the ciphertext came from an E in the plaintext. This would lead you to guess the key/shift-value k = 11.
It's not always going to be that easy of course. The ciphertext RNCP KU QHH has more H's than anything else. If we assume that H's in the ciphertext came from E's in the plaintext, we would deduce a key/shift-value of 3. Decrypting assuming k = 3 gives OKZM HR NEE ... which is probably not the secret message. In fact, the plaintext that produced this message was PLAN IS OFF.
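Eve's "most common letter is probably E" guess from the two examples above can be written down directly. This is a small sketch with made-up function names, and it deliberately has the same weakness as the hand method: it looks at one letter only.

```python
from collections import Counter


def guess_shift_from_top_letter(ciphertext: str) -> int:
    """Guess the shift by assuming the most common ciphertext letter was an E."""
    letters = [ch for ch in ciphertext if "A" <= ch <= "Z"]
    top_letter, _count = Counter(letters).most_common(1)[0]
    return (ord(top_letter) - ord("E")) % 26


def decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift by moving every letter back `shift` places."""
    return "".join(
        chr((ord(ch) - ord("A") - shift) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )
```

On `XPPE XP LE YZZY` the guess is 11 and the decryption is `MEET ME AT NOON`. On `RNCP KU QHH` the guess is 3 and the decryption is `OKZM HR NEE`, the same wrong answer we got by hand.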
The problem with this approach is that we only considered one letter: the most common one appearing in the ciphertext. Assuming H's came from E's gave us lots of E's in our "cracked" message, but it also gave us Z's and K's, which are pretty uncommon. To do frequency analysis properly, we should consider all the letters in the message. This is tedious, of course, but when something is tedious, it just means that we ought to write a program and let the computer do it for us. Try out this page, which features a JavaScript program for cracking Caesar shift encryption via frequency analysis. It functions by calculating, for each shift value, the likelihood of that shift value being correct based on the frequencies of the letters that result from decrypting the given ciphertext with that shift value. It's very interesting to see how few characters of ciphertext are required to recover the key with a high degree of certainty.
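To give a feel for what such a program does, here is a minimal sketch that scores all 26 shift values using every letter of the message. It is not the code behind the page: the page's exact scoring rule isn't described here, so this sketch swaps in a chi-squared statistic (lower means a closer match to English), and the frequency table is an approximate set of reference values assumed for illustration.

```python
from collections import Counter

# Approximate English letter frequencies in percent (an assumed reference
# table; published tables differ slightly from one another).
ENGLISH_PERCENT = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}


def shift_text(text: str, shift: int) -> str:
    """Caesar-shift the capital letters of `text`; negative shifts decrypt."""
    return "".join(
        chr((ord(ch) - 65 + shift) % 26 + 65) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def chi_squared(text: str) -> float:
    """Compare the letter counts in `text` with what English would predict."""
    letters = [ch for ch in text if "A" <= ch <= "Z"]
    if not letters:
        return float("inf")  # nothing to score
    counts = Counter(letters)
    score = 0.0
    for letter, percent in ENGLISH_PERCENT.items():
        expected = len(letters) * percent / 100
        score += (counts[letter] - expected) ** 2 / expected
    return score


def rank_shifts(ciphertext: str) -> list:
    """Return (score, shift) pairs for all 26 shifts, most English-like first."""
    scored = [(chi_squared(shift_text(ciphertext, -s)), s) for s in range(26)]
    return sorted(scored)
```

The first entry of `rank_shifts(ciphertext)` is the best guess for the shift value. Because rare letters like Z and K have tiny expected counts, a candidate decryption that is full of them gets a large penalty, which is exactly what the one-letter method missed.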
So we see that the Caesar Shift Cipher is not very secure. In particular, it's quite vulnerable to attack via frequency analysis. Its problems are a) there are only 26 key values, so trying them all is a viable option, and b) since a given character in the plaintext is always replaced with the same character in the ciphertext, letter frequencies carry over from plaintext to ciphertext.
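Problem a) is easy to see in code. The sketch below (function names invented for this example) simply produces every possible decryption, so that Eve can scan the short list for the one that reads as English.

```python
def shift_text(text: str, shift: int) -> str:
    """Caesar-shift the capital letters of `text`; negative shifts decrypt."""
    return "".join(
        chr((ord(ch) - 65 + shift) % 26 + 65) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def all_decryptions(ciphertext: str) -> dict:
    """Map every shift value 0..25 to the plaintext it would imply."""
    return {s: shift_text(ciphertext, -s) for s in range(26)}


if __name__ == "__main__":
    # Print all 26 candidates for the intercepted message from earlier.
    for s, candidate in all_decryptions("RNCP KU QHH").items():
        print(s, candidate)
```

For `RNCP KU QHH`, the entry for shift 2 is `PLAN IS OFF`, and it is the only line in the list that looks like a message.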
. | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
A | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
B | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A |
C | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B |
D | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C |
E | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D |
F | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E |
G | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F |
H | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G |
I | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H |
J | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I |
K | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J |
L | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K |
M | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L |
N | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M |
O | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N |
P | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
Q | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P |
R | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q |
S | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R |
T | T | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S |
U | U | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T |
V | V | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U |
W | W | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V |
X | X | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W |
Y | Y | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X |
Z | Z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y |
The key is a string of letters like JOE. To encrypt, you take your plaintext (we'll reuse MEET ME AT NOON) and write it down. Then you write down the key string over the plaintext, with letters matching up. If the plaintext is longer than the key, you simply repeat the key. Like this:
JOEJ OE JO EJOE ← key (repeated as needed)
MEET ME AT NOON ← plaintext
Next you write down (or have on hand) the table shown above. The encrypted value of a plaintext character is the table entry whose row is given by the plaintext character and whose column is given by the key character written above it. Thus, the first letter of our message encrypts to the table entry at row M and column J, which is a V. (Check out this demo to see the process on a short message.)
Decryption is straightforward if you understand encryption: write down the ciphertext with the key written above it (repeated as needed). To decrypt a character in the ciphertext, identify the column given by the key character above the ciphertext character. Find the ciphertext character in that column. The row at which it appears is labeled with the decrypted value, i.e. with the corresponding plaintext character.
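The table lookup can also be done with arithmetic: the entry at row p and column k is the letter at position (p + k) mod 26, counting A as 0. Here is a minimal sketch built on that observation. The function name and the `decrypt` flag are just choices for this example, and the key only advances on letters so that spaces line up the way they do in the hand-written layout.

```python
A = ord("A")


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Encrypt (or decrypt) capital letters with a repeating key."""
    out = []
    i = 0  # position in the key; advances only when a letter is processed
    for ch in text:
        if "A" <= ch <= "Z":
            k = ord(key[i % len(key)]) - A    # column chosen by the key letter
            if decrypt:
                k = -k                         # walk back up the column to the row label
            out.append(chr((ord(ch) - A + k) % 26 + A))
            i += 1
        else:
            out.append(ch)                     # keep spaces as they are
    return "".join(out)
```

With the key JOE, `vigenere("MEET ME AT NOON", "JOE")` gives `VSIC AI JH RXCR`, whose first letter is the V found in the table at row M and column J. Calling it again on the ciphertext with `decrypt=True` returns the original message.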
Think about how the Vigenere Cipher addresses the flaws in the Caesar Shift. The key is a string of characters, and since there are roughly 6 trillion strings of length less than 10, for instance, the problem of too few keys has been addressed. The same letter at different positions in the plaintext generally does not get mapped to the same character in the ciphertext, since the key-character written above plays a role in the encryption. So letter frequencies in the plaintext do not get carried over to the ciphertext.
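The "roughly 6 trillion" figure is just a sum of powers of 26, one term for each key length from 1 to 9. A quick check, with a function name made up for this example:

```python
def count_keys(max_length: int, alphabet_size: int = 26) -> int:
    """Count the letter strings of length 1 through max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


if __name__ == "__main__":
    # Keys of length less than 10 means lengths 1 through 9.
    print(count_keys(9))
```

The exact total is 5,646,683,826,134, or about 5.6 trillion, compared with the handful of shift values available to the Caesar Shift Cipher.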
Finding the key length can be a problem, but one easy way given what we already know is this: for each possible key length n, form the string consisting of every nth character starting from the first, give that as a ciphertext input to our Caesar Shift Frequency Analysis page, and make a note of the probability of the shift index it gave you for that n. Whichever n value gave us the highest score is probably the actual length of the key. In class, we will actually have performed this exercise.
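The key-length search described above can be sketched as follows. Two assumptions to flag: the frequency table is an approximate reference table chosen for illustration, and the "score" here is the average English frequency of the letters after the best trial shift, which stands in for the probability reported by our Caesar Shift Frequency Analysis page (whose exact formula isn't given here).

```python
A = ord("A")

# Approximate English letter frequencies in percent for A..Z (assumed values).
ENGLISH_PERCENT = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03,
    2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15,
    1.97, 0.07,
]


def best_shift_score(text: str) -> float:
    """Highest average English letter frequency over all 26 trial shifts."""
    codes = [ord(ch) - A for ch in text if "A" <= ch <= "Z"]
    if not codes:
        return 0.0
    best = 0.0
    for s in range(26):
        total = sum(ENGLISH_PERCENT[(c - s) % 26] for c in codes)
        best = max(best, total / len(codes))
    return best


def score_key_lengths(ciphertext: str, max_len: int) -> dict:
    """For each candidate key length n, score every nth letter from the first."""
    letters = "".join(ch for ch in ciphertext if "A" <= ch <= "Z")
    return {n: best_shift_score(letters[::n]) for n in range(1, max_len + 1)}
```

When n matches the true key length, every nth letter was encrypted with the same key letter, so the sample behaves like plain Caesar ciphertext and scores well. Two cautions: multiples of the true key length tend to score well for the same reason, and on short messages the scores are noisy enough that the top-scoring n should be treated as a candidate rather than a certainty.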
This kind of attack requires enough text that our Caesar Shift frequency analysis of every nth character finds the proper shift index with high probability. If the message length is L, and we assume we need about 20 characters to be assured of having a high probability with our Caesar Shift frequency analysis, we'd like to have L/n > 20. If L is small or n is large, our attack will fail. So, in general, a longer key gives you more security from frequency analysis. If you have a key that is a completely random sequence of letters, and which is as long as or longer than the message, the Vigenere Cipher is unbreakable, provided you never use the key again. In this situation, the system becomes what is called a one-time pad. The problem with such a system is that arranging to have this one huge key is difficult.
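Here is a small sketch of the one-time pad idea: a fresh random key with exactly one letter per message letter, used with the same Vigenere arithmetic (row + column mod 26). The function names are invented for this example, and Python's `secrets` module is used as the randomness source, which is an implementation choice rather than anything these notes prescribe.

```python
import secrets
import string

A = ord("A")


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Vigenere encrypt/decrypt of capital letters; the key advances on letters only."""
    out = []
    i = 0
    for ch in text:
        if "A" <= ch <= "Z":
            k = ord(key[i]) - A        # the pad is never repeated, so no wrap-around
            out.append(chr((ord(ch) - A + (-k if decrypt else k)) % 26 + A))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


def one_time_pad_encrypt(message: str):
    """Generate a random key as long as the message's letters and encrypt with it."""
    letter_count = sum(1 for ch in message if "A" <= ch <= "Z")
    key = "".join(secrets.choice(string.ascii_uppercase) for _ in range(letter_count))
    return key, vigenere(message, key)
```

Every call produces a different key, so the ciphertext changes from run to run. Bob needs that exact key to decrypt, which is the key-distribution difficulty mentioned above, and the key must be thrown away after one use.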
The program that cracked WEP-encryption in your wireless lab is actually also based on a chosen plaintext attack.
JZFDEYNFUDS MB KLNFI CVIH KMUZ ECHELY
We'll assume that "Albuquerque" is in fact the first word, so we have
???????????
ALBUQUERQUE
↓↓↓↓↓↓↓↓↓↓↓
JZFDEYNFUDS
We work through to recover the key like this: row A has a J in the J-column, so J is the first letter of the key. Row L has a Z in the O-column, so we have an O as the second letter, and so on. In this way we recover JOEJOEJOEJO and deduce that the key was JOE.
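The row-and-column hunt above is the same as subtracting each plaintext letter from its ciphertext letter mod 26. A minimal sketch, with function names invented for this example and assuming the guessed plaintext lines up with the start of the ciphertext:

```python
A = ord("A")


def recover_key_stream(plaintext: str, ciphertext: str) -> str:
    """For each letter pair, find the column whose entry in the plaintext row is the ciphertext letter."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + A) for p, c in zip(plaintext, ciphertext)
    )


def shortest_period(stream: str) -> str:
    """Return the shortest prefix that, repeated, reproduces the whole stream."""
    for n in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % n] for i in range(len(stream))):
            return stream[:n]
    return stream
```

Here `recover_key_stream("ALBUQUERQUE", "JZFDEYNFUDS")` gives `JOEJOEJOEJO`, and `shortest_period` reduces that to `JOE`. If the guessed word were wrong, the recovered stream would generally not show a short repeating pattern, which is a useful sanity check on the guess.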