Cryptology Home Page

Secret decoder rings, pig Latin, writing with lemon juice, messages in code to a buddy, WW2 espionage films -- ah the fun of being young and having time to play.

But concealing the meaning of a message from others can also be a very serious business. From medieval times, when diplomats had to communicate with their rulers, through wartime, when orders from headquarters had to be broadcast to the line officers, to the present, where corporate deals must remain secret until completed, confidentiality of messages remains a very high priority.

This page is an exploration of the fundamentals of cryptology. It will not delve too deeply into mathematical proofs; it is meant for those who wish to be aware of some of the basic techniques and strategies of code making and code breaking. Comments are always welcome.

Steganography

Steganography is the art of hiding the existence of a message rather than its meaning. It is currently a very HOT topic as one subdivision of steganography is digital watermarking, a technique used to copyright and to protect digital images, music and software. Although steganography is a very interesting topic in itself, the following short sections will be all that I say on it. Sorry!

Low Tech Steganography

  • A scytale (rhymes with Italy) was a cloth wound around a staff or baton with a message written vertically on the cloth. When unwound, the characters appeared as random decorations. A servant wearing the cloth as a sash or belt would then deliver the message as required. The receiver of the message would read it by wrapping the cloth around a staff of similar diameter.
  • Another ancient form of message hiding was to carve the message into the base of a wax slate, then prepare the slate as usual. An apparently blank slate had a message if you knew enough to remove the wax. A variation on this is writing on a canvas, then painting a scene or portrait over it. A more extreme example was shaving a slave's head and tattooing a message into his scalp. Once the hair grew back, the slave was sent on his mission, which obviously could not be time-dependent.
  • In medieval times simple embossing such as needle pricks under letters indicated characters to be used to create the plaintext message.
  • Another low tech solution involved writing an 'innocent' message with the 'real' message hidden at specific locations such as the first letter of each sentence.
  • Grilles were cutout templates that, when laid over a page of text, revealed the characters to be used in reconstructing a 'hidden' message.
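
The 'first letter of each sentence' scheme in the list above is easy to try out. The following is only a minimal sketch: the function name first_letters and the cover text are made up for illustration, and sentences are assumed to end in '.', '!' or '?'.

```python
import re


def first_letters(cover_text: str) -> str:
    """Collect the first character of every sentence in the cover text."""
    # Split on runs of sentence-ending punctuation plus any following spaces.
    sentences = re.split(r"[.!?]+\s*", cover_text)
    return "".join(s.strip()[0].lower() for s in sentences if s.strip())


# Hypothetical 'innocent' message carrying a hidden word.
cover = "Having a fine time. Every day is sunny. Lots to see here. Please write soon."
print(first_letters(cover))  # help
```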

High Tech Steganography

  • Invisible inks have developed along with chemistry. A simple example is lemon juice, which shows up when heated, but there are others that become visible only when treated with one specific reagent.
  • Microdot photographic techniques exist where a simple period at the end of the sentence often contained more meaning than the sentence itself.

Cryptography

Cryptography is the science of concealing a message's meaning rather than its existence. It can be subdivided into codes and ciphers. Codes are based on linguistic entities of variable character length such as syllables, words and phrases. Ciphers are based on fixed length elements without regard to meaning.

The original message is known as plaintext and can be either encoded to codetext or enciphered to ciphertext. To read the message afterwards it must be either decoded or deciphered.

Codes normally involve code books which are lists of words or phrases and their replacement codes. If the replacement codes are in the same alphabetic order so that the same list can be used for decoding, it is known as a one-part code. If a second list is needed to sort the codes alphabetically, then the encoding scheme is known as a two-part code. The difficulty of codes is that the code itself must be distributed to all readers and if the code book gets into insecure hands, the code is compromised and a new code needs to be distributed.

An interesting sidenote to codes is the Navajo code talkers of WW2 fame. The use of an obscure language, unknown to the enemy, delayed decryption until the message's purpose had been served. For a look at the code refer to Code Talk.

Ciphers, on the other hand, rely on a system (an algorithm or procedure) for enciphering without need for a code book. Sometimes this system also requires a keyword for correct use. This allows a cipher to continue in use even when its system is known by others and even when a single keyword is known. Changing the keyword in current use once again conceals the meaning.

Since the most interesting aspects of cryptology (at least for me) are the methods of enciphering and those of 'cracking' the messages, the rest of this page will emphasize cipher techniques rather than coding.

Ciphers

Ciphers are based on algorithms that transform the plaintext into ciphertext. These algorithms may also require the application of a keyword that introduces another level of security and flexibility to the overall system. All systems of ciphers can be classified as either transposition or substitution methods.

Transposition methods involve moving characters to new positions based on an algorithm or procedure. For example, each pair of characters (known as digraphs) can be swapped so that 'an' becomes 'na'. Obviously, the algorithm can become much more involved if needed.
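
The pair-swapping example can be sketched in a few lines. This is just an illustration; the name swap_pairs is my own, and I have assumed that a leftover character at the end of an odd-length message is left where it is.

```python
def swap_pairs(text: str) -> str:
    """Swap the two characters of each digraph: 'an' becomes 'na'."""
    chars = list(text)
    # Step through the text two characters at a time; an odd final
    # character has no partner and stays in place (an assumption).
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


print(swap_pairs("an"))      # na
print(swap_pairs("attack"))  # taatkc
```

Running the same function on the ciphertext swaps every pair back, so this particular transposition is its own decipherment.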

Substitution methods use a mapping technique that replaces characters (or sometimes sets of characters) by other characters (or sets). If the mapping of a specific character does not change within the message, the scheme is known as a monoalphabet scheme.

The mapping technique can be very simple such as replacing 'a' with 'z', 'b' with 'y' etc. (inversion) or 'a' with 'd' and 'b' with 'e' etc. (displacement or shift). But it can also be a complex mapping where both parties must know the scheme. With the introduction of teletypes and computers, characters can now be encoded using a binary technique. These binary representations can be manipulated with boolean operations to change the original character or encipher it.
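
As a rough sketch of the boolean approach, the exclusive-or (XOR) operation is one commonly used choice; the text above does not say which operation is meant, so treat XOR, the function name xor_bytes and the key value as assumptions.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


plain = "attack".encode("ascii")   # characters as binary values
key = b"\x2a"                      # a single-byte key, chosen arbitrarily
cipher = xor_bytes(plain, key)
print(cipher.hex())
print(xor_bytes(cipher, key).decode("ascii"))  # attack
```

Because XOR undoes itself, the same function both enciphers and deciphers.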

Monoalphabets

Monoalphabets retain the same mapping throughout the message. This leads to a relative ease of cryptanalysis or 'cracking' based on statistical analysis of the source language. However, studying the methods of encryption and analysis for monoalphabets will lead to a better understanding of the entire field of cryptology.

Atbash is one of the oldest ciphers known. It even appears in the Hebrew Scriptures of the Bible. Basically, any occurrence of the first letter of the alphabet is replaced by the last letter, occurrences of the second by the second to last, etc. This is not very secure, as a single test will indicate if it was used, but it was sufficient when literacy was not widespread. Atbash is a specific example of the general technique called inversion.
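
Here is a small sketch of inversion applied to the 26-letter English alphabet (the original Atbash used the Hebrew alphabet, so the alphabet and the function name atbash are assumptions made for illustration).

```python
import string

ALPHABET = string.ascii_lowercase
# Map a->z, b->y, ... by pairing the alphabet with its reverse.
ATBASH_TABLE = str.maketrans(ALPHABET, ALPHABET[::-1])


def atbash(text: str) -> str:
    """Replace each letter by its mirror image in the alphabet."""
    return text.lower().translate(ATBASH_TABLE)


print(atbash("hello"))          # svool
print(atbash(atbash("hello")))  # hello
```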

Caesar is also a very old cipher. Letters are simply replaced by letters three steps further down the alphabet. That is, 'a' becomes 'd', 'b' becomes 'e', etc. In fact, a displacement of any size is known as a Caesar. This can easily be checked for, as there are only 25 other possible mappings to try. Caesar is a specific example of the general technique called displacement.
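
A displacement, and the 'try every mapping' check, might be sketched as follows. The function name shift and the sample message are my own choices for illustration.

```python
import string

ALPHABET = string.ascii_lowercase


def shift(text: str, k: int) -> str:
    """Displace every letter k steps down the alphabet, wrapping at 'z'."""
    k %= 26
    table = str.maketrans(ALPHABET, ALPHABET[k:] + ALPHABET[:k])
    return text.lower().translate(table)


cipher = shift("attack at dawn", 3)
print(cipher)  # dwwdfn dw gdzq

# Checking for a Caesar: undo every possible displacement and read the results.
candidates = [shift(cipher, -k) for k in range(26)]
for k, guess in enumerate(candidates):
    print(k, guess)
```

One of the printed lines is readable plaintext, which is all a cryptanalyst needs.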

Reciprocal alphabets are those where if 'a' maps to 'n' then 'n' maps to 'a' for all characters. Atbash is an example of a reciprocal alphabet.

Folding is a technique where the alphabet is split at a certain point. For example a simple fold might occur at the midpoint of the alphabet. This would map 'a' to 'n', 'b' to 'o' etc. Note that this would also provide a reciprocal alphabet.
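
To see which alphabets are reciprocal, a mapping can be applied twice and compared with the starting letter. This is only a sketch; the helper name is_reciprocal is an assumption, and the fold is taken at the midpoint as in the example above.

```python
import string

ALPHABET = string.ascii_lowercase


def is_reciprocal(mapping: dict) -> bool:
    """True if every letter maps back to itself after two applications."""
    return all(mapping[mapping[c]] == c for c in mapping)


# Fold at the midpoint: 'a' -> 'n', 'b' -> 'o', ...
fold = dict(zip(ALPHABET, ALPHABET[13:] + ALPHABET[:13]))
# Inversion: 'a' -> 'z', 'b' -> 'y', ...
inversion = dict(zip(ALPHABET, ALPHABET[::-1]))
# Displacement by three: 'a' -> 'd', 'b' -> 'e', ...
displacement = dict(zip(ALPHABET, ALPHABET[3:] + ALPHABET[:3]))

print(fold["a"], fold["b"])         # n o
print(is_reciprocal(fold))          # True
print(is_reciprocal(inversion))     # True
print(is_reciprocal(displacement))  # False
```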

The techniques of inversion, displacement, and folding could be intermixed to provide a more complex encipherment - decipherment technique. Unfortunately this complexity does not change the method of analysis of a monoalphabet cipher!

Keyword ciphers added a level of security in that even if the algorithm or technique was known, deciphering still required knowledge of a password or phrase. The most common form of use was to take the keyword (or phrase), discard any repeated letters, and then add the missing letters of the alphabet to the end of the string. This would form the replacement alphabet. As an example, if the key phrase was give me liberty or give me death then the replacement alphabet is givemlbrtyodahcfjknpqsuwxz. 'a' is mapped to 'g', 'b' is mapped to 'i', etc. Passwords could be changed on a regular basis or even on a use-once basis. Coupled with short texts to make frequency analysis difficult, this is a fairly good enciphering technique, as little skill is required to create the map on the fly.
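
Building the replacement alphabet from a key phrase could look like the sketch below. The function names keyword_alphabet and encipher are invented for this example.

```python
import string

ALPHABET = string.ascii_lowercase


def keyword_alphabet(phrase: str) -> str:
    """Key phrase letters first (repeats dropped), then the unused letters."""
    seen = []
    for ch in phrase.lower() + ALPHABET:
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def encipher(text: str, phrase: str) -> str:
    """Substitute each plaintext letter using the keyed alphabet."""
    table = str.maketrans(ALPHABET, keyword_alphabet(phrase))
    return text.lower().translate(table)


key = keyword_alphabet("give me liberty or give me death")
print(key)                                                # givemlbrtyodahcfjknpqsuwxz
print(encipher("bad", "give me liberty or give me death"))  # ige
```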

Polybius Checkerboard involves creating a 5x5 grid of letters (with i and j in the same cell). By numbering the columns and rows, any letter can be represented by a two digit number. For example the letter k would be column 5 row 2.
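
A sketch of the checkerboard follows. I have assumed the alphabet is written into the grid row by row and that the row digit is written before the column digit; the function name polybius is also my own.

```python
import string

# 25 letters: 'j' shares a cell with 'i'.
LETTERS = string.ascii_lowercase.replace("j", "")


def polybius(text: str) -> str:
    """Replace each letter by its row and column numbers in the 5x5 grid."""
    pairs = []
    for ch in text.lower().replace("j", "i"):
        if ch in LETTERS:
            row, col = divmod(LETTERS.index(ch), 5)
            pairs.append(f"{row + 1}{col + 1}")  # row digit first (assumption)
    return " ".join(pairs)


print(polybius("k"))   # 25  (row 2, column 5)
print(polybius("hi"))  # 23 24
```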

Playfair is a cipher that modifies the Polybius checkerboard by introducing a key. The key phrase is used to fill in the cells of the table, but any letter already used is dropped. Once the key phrase has been used up, the remaining letters of the alphabet are inserted in strict alphabetical order until all cells are filled. Once again i and j are assumed equivalent for building the table. To encipher plaintext one uses digraphs (i.e. two characters at a time).
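
Filling the Playfair table can be sketched like this. Only the table building described above is shown; the rules for enciphering each digraph are not covered here. The function name playfair_table and the choice of key phrase are assumptions.

```python
import string


def playfair_table(phrase: str) -> list:
    """Return the 5x5 table as five strings of five letters each."""
    cells = []
    # Key phrase first, then the alphabet; 'j' is folded into 'i'.
    for ch in (phrase.lower() + string.ascii_lowercase).replace("j", "i"):
        if ch in string.ascii_lowercase and ch not in cells:
            cells.append(ch)
    return ["".join(cells[r:r + 5]) for r in range(0, 25, 5)]


for row in playfair_table("give me liberty or give me death"):
    print(row)
# givem
# lbrty
# odahc
# fknpq
# suwxz
```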

Polyalphabets

Polyalphabet algorithms regularly change the mapping schemes for characters (i.e. 'a' does not always map to the same letter). This makes frequency count analysis more difficult, as the word 'the' will not always recur as the same sequence. But techniques do exist to spot any periodicity in alphabet reuse and hence give a clue as to keyword length.
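
One way to picture this is a scheme where each letter of a keyword selects a different displacement in turn. This page does not name a particular polyalphabet cipher, so the Vigenere-style scheme below, the function name poly_encipher and the keyword 'lemon' are all assumptions made for illustration.

```python
import string

ALPHABET = string.ascii_lowercase


def poly_encipher(text: str, keyword: str) -> str:
    """Shift each letter by the amount given by the next keyword letter."""
    out = []
    i = 0  # counts letters only, so spaces do not use up keyword letters
    for ch in text.lower():
        if ch in ALPHABET:
            k = ALPHABET.index(keyword[i % len(keyword)])
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
            i += 1
        else:
            out.append(ch)
    return "".join(out)


print(poly_encipher("the cat the dog", "lemon"))  # elq qne xts qzk
```

The two occurrences of 'the' come out as 'elq' and 'xts'. They would match again only when they happen to line up with the same keyword letters, and that periodicity is what the analyst looks for.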

Cryptanalysis

One of the strongest tools of the cryptanalyst is frequency counts. Using letter and word occurrences in natural language and frequency counts in the ciphertext under analysis, clues as to the letter mappings can be tested in a logical order. You may wish to refer to the suite of programs I have developed for this purpose.
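
A bare-bones letter count, unrelated to the suite of programs mentioned above, might look like this sketch (the function name letter_frequencies is made up for the example).

```python
from collections import Counter


def letter_frequencies(ciphertext: str) -> list:
    """Return (letter, count) pairs, most frequent first."""
    letters = [c for c in ciphertext.lower() if c.isalpha()]
    return Counter(letters).most_common()


for letter, count in letter_frequencies("dwwdfn dw gdzq"):
    print(letter, count)
```

In a longer monoalphabet ciphertext the most frequent symbols are good first guesses for the most frequent letters of the source language.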

Encrypting Procedures

A cryptanalyst uses regularity and predictability as part of his analysis technique. To counter this, an encrypter normally follows certain procedures that reduce the predictability of the message.

Short Messages
The longer the text the more chance of patterns occurring and patterns are what cryptanalysts look for. Keep the plaintext as terse as possible to make decryption more difficult.
Case Sensitivity
Since uppercase letters may indicate sentence beginnings and/or proper nouns, all messages should be reduced to lowercase.
Word Frequency Analysis
Ciphertext messages should not be written in word format. One common method, used primarily for radio transmission, is to block the text in five-character sequences. Another technique, which may confuse analysts for a short while, is to write the ciphertext in random word-like groups. This is a delaying tactic, but all codes are breakable and it is time that the encoder is buying.
Punctuation
Punctuation makes sentence endings and beginnings predictable and so elimination is mandatory. Embedding words like stop in the text is also foolish. The deciphered text should be readable and punctuation replaced manually if required.
Message Fills
If a message is blocked in x character format for transmission, fill characters should be random and not the predictable xxx string. And for variation, fills can be prepended to the text rather than appended.
Common Abbreviations
If repeated use of a common abbreviation is required, on random occasions it should be spelt out. Once again, a high rate of repetition can lead to breaking the cipher.
Common Words
Common words such as the, an, and of can often be omitted without loss of message sense.
Spelling
Variation of a word's spelling such as American/British forms and even homonyms or misspellings can be used to reduce repetition. Remember the key is to introduce randomness.
Standard Forms
Often opening and closing information is routine such as who the message is for and courtesy closings. By breaking the plaintext into sections and then intermixing them in a random fashion, there is a chance that the 'preamble' and 'postamble' may go undetected.
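
Several of these procedures (lowercase only, no punctuation, five-character blocks, random fills) are mechanical enough to sketch in code. The function names normalize and block are my own, and I have assumed the fill is appended rather than prepended.

```python
import secrets
import string

LETTERS = string.ascii_lowercase


def normalize(plaintext: str) -> str:
    """Reduce to lowercase and drop spaces, digits and punctuation."""
    return "".join(c for c in plaintext.lower() if c in LETTERS)


def block(ciphertext: str, size: int = 5) -> str:
    """Group text into fixed-size blocks, filling the last with random letters."""
    pad = (-len(ciphertext)) % size
    # Random fill letters instead of a predictable 'xxx' string.
    filled = ciphertext + "".join(secrets.choice(LETTERS) for _ in range(pad))
    return " ".join(filled[i:i + size] for i in range(0, len(filled), size))


print(normalize("Attack at dawn. Stop!"))  # attackatdawnstop
print(block("dwwdfndwgdzq"))               # last block ends in three random letters
```

The plaintext would be normalized before enciphering and the resulting ciphertext blocked just before transmission.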

References

Everyone needs references to build his knowledge from, and the more sources the better. In the case of cryptology some references are more historical in nature, tracing the development of the science, while others are rooted in the mathematical basis of creating or analyzing an algorithm. For now I rely on printed media but would like to gather pointers to videos on the topic as well.

Library Bookshelf

  • "Classical Cryptography Course" by Randall Nichols
  • "Cryptanalysis" by Helen Fouche Gaines, Dover [1939]
  • "The Codebreakers" by David Kahn, Signet [1967]
  • "Codes and Ciphers" by Peter Way, Aldus Books UK [1977]
  • "Top Secret Data Encryption Techniques" by Gilbert Held, Sams [1993]


[cyhome.htm:2003 01 01]