Cryptology Home Page

Secret decoder rings, pig Latin, writing with lemon juice, messages in code to a buddy, WW2 espionage films: ah, the fun of being young and having time to play. But concealing the meaning of a message from others can also be a very serious business. From medieval times, when diplomats had to communicate with their rulers, through wartime, when orders from headquarters had to be broadcast to the line officers, to the present, where corporate deals must remain secret until completed, confidentiality of messages remains a very high priority.

This page is an exploration of the fundamentals of cryptology. It will not delve too deeply into mathematical proofs; it is meant for those who wish to be aware of some of the basic techniques and strategies of code making and code breaking. Comments are always welcome.

Steganography

Steganography is the art of hiding the existence of a message rather than its meaning. It is currently a very HOT topic, as one subdivision of steganography is digital watermarking, a technique used to copyright and to protect digital images, music and software. Although steganography is a very interesting topic in itself, the following short sections will be all that I say on it. Sorry!

Low Tech Steganography
High Tech Steganography
Cryptography

Cryptography is the science of concealing a message's meaning rather than its existence. It can be subdivided into codes and ciphers. Codes are based on linguistic entities of variable character length such as syllables, words and phrases. Ciphers are based on fixed length elements without regard to meaning. The original message is known as plaintext and can be either encoded to codetext or enciphered to ciphertext. To read the message afterwards it must be either decoded or deciphered.

Codes normally involve code books, which are lists of words or phrases and their replacement codes. If the replacement codes are in the same alphabetic order as the plaintext entries, so that the same list can be used for decoding, it is known as a one-part code. If a second list is needed to sort the codes alphabetically, then the encoding scheme is known as a two-part code. The difficulty with codes is that the code book itself must be distributed to all readers, and if it gets into insecure hands, the code is compromised and a new code needs to be distributed.

An interesting sidenote to codes is the Navajo code talkers of WW2 fame. The use of an obscure foreign language delayed decryption until the message's purpose had been served. For a look at the code refer to Code Talk.

Ciphers, on the other hand, rely on a system (an algorithm or procedure) for enciphering without need for a code book. Sometimes this system also requires a keyword for correct use. This allows a cipher to continue in use even when its system is known by others and even when a single keyword is known. Changing the keyword in current use once again conceals the meaning.

Since the most interesting aspects of cryptology (at least for me) are the methods of enciphering and those of 'cracking' the messages, the rest of this page will emphasize cipher techniques rather than coding.

Ciphers

Ciphers are based on algorithms that transform the plaintext into ciphertext.
These algorithms may also require the application of a keyword, which introduces another level of security and flexibility to the overall system. All systems of ciphers can be classified as either transposition or substitution methods.

Transposition methods involve moving characters to new positions based on an algorithm or procedure. For example, the two characters of each pair (known as a digraph) can be swapped so that 'an' becomes 'na'. Obviously, the algorithm can become much more involved if needed.

Substitution methods use a mapping technique that replaces characters (or sometimes sets of characters) by other characters (or sets). If the mapping of a specific character does not change within the message, the scheme is known as a monoalphabet scheme. The mapping technique can be very simple, such as replacing 'a' with 'z', 'b' with 'y' etc. (inversion) or 'a' with 'd', 'b' with 'e' etc. (displacement or shift). But it can also be a complex mapping where both parties must know the scheme.

With the introduction of teletypes and computers, characters can now be encoded using a binary technique. These binary representations can be manipulated with boolean operations to change the original character, that is, to encipher it.

Monoalphabets

Monoalphabets retain the same mapping throughout the message. This leads to a relative ease of cryptanalysis or 'cracking' based on statistical analysis of the source language. However, studying the methods of encryption and analysis for monoalphabets will lead to a better understanding of the entire field of cryptology.

Atbash is one of the oldest ciphers known. It even appears in the Hebrew Scriptures of the Bible. Basically, any occurrence of the first letter of the alphabet is replaced by the last letter, occurrences of the second by the second to last, etc. This is not very secure, as a single test will indicate if it was used, but it was sufficient when literacy was not widespread.
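
As a rough illustration of the simple schemes mentioned so far (swapping digraphs, inversion, and displacement), here is a minimal Python sketch. The function names, and the choice to lowercase the text and pass non-letters through unchanged, are assumptions made for illustration only.

```python
import string

ALPHABET = string.ascii_lowercase


def swap_digraphs(text):
    """Transposition: swap the two characters of each pair ('an' -> 'na').
    A trailing unpaired character is left where it is."""
    chars = list(text)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def invert(text):
    """Substitution by inversion: 'a' <-> 'z', 'b' <-> 'y', and so on."""
    table = str.maketrans(ALPHABET, ALPHABET[::-1])
    return text.lower().translate(table)


def shift(text, displacement=3):
    """Substitution by displacement: move each letter down the alphabet,
    wrapping around from 'z' to 'a'."""
    k = displacement % 26
    table = str.maketrans(ALPHABET, ALPHABET[k:] + ALPHABET[:k])
    return text.lower().translate(table)


print(swap_digraphs("secret"))       # esrcte
print(invert("secret"))              # hvxivg
print(shift("secret"))               # vhfuhw
print(shift(shift("secret"), -3))    # secret
```

Applying invert twice returns the original text, and a displacement is undone by the opposite displacement.
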
Atbash is a specific example of the general technique called inversion.

Caesar is also a very old cipher. Letters are simply replaced by letters three steps further down the alphabet. That is, 'a' becomes 'd', 'b' becomes 'e' etc. In fact a displacement of any size is known as a Caesar. This can easily be checked for, as a 26-letter alphabet allows only 25 distinct non-trivial displacements. Caesar is a specific example of the general technique called displacement.

Reciprocal alphabets are those where if 'a' maps to 'n' then 'n' maps to 'a', and likewise for all characters. Atbash is an example of a reciprocal alphabet.

Folding is a technique where the alphabet is split at a certain point. For example a simple fold might occur at the midpoint of the alphabet. This would map 'a' to 'n', 'b' to 'o' etc. Note that this would also provide a reciprocal alphabet.

The techniques of inversion, displacement, and folding could be intermixed to provide a more complex encipherment and decipherment technique. Unfortunately this complexity does not change the method of analysis of a monoalphabet cipher!

Keyword ciphers added a level of security in that even if the algorithm or technique was known, deciphering still required knowledge of a password or phrase. The most common form of use was to take the keyword (or phrase), discard any repeated letters, and then add the unused letters of the alphabet, in order, to the end of the string. This would form the replacement alphabet. As an example, if the key phrase was give me liberty or give me death then the replacement alphabet is givemlbrtyodahcfjknpqsuwxz. 'a' is mapped to 'g', 'b' is mapped to 'i' etc. Passwords could be changed on a regular basis or even on a use once basis. Coupled with short texts to make frequency analysis difficult, this is a fairly good enciphering technique, as little skill is required to create the map on the fly.

Polybius Checkerboard involves creating a 5x5 grid of letters (with i and j in the same cell).
By numbering the columns and rows, any letter can be represented by a two digit number. For example the letter k would be column 5 row 2.

Playfair is a cipher that modifies the Polybius checkerboard by introducing a key. The key phrase is used to fill in the cells of the table, but any letter already used is dropped. Once the key phrase has been used up, the remaining letters of the alphabet are inserted in strict alphabetical order until all cells are filled. Once again i and j are assumed equivalent for building the table. To encipher plaintext one uses digraphs (i.e. two characters at a time).

Polyalphabets

Polyalphabet algorithms regularly change the mapping schemes for characters (i.e. 'a' does not always map to the same letter). This makes frequency count analysis more difficult, as the word 'the' will not always recur as the same ciphertext sequence. But techniques do exist to spot any periodicity in alphabet reuse, and hence give a clue as to keyword length.

Cryptanalysis

One of the strongest tools of the cryptanalyst is frequency counts. Using letter and word occurrences in natural language and frequency counts in the ciphertext under analysis, clues as to the letter mappings can be tested in a logical order. You may wish to refer to the suite of programs I have developed for this purpose.

Encrypting Procedures

A cryptanalyst uses regularity and predictability as part of the analysis technique. To counter this, an encrypter normally follows certain procedures that reduce this predictability.
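
Three of the procedures described on this page are mechanical enough to sketch in a few lines of Python: building a keyword alphabet, locating a letter on the Polybius checkerboard, and taking letter frequency counts of a ciphertext. This is only a sketch under stated assumptions. The function names and the sample message are hypothetical, and it is not the suite of programs mentioned under Cryptanalysis.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase


def keyword_alphabet(phrase):
    """Keep the first occurrence of each letter in the phrase, then
    append the unused letters in alphabetical order."""
    seen = []
    for ch in phrase.lower() + ALPHABET:
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def encipher(text, cipher_alphabet):
    """Monoalphabet substitution: 'a' maps to cipher_alphabet[0], etc."""
    return text.lower().translate(str.maketrans(ALPHABET, cipher_alphabet))


def polybius(letter):
    """Return (row, column), both numbered from 1, in a 5x5 grid
    filled in plain alphabetical order with i and j sharing a cell."""
    letter = letter.lower()
    grid = ALPHABET.replace("j", "")  # 25 letters
    index = grid.index("i" if letter == "j" else letter)
    return index // 5 + 1, index % 5 + 1


def letter_counts(ciphertext):
    """Count letters only, most frequent first."""
    letters = (ch for ch in ciphertext.lower() if ch in ALPHABET)
    return Counter(letters).most_common()


key = keyword_alphabet("give me liberty or give me death")
print(key)              # givemlbrtyodahcfjknpqsuwxz
print(polybius("k"))    # (2, 5), i.e. row 2, column 5

secret = encipher("meet me at the usual place", key)
print(secret)           # ammp am gp prm qnqgd fdgvm
print(letter_counts(secret)[:3])
```

Even in this short sample the most common ciphertext letter, 'm', stands for plaintext 'e', which is the kind of clue a frequency count offers the analyst. With texts this short such a match is far from guaranteed.
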
References

Everyone needs references to build knowledge from, and the more sources the better. In the case of cryptology some references are more historical in nature, tracing the development of the science, while others are rooted in the mathematical basis of creating or analyzing an algorithm. For now I rely on printed media but would like to gather pointers to videos on the topic as well.

Library Bookshelf
Useful Web Sites
[cyhome.htm:2003 01 01]