Software Architecture: Message Digests and Keys

A message digest is analogous to a handwritten signature in the real world. Digests are a convenient and useful way of authenticating messages.

Web-o-pedia defines message digest as:

The representation of text in the form of a single string of digits, created using a formula called a one-way hash function. Encrypting a message digest with a private key creates a digital signature, which is an electronic means of authentication (p.1)

The message is taken in its entirety as input and a small fingerprint is computed from it; the message is then sent together with that fingerprint. When the recipient recomputes the fingerprint and it matches the one received, the message is known not to have changed during transmission. A message may be sent in plain text along with a message digest in the same transmission. The idea is that the recipient can verify that the plain text was transmitted unaltered by examining the digital signature. The most popular algorithm for message digests is MD5 (IrnisNet.com, n.d.). Created at the Massachusetts Institute of Technology, it was placed in the public domain as Internet RFC 1321.
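The exchange described above can be sketched in Python with the standard-library hashlib module. This is only an illustration under stated assumptions: the function names make_packet and verify_packet are hypothetical, and the digest here is not encrypted with a private key, so it detects changes but is not a digital signature.

```python
import hashlib
from typing import Tuple


def make_packet(message: bytes) -> Tuple[bytes, str]:
    """Sender side: return the message together with its MD5 fingerprint."""
    return message, hashlib.md5(message).hexdigest()


def verify_packet(message: bytes, fingerprint: str) -> bool:
    """Recipient side: recompute the fingerprint and compare it."""
    return hashlib.md5(message).hexdigest() == fingerprint


message, fingerprint = make_packet(b"transfer 100 to account 42")
print(verify_packet(message, fingerprint))                         # unchanged message
print(verify_packet(b"transfer 900 to account 42", fingerprint))   # altered message
```

The first check succeeds and the second fails, because any change to the message produces a different fingerprint.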

MD5

MD5, developed by Dr. Ronald L. Rivest, is an algorithm that takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input (Abzug, 1991). While not mathematically proven, it is conjectured that it is not feasible to recover a message from its digest. In other words, it is computationally infeasible to “produce any message having a given pre-specified target message digest” (Abzug, 1991).

MD5 is described in Request for Comments (RFC) 1321. Rivest (1992) summarized MD5 as:

The MD5 algorithm is an extension of the MD4 message-digest algorithm. MD5 is slightly slower than MD4, but is more "conservative" in design. MD5 was designed because it was felt that MD4 was perhaps being adopted for use more quickly than justified by the existing critical review; because MD4 was designed to be exceptionally fast, it is "at the edge" in terms of risking successful cryptanalytic attack. MD5 backs off a bit, giving up a little in speed for a much greater likelihood of ultimate security. (p.3)

MD5 is an enhancement of MD4. Rivest (1992) describes it as more conservative than its predecessor while still being easy to code compactly. The algorithm produces a fingerprint for a message of any length. By design, finding two messages (plain text) that resolve to the exact same fingerprint takes on the order of 2 to the power of 64 operations, and finding a plain-text message that matches a given fingerprint takes on the order of 2 to the power of 128 operations. Numbers of that size were intended to make both attacks computationally infeasible. Practical collision attacks on MD5 have since been published, so the collision figure should be read as a design goal rather than a current guarantee.
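As a small illustration of the fixed 128-bit output, the following Python sketch uses hashlib.md5 from the standard library. The helper name md5_hex is hypothetical; the point is only that inputs of very different lengths yield fingerprints of the same size.

```python
import hashlib


def md5_hex(message: bytes) -> str:
    """Return the MD5 fingerprint of a message as a hexadecimal string."""
    return hashlib.md5(message).hexdigest()


short_digest = md5_hex(b"abc")
long_digest = md5_hex(b"abc" * 100_000)

# Each hexadecimal character encodes 4 bits, so 32 characters are 128 bits.
print(short_digest, len(short_digest) * 4)
print(long_digest, len(long_digest) * 4)
```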

SHA-1

The Secure Hash Algorithm 1 (SHA-1) was adopted by the United States of America as a Federal Information Processing Standard. SHA-1, as explained in RFC 3174, is employed for computing a condensed representation of a message or a data file (Jones, 2001). The algorithm accepts a message of any length below 2 to the power of 64 bits and outputs a 160-bit message digest; it is intended to be computationally infeasible to find a different input with the same digest. A freshly computed digest can therefore be validated against a previously stored one.
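A minimal Python sketch of computing a SHA-1 digest with the standard-library hashlib module follows. The helper sha1_of_chunks is a hypothetical name; it feeds the input in pieces, the way a large data file would typically be read, and yields the same digest as hashing the whole input at once.

```python
import hashlib
from typing import Iterable


def sha1_of_chunks(chunks: Iterable[bytes]) -> str:
    """Hash a message supplied piece by piece, e.g. blocks read from a file."""
    hasher = hashlib.sha1()
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()


one_shot = hashlib.sha1(b"abc").hexdigest()
streamed = sha1_of_chunks([b"a", b"bc"])

print(one_shot == streamed)   # same input, same digest
print(len(one_shot) * 4)      # digest length in bits
```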

Demonstration. For example, if the user registers with the password “purdue1234”, the SHA-1 algorithm can be applied, which results in the 160-bit digest “8ad4d7e66116219c5407db13280de7b4c2121e23”. This digest can be saved in the database instead of the plain-text password the user registers with. The next time the user signs on with the same plain-text password, it is converted to the same digest, which can then be compared with the stored one to authenticate the user. If the user enters a different password, say “rohit1234”, SHA-1 digests it as “fb0f57cb70fbd8926f2912585854cbe4bcf83942”. This triggers a mismatch and the authentication fails. The algorithm always generates the same 160-bit digest for the same plain text, and it is computationally infeasible to reverse the digest into the plain text. Therefore, even if the database is “hacked”, the stored values do not directly reveal the passwords. This is one of the most common techniques employed in the industry for saving sensitive data that only needs to be verified and not reused.
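The register-then-verify flow described above might look like the following Python sketch. The dictionary standing in for the database and the function names register and authenticate are hypothetical, and the digests are computed at run time rather than copied from the text.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the user database: username -> digest.
stored_digests = {}


def sha1_hex(password: str) -> str:
    """Return the SHA-1 digest of a password as a hexadecimal string."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the digest, never the plain-text password."""
    stored_digests[username] = sha1_hex(password)


def authenticate(username: str, password: str) -> bool:
    """Digest the supplied password and compare it with the stored digest."""
    expected = stored_digests.get(username)
    if expected is None:
        return False
    return hmac.compare_digest(expected, sha1_hex(password))


register("user1", "purdue1234")
print(authenticate("user1", "purdue1234"))  # same password, digests match
print(authenticate("user1", "rohit1234"))   # different password, mismatch
```

Production systems usually go further than this sketch and combine each password with a per-user salt and a deliberately slow hashing function.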

DSA

The Digital Signature Algorithm (DSA) is an algorithm inherited from the National Security Agency (NSA) and published by the National Institute of Standards and Technology (NIST) in the Digital Signature Standard (DSS) as part of the United States government’s Capstone project (RSA Laboratories, n.d.). To understand DSA, the discrete logarithm problem needs to be explained. RSA Laboratories documentation explains that for a group element g, the result of multiplying g by itself n times is written g^n. The discrete logarithm problem is as follows: given two group elements g and h which belong to a finite group G, find an integer x such that g^x = h. The discrete logarithm problem is a complex one; it is considered a harder one-way function than the factoring problem on which other algorithms are based.
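The one-way nature of the problem can be illustrated with a toy Python sketch. The group is assumed to be the integers modulo a small prime p under multiplication, and the values p = 23, g = 5 and the function name discrete_log_bruteforce are chosen purely for illustration; real parameters are far larger, which is what makes the search infeasible.

```python
from typing import Optional


def discrete_log_bruteforce(g: int, h: int, p: int) -> Optional[int]:
    """Return the smallest x >= 0 with g**x % p == h, or None if there is none."""
    value = 1  # g**0
    for x in range(p - 1):
        if value == h:
            return x
        value = (value * g) % p
    return None


p, g = 23, 5          # toy parameters, assumed for illustration only
secret_x = 13

# Forward direction: one fast modular exponentiation.
h = pow(g, secret_x, p)

# Reverse direction: exhaustive search over candidate exponents.
recovered_x = discrete_log_bruteforce(g, h, p)
print(h, recovered_x)
```

Computing h from x takes a single call to pow, while recovering x from h in this sketch requires trying exponents one by one.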

Algorithm implementations that have emerged are quick, with a big-O of “O(O(n))”. The big-O notation is a theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items (NIST, 1976). In DSA, signature generation is faster than signature verification, whereas with the RSA algorithm verification is much faster than generation of the signature itself (RSA Laboratories, n.d.). Initial criticism of the algorithm centered on its lack of flexibility when compared with the RSA cryptosystem, its verification performance, adoption issues cited by hardware and software vendors that had standardized on RSA, and finally the discretionary selection of the algorithm by the NSA (RSA Laboratories, n.d.). DSA has since been incorporated into several specifications and implementations, and it can now be considered a good choice for adoption by the enterprise.

Secret Keys

Two general types of cryptosystems have evolved over the decades: secret-key cryptography and public-key cryptography. In secret-key cryptography, as the name suggests, a key is kept secret from the public domain; only the recipient and the sender have knowledge of the key. This is also known as symmetric key cryptography. In a public-key cryptography system, two keys play a role in ensuring security. The public key is published or can be requested, while the private key is kept secret by its owner. This scheme requires a Certificate Authority so that tampering with public keys is prevented. The primary advantage of the public-key scheme over the secret-key scheme is that no secure courier is needed to transfer a secret key. Its main disadvantage is that broadcasting of encrypted messages is not possible.

Symmetric Keys

This scheme is characterized by the use of one single key that can both encrypt and decrypt the plain-text message. The encryption and decryption algorithms exist in the public domain, so the security of the scheme rests entirely on knowledge of the key. If the key is known only to the parties that are in a secured communication mode, secrecy can be provided (Barkley, 1994). When symmetric key cryptography is used for communications and the messages are intercepted by a hacker, it is computationally infeasible to derive the key or decrypt the message from the cipher, even if the encryption algorithm is known. The cipher can only be decrypted if the secret key is known. Because the secret key is known only by the message sender and the message receiver, the secrecy of the transmission can be guaranteed.
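The idea of one key serving for both encryption and decryption can be shown with a toy Python sketch. It assumes a one-time pad built on XOR, chosen only because it fits in a few lines of standard-library code; it is not the cipher Barkley describes, and the name xor_bytes is hypothetical. The key must be as long as the message and must never be reused.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length; the same call encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
shared_key = secrets.token_bytes(len(message))  # known only to sender and receiver

cipher = xor_bytes(message, shared_key)     # sender encrypts
recovered = xor_bytes(cipher, shared_key)   # receiver decrypts with the same key

print(recovered == message)
```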

MAC. While secrecy can be guaranteed, the integrity of the message cannot. To ensure that the message has integrity, a cryptographic checksum called the Message Authentication Code (MAC) is appended to the message. A MAC is a type of message digest: it is smaller than the original message, it cannot be reverse engineered, and colliding messages are hard to find. The MAC is computed by the message originator, or sender, as a function of the message being transmitted and the secret key (Barkley, 1994).
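A MAC computed from the message and the secret key can be sketched in Python with the standard-library hmac module. HMAC with SHA-1 is an assumed choice of construction here, since the text does not name one, and the function names make_mac and verify_mac are hypothetical.

```python
import hashlib
import hmac


def make_mac(secret_key: bytes, message: bytes) -> str:
    """Sender side: compute the MAC as a function of the message and the key."""
    return hmac.new(secret_key, message, hashlib.sha1).hexdigest()


def verify_mac(secret_key: bytes, message: bytes, mac: str) -> bool:
    """Receiver side: recompute the MAC with the shared key and compare."""
    return hmac.compare_digest(make_mac(secret_key, message), mac)


key = b"shared-secret-key"          # illustrative value only
message = b"pay 100 to account 42"
mac = make_mac(key, message)

print(verify_mac(key, message, mac))                     # intact message
print(verify_mac(key, b"pay 900 to account 42", mac))    # altered message
print(verify_mac(b"wrong-key", message, mac))            # wrong key
```

Only the first check succeeds: a changed message or a different key produces a different MAC.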

Asymmetric Keys

Asymmetric key cryptography differs in that each party holds a pair of keys: a public key that is made known to others and a private key that is kept secret. This scheme is also known as public-key cryptography. The public key is used to generate a function that transforms text (Barkley, 1994). The private key is secret and is known only to the party that owns the corresponding public key. The public keys are meant to be distributed. Both keys are part of a pair, and either one can be deemed public and the other private. Each key generates a transformation function; because the public key is known, its transformation can be derived and made known as well. In addition, the two functions have an inverse relationship: if one function encrypts a message, the other can be used to decrypt it (Barkley, 1994). The functions are used as follows. The sender requests the public key of the destination and uses it to transform the data to be transmitted. The sender then transmits the encrypted data to the intended receiver. The data is encrypted in transit and can only be decrypted with the private key that is paired with the public key that was used. On receiving the encrypted message, the receiver decrypts it with that private key, after which the message can be consumed.
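The inverse relationship between the two transformation functions can be illustrated with a toy Python sketch. Textbook RSA with tiny primes is assumed here purely as an example of a public-key scheme; the values p = 61, q = 53 and e = 17 and the function names are illustrative, and numbers this small offer no security. The three-argument pow with a negative exponent requires Python 3.8 or later.

```python
# Receiver's key generation with toy primes (assumed values, illustration only).
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: modular inverse of e mod phi

public_key = (e, n)          # distributed to anyone who asks
private_key = (d, n)         # kept secret by the receiver


def encrypt(message: int, key: tuple) -> int:
    """Sender side: transform the message with the receiver's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(cipher: int, key: tuple) -> int:
    """Receiver side: apply the inverse transformation with the private key."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)


message = 65                              # must be smaller than n
cipher = encrypt(message, public_key)
print(decrypt(cipher, private_key) == message)
```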

The advantage of such a scheme is that two users can communicate with each other without having to share a common key, whereas symmetric key cryptography usually requires a common key to be shared. A common key is usually a secret key, which is not something that should be passed around in the first place, and the distribution of secret keys adds a layer of complexity to the security of the system. Public-key cryptography resolves this issue. Because it is computationally infeasible to derive the private key from the public key, it is also infeasible for anyone without the private key to decrypt a message encrypted with the public key. This convenience comes at the cost of efficiency: encrypting plain text can take a long time, and the cipher text can be longer than the plain-text message itself. In addition, a message cannot be distributed to many recipients, because the private key is held by only one principal, so the scheme cannot be used for encrypted broadcasts. Applications of public-key cryptography that are often seen in the enterprise are authentication, integrity and non-repudiation.

Sunday, January 20, 2013
