American Standard Code for Information Interchange


The American Standard Code for Information Interchange (ASCII; pronounced “askee”) is a method of encoding alphabetic and numeric data in digital format. The American Standards Association (now called the American National Standards Institute [ANSI]) first published it as a standard in 1963, and a revised version was introduced in 1968. Although ASCII code was originally developed for the teletypewriter industry, it has since found widespread use in computer and information-transfer technologies. The standard has been revised periodically. It defines codes for 33 nonprinting control characters, most of which are now rarely used. ASCII also defines the following 95 frequently used printable characters (including the space character):

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

Because ASCII code is standardized, computers, other electronic devices, and computer programs can use it to exchange data with each other. This is true even of computers that use different operating systems, for example personal computers (PCs) and Macintoshes (Macs).

As originally formulated, each ASCII-encoded representation consisted of a string of seven digits, where each digit was either a 0 or a 1 (i.e., binary code). As a result, there were 128 (two to the seventh power) possible ways of arranging these 0s and 1s. In this representation, each character was uniquely assigned a number between 0 and 127, which was represented by its binary equivalent in a string of seven 0s and 1s. The ASCII notation for the capital letter A, for example, is the binary code representation (1000001) for the base-10 number 65; similarly, a blank space has the binary code (0100000) for the base-10 number 32.
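The mapping from a character to its number and seven-bit string can be sketched in a few lines of Python. This is only an illustration; the helper name `to_seven_bit` is invented here and is not part of any standard.

```python
def to_seven_bit(ch: str) -> str:
    """Return the seven-bit binary string for one ASCII character."""
    code = ord(ch)  # the number assigned to the character
    if code > 127:
        raise ValueError(f"{ch!r} is not a seven-bit ASCII character")
    return format(code, "07b")  # binary, zero-padded to seven digits


for ch in ("A", " "):
    print(repr(ch), ord(ch), to_seven_bit(ch))
# 'A' 65 1000001
# ' ' 32 0100000
```

Seven binary digits allow 2 ** 7 = 128 distinct strings, which is why the codes run from 0 to 127.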

Inside computers, each English alphabet character is represented by a string of eight 0s and 1s. Each of the digits in the string is known as a bit, and a series of eight bits is known as a byte. Because ASCII code as originally formulated used only seven bits, there was one bit left over when seven-bit ASCII code was embedded in the eight-bit computer code.

At one time, this extra bit was used primarily for the purpose of checking errors in data transmission. However, today, computers use this extra bit to encode an additional 128 characters for the purpose of representing special symbols. Note that with eight-bit encoding, the number of possible arrangements of 0s and 1s increases from 128 to 256.

As an example of the eight-bit encapsulation of seven-bit ASCII code, note that the eight-bit representation for the letter “A” (01000001) simply places a 0 in the eighth-bit position relative to the seven-bit representation (1000001). Seven-bit ASCII still has some advantages, however, as it is recognized by all computers, including PCs, Macs, UNIX or VMS mainframes, printers, and any other computer-related equipment.
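The two historical uses of the eighth bit can be illustrated with a short Python sketch. The function names are hypothetical, and the even-parity scheme shown is an assumption chosen for illustration; the text above does not say which error-checking scheme was used.

```python
def to_eight_bit(ch: str) -> str:
    """Embed a seven-bit ASCII code in a byte by adding a leading 0."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not a seven-bit ASCII character")
    return format(code, "08b")


def with_even_parity(ch: str) -> str:
    """Use the eighth bit as an even-parity check bit (assumed scheme)."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not a seven-bit ASCII character")
    # Set the top bit so that the total number of 1s in the byte is even.
    parity = bin(code).count("1") % 2
    return format((parity << 7) | code, "08b")


print(to_eight_bit("A"))      # 01000001
print(with_even_parity("A"))  # 01000001 (two 1s already, parity bit is 0)
print(with_even_parity("C"))  # 11000011 (three 1s, parity bit is 1)
```

With parity in use, a receiver that counts an odd number of 1s in a byte knows that at least one bit was corrupted in transmission.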

Eight-bit ASCII code is known as extended ASCII code. One widely used version was introduced by International Business Machines Corporation (IBM) in 1981 for use in its first personal computer, and it quickly became a standard in the personal computer industry. Like the original seven-bit ASCII code, the extended code uses the first 32 of its 256 character representations (codes 0 through 31) to encode nonprinting commands such as form feed.

The next 32 character representations (codes 32 through 63) are reserved for the space character, numbers, and punctuation marks. Thirty-two more representations (codes 64 through 95) are for uppercase letters and additional punctuation marks. The following 32 representations (codes 96 through 127) are reserved for lowercase letters, a few remaining symbols, and the delete control character. Note that the upper- and lowercase forms of each letter have distinct representations that differ by 32. The remaining 128 representations (codes 128 through 255) hold the special symbols added by the extension.
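The grouping into blocks of 32 and the fixed offset between upper- and lowercase letters can be checked with a short Python sketch. The helper names `block_of` and `toggle_case` are made up for this illustration.

```python
def block_of(ch: str) -> int:
    """Return which block of 32 codes a character falls in (0, 1, 2, ...)."""
    return ord(ch) // 32


def toggle_case(ch: str) -> str:
    """Switch an ASCII letter between upper- and lowercase.

    The two forms differ by 32, which is a single bit (0x20), so flipping
    that bit converts one form into the other.
    """
    if ch.isascii() and ch.isalpha():
        return chr(ord(ch) ^ 0x20)
    return ch  # leave digits, punctuation, and non-ASCII text unchanged


print(block_of("\f"), block_of("7"), block_of("A"), block_of("a"))  # 0 1 2 3
print(ord("a") - ord("A"))                                          # 32
print(toggle_case("A"), toggle_case("z"), toggle_case("7"))         # a Z 7
```

Because 32 is a power of two, changing case only requires flipping one bit rather than consulting a lookup table.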

In languages other than English, where there are much larger character sets, for example Chinese and Japanese, a single byte (eight bits) is not sufficient for representing all the characters in the language. However, by representing each character by two bytes (16 bits), it is possible to assign a unique number code to each character.
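As a rough illustration of the two-byte idea, the Python sketch below uses UTF-16 (big-endian) as the two-byte scheme. That choice is an assumption made for this example; the text does not name a particular encoding, and other two-byte encodings assign different numbers. The helper name `two_byte_code` is also invented here.

```python
TWO_BYTE_CODES = 2 ** 16  # number of distinct values that fit in 16 bits


def two_byte_code(ch: str) -> bytes:
    """Encode one character as exactly two bytes, using UTF-16 big-endian."""
    encoded = ch.encode("utf-16-be")
    if len(encoded) != 2:
        raise ValueError(f"{ch!r} does not fit in a single two-byte code")
    return encoded


print(TWO_BYTE_CODES)             # 65536
print(two_byte_code("日").hex())  # 65e5
print(two_byte_code("A").hex())   # 0041
```

The letter A keeps its ASCII number, 65 (hexadecimal 41), and simply gains a leading zero byte, while the Japanese character receives a number far beyond the 256 values a single byte can hold.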

In the United States, most computers require slightly modified operating systems to be able to handle two bytes at a time, as well as special reference tables to display the characters. It is therefore necessary to change operating systems before one can run Japanese or Chinese software. But U.S. software applications will run without problems on computers in Japan and China equipped with operating systems that recognize two-byte characters.

The term ASCII is sometimes used imprecisely to refer to a type of text document. A file that contains ASCII text (also known as plain text) is one that does not contain any special embedded formatting codes. This encoding system not only lets a computer store a document as a series of numbers, but also lets the computer share the document with other computers that use the ASCII system.