
A. ASCII is the nearest thing to a standard for the exchange of English text that we have. Your word processor may have a different name for ASCII such as "plain text."
As you may know, computers do not really work with words or letters. Computers manipulate numbers. What we see as pictures and the pictures that we take to be letters are all numbers to the computer. The most convenient group of numbers for most computers to deal with is a byte. A byte is eight binary digits, which is the same thing as two hexadecimal digits, and which can be represented in decimal numbers by the integers 0 through 255.
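As a small illustration of the paragraph above, here is a sketch in Python of one byte value written three ways. The name `value` and the choice of 65 are only assumptions made for the example.

```python
# One byte holds a value from 0 through 255.
value = 65

binary = format(value, "08b")   # the same value as eight binary digits
hexa = format(value, "02x")     # the same value as two hexadecimal digits

print(value, binary, hexa)      # 65 01000001 41

# The largest value that fits in eight binary digits:
largest = int("11111111", 2)
print(largest)                  # 255
```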
Early in the history of computing, people saw that it would be desirable for various computers to have some way of exchanging text information. For various technical reasons some manufacturers were only using seven of the available eight binary digits in a byte for storing or transmitting text. That allowed only 128 codes (numbered, in computer fashion, starting with zero: 0 through 127). Fortunately, general agreement was reached on what each of those codes should represent when interpreted as text.
Codes 0 through 31 were used for various control characters, many of which have names reminiscent of Teletypes; what we would now call "beep" was then called "bell." A few of these control characters are still of importance today: carriage return, line feed, tab, and a few others. Code 32 was a space, and codes 33 through 47 were various simple punctuation marks and math symbols, all found on a typewriter keyboard: ! " # $ % & ' ( ) * + , - . /. Codes 48 through 57 were the digits 0 through 9. Codes 58 through 64 were more common symbols: : ; < = > ? @. Codes 65 through 90 were the capital letters A through Z. Codes 91 through 96 were more symbols: [ \ ] ^ _ `. Codes 97 through 122 were the lowercase letters a through z. The remaining codes, 123 through 127, were a few more symbols, { | } ~, and finally code 127, "delete," which is a control character rather than a printable mark.
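The layout of the code table can be sketched in a few lines of Python. The function name `describe` and its wording are assumptions made for this illustration, and the sketch treats code 127 as the "delete" control character, which is how the ASCII standard defines it.

```python
def describe(code):
    """Return a rough description of a 7-bit ASCII code (0 through 127)."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII codes run from 0 through 127")
    if code < 32 or code == 127:
        return "control character"
    if code == 32:
        return "space"
    ch = chr(code)
    if ch.isdigit():
        return "digit"
    if ch.isupper():
        return "capital letter"
    if ch.islower():
        return "lowercase letter"
    return "punctuation or symbol"

# A few sample codes from each part of the table.
for code in (7, 32, 48, 65, 96, 97, 126, 127):
    print(code, describe(code))
```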
That is ASCII: the association of those letters and symbols with the numbers 0 through 127. Actually it makes a little more sense in binary than in decimal numbers. For example, in binary the code for A is 0100 0001 and the code for a is 0110 0001. The difference ( 0010 0000 ) is the difference between every capital letter and its lowercase version. If you wanted to ignore case (wanted to ignore the difference between capital and lowercase letters) you only had to ignore the third binary digit from the left. Also the code for zero as might be printed in text is (0011 0000), and the code for one is (0011 0001). To change the code for zero to the number 0, a number that can be used for calculations, all you have to do is ignore the first four binary digits, which amounts to subtracting decimal 48.
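The two binary tricks just described can be tried out in Python. This is only a sketch: the names `CASE_BIT`, `same_letter`, and `digit_value` are made up for the example, and `same_letter` is meant only for the letters A through Z and a through z, since ignoring that bit would also make some unrelated symbols look alike.

```python
CASE_BIT = 0b0010_0000   # decimal 32, the third binary digit from the left

def same_letter(a, b):
    """Compare two ASCII letters while ignoring the case bit."""
    return (ord(a) | CASE_BIT) == (ord(b) | CASE_BIT)

def digit_value(ch):
    """Turn an ASCII digit character into a number by keeping the last four bits."""
    return ord(ch) & 0b0000_1111

print(format(ord("A"), "08b"))   # 01000001
print(format(ord("a"), "08b"))   # 01100001
print(same_letter("A", "a"))     # True
print(digit_value("7"))          # 7
print(ord("7") - 48)             # 7, the same result by subtraction
```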
Unfortunately, ASCII was about the last thing computer makers agreed upon. Various manufacturers and software developers assigned the codes above 127, when they became available, to a variety of different functions and characters. IBM used them mostly for various box-drawing characters. Perhaps you have noticed USENET articles with very strange-looking characters, or perhaps you have seen web pages in other languages in which some of the characters just don't look right. The letters which occur in English may seem okay, but accented characters may show up as math symbols or box-drawing characters. These are the results of there not being an agreement on what the codes above 127 should represent--and also of there being many more than 256 characters in the languages of the world, so there could not be an agreement that would please everyone. You are also very likely to see garbage where apostrophes or quotation marks should be. ASCII has only one code (39) for an apostrophe, a right single quote, and a left single quote, and only one code (34) for a right double quote, a left double quote, and the ditto mark. Some word processors have a "smart quotes" feature which uses different marks, with codes outside ASCII, for these various functions, and some people forget to turn this feature off when they prepare material for USENET or the Web. It looks fine to them because their computer associates those codes with the characters they expect. But it looks like garbage to anyone who has a different kind of computer.
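Here is a rough demonstration of the smart-quote problem in Python. The two code pages are assumptions chosen for the example: "cp1252" stands in for a Windows word processor and "cp437" for an older IBM PC. The same bytes come out as different characters depending on which table the reader's computer uses.

```python
text = "\u201cHi\u201d"           # Hi wrapped in curly ("smart") double quotes

# Stored with Windows code page 1252, each curly quote is one byte above 127.
raw = text.encode("cp1252")
print(raw)                        # b'\x93Hi\x94'

# A machine that assumes the old IBM PC code page 437 shows other characters.
garbled = raw.decode("cp437")
print(garbled)

# Plain ASCII has no codes for these marks at all.
try:
    text.encode("ascii")
    fits_ascii = True
except UnicodeEncodeError:
    fits_ascii = False
print(fits_ascii)                 # False
```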
While ASCII is fine for coding many kinds of text, it simply is not adequate for modern word processors.
There is no ASCII code for "use Roman 12 point here" or "put this in boldface" or "print this page landscape (sideways)." Yet word processors include this kind of information in their files--and it all has to be coded in numbers, because computers can only deal with numbers. This would not be so bad, but every brand of word processor has a different way of coding such information, and very often the codes differ between versions of the same software, or between copies of the same software running on different kinds of machines.
Now there is good news and bad news. The good news is that in most word processors your text, the actual letters and punctuation, probably is coded in ASCII, and what is more, that is the part an editor will want--because your choice of fonts, margins, printers, and so forth is immaterial to your editor. The bad news is that the editor may not be able to dig the ASCII-coded text he or she needs out of your word processing file. If you saved your file in Word 7.0 for Windows 95 and your editor is using WordPerfect 6.0 for DOS, then unless your editor is a major computer wizard, he or she won't be able to dig your text out of the file you sent. And it will be as bad or worse if you have sent your file as an attachment to e-mail. A word processing file has to be encoded again to be attached to e-mail, and you might encode it with UUENCODE, binhex, base64, or some other system. Moreover, you may not even know how your file was encoded, and your editor may not have the appropriate decoder.
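To give an idea of what encoding for e-mail involves, here is a small Python sketch using base64, one of the systems named above. The bytes in `file_bytes` are invented stand-ins for a word processing file, not taken from any real program. The point is only that arbitrary bytes are turned into plain ASCII characters for the trip, and that the receiver needs the matching decoder to get them back.

```python
import base64

# Stand-in bytes for a word processing file; real files are much longer.
file_bytes = bytes([0x00, 0xFF, 0x41, 0x53, 0x43, 0x49, 0x49, 0x93])

# The encoded form uses only ASCII letters, digits, "+", "/" and "=".
encoded = base64.b64encode(file_bytes)
print(encoded)

# The receiving end must apply the matching decoder to recover the file.
decoded = base64.b64decode(encoded)
print(decoded == file_bytes)             # True
```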
The solution, of course, is to strip all of the word processor codes out of the file so that only the ASCII -- the letters and punctuation -- remains. This is what "save as ASCII" (or "plain text" or whatever your word processor calls it) is for. If you are not absolutely sure that an editor can read your file, you should send an ASCII file. You can send a story in ASCII as e-mail without having to "file attach" it. It won't have to be encoded in any special way.
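If you want to check a piece of text before sending it, something like the following Python sketch would do. It is not what any particular word processor's "save as ASCII" command does; the table `REPLACEMENTS` and the function `to_plain_ascii` are hypothetical names invented here, and the table covers only the four curly quote marks.

```python
# Hypothetical replacement table: curly marks mapped to their ASCII stand-ins.
REPLACEMENTS = {
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote or apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
}

def to_plain_ascii(text):
    """Swap curly quotes for ASCII ones, then list anything still outside ASCII."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    leftovers = sorted({ch for ch in text if ord(ch) > 127})
    return text, leftovers

story = "\u201cDon\u2019t panic,\u201d she said."
plain, leftovers = to_plain_ascii(story)
print(plain)        # "Don't panic," she said.
print(leftovers)    # []
```

An empty `leftovers` list means every character has a code from 0 through 127, so the text can go straight into the body of an e-mail message.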
Lars Eighner