To us, a string such as "Hello world" looks like a series of letters with a space in the middle. To your computer, however, every String – in fact, everything – is a series of numbers.
In our example, each character of the String "Hello world" is represented by a number between 0 and 127. For example, to the computer, the capital letter "H" is encoded as the number 72, whereas the space is encoded as the number 32. The ASCII standard, originally developed from telegraph codes, specifies which number represents each character.
On most Unix-like operating systems, you can view the entire chart of ASCII codes by typing "man ascii" at the shell prompt. Wikipedia's page on ASCII also lists the ASCII codes. Using an ASCII chart, we discover that our string "Hello world" gets converted into the following series of ASCII codes.
H    e    l    l    o    space  w    o    r    l    d
72   101  108  108  111  32     119  111  114  108  100
You can also determine the ASCII code of a character by using the ? operator in Ruby.
    puts ?H
    puts ?e
    puts ?l
    puts ?l
    puts ?o
The question-mark syntax no longer returns a number as of Ruby 1.9; there, ?H evaluates to the one-character string "H". Instead, use the ord method.
    puts "H".ord
    puts "e".ord
    puts "l".ord
    puts "l".ord
    puts "o".ord
Notice that the output (below) of this program matches the ASCII codes for the "Hello" part of "Hello world".
    $ hello-ascii.rb
    72
    101
    108
    108
    111
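The same lookup can be done for the whole string at once. The following is a minimal sketch, assuming Ruby 1.9 or later (where ord is available); ascii_codes is a hypothetical helper name introduced here for illustration, not something built into Ruby.

```ruby
# Hypothetical helper: returns the code of each character in +text+.
# For plain ASCII text these are the ASCII codes from the chart.
def ascii_codes(text)
  text.each_char.map(&:ord)
end

puts ascii_codes("Hello world").join(" ")
# prints: 72 101 108 108 111 32 119 111 114 108 100
```

The result lines up with the chart shown earlier, including 32 for the space.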
To get the ASCII value for a space, we need to use its escape sequence. In fact, we can use any escape sequence with the ? operator.
    puts ?\s
    puts ?\t
    puts ?\b
    puts ?\a
    32
    9
    8
    7
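On Ruby 1.9 and later, the same codes can be obtained by calling ord on a string containing the escape sequence. This is only a sketch under that assumption; escape_codes is a hypothetical helper name used for illustration.

```ruby
# Hypothetical helper: maps the written form of each escape sequence
# to its ASCII code. "\\s" is the two visible characters \ and s,
# while "\s" is the actual space character.
def escape_codes
  {
    "\\s" => "\s".ord, # space
    "\\t" => "\t".ord, # horizontal tab
    "\\b" => "\b".ord, # backspace
    "\\a" => "\a".ord  # bell
  }
end

escape_codes.each { |name, code| puts "#{name} => #{code}" }
# prints:
# \s => 32
# \t => 9
# \b => 8
# \a => 7
```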
You may not realize it, but so far you've been running your Ruby programs inside a program called a terminal emulator, such as the Microsoft Windows console, the Mac OS X Terminal application, a telnet client, or X Window System programs such as rxvt and xterm.
When your Ruby program prints out the letter "H", it sends the ASCII code for "H" (72) to the terminal emulator, which then draws an "H". When your Ruby program prints out a bell character, it sends a different ASCII code – ASCII code 7 – to the terminal emulator. In this case, the terminal emulator does not draw a symbol, but instead will typically beep or flash briefly. How each of the codes gets interpreted is largely determined by the ASCII standard.
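Going the other way, from a code back to a character, can be sketched with Ruby's Integer#chr. The helper name char_for is hypothetical and exists only to make the example easy to read.

```ruby
# Hypothetical helper: returns the one-character string for an ASCII code.
def char_for(code)
  code.chr
end

puts char_for(72)          # prints: H
puts char_for(7) == "\a"   # prints: true (code 7 is the bell character)
```

Printing char_for(7) itself would send the bell code to the terminal emulator, which decides whether to beep, flash, or do nothing.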
Other character encodings
The ASCII standard is a type of character encoding. As mentioned above, ASCII uses only the numbers 0 through 127 to define characters. There are many more characters than that in the world. Other character encoding systems – such as Latin-1, Shift_JIS, and the Unicode Transformation Format (UTF) – have been created to represent a wider variety of characters, including those found in languages such as Arabic, Hebrew, Chinese, and Japanese.
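To see how encodings differ, the sketch below stores the same text in two of them and prints the resulting numbers. It assumes Ruby 1.9 or later, which added String#encode; bytes_in is a hypothetical helper name, and "café" is just an illustrative word containing one non-ASCII character.

```ruby
# Hypothetical helper: the numbers (bytes) used to store +text+
# in the encoding called +encoding_name+.
def bytes_in(text, encoding_name)
  text.encode(encoding_name).bytes.to_a
end

word = "caf\u00E9" # "café"; the final letter is outside ASCII

p bytes_in(word, "ISO-8859-1") # Latin-1: [99, 97, 102, 233]
p bytes_in(word, "UTF-8")      # [99, 97, 102, 195, 169]
p bytes_in("H", "UTF-8")       # [72]
```

The ASCII letters keep their ASCII codes in both encodings, while the accented letter takes one number in Latin-1 and two in UTF-8.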