Ruby Programming/Reference/Objects/Encoding

Encoding is a concept introduced in Ruby 1.9.

Strings now have both a sequence of bytes and an encoding associated with them.
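
To make the bytes/encoding split concrete, here is a minimal sketch. The helper name describe_string is hypothetical and exists only for illustration; the String methods it calls (length, bytesize, encoding, force_encoding) are standard in 1.9 and later.

```ruby
# Hypothetical helper: reports how one string looks when counted as
# characters, as bytes, and when relabelled as raw binary.
def describe_string(s)
  # force_encoding changes only the encoding label, never the bytes.
  raw = s.dup.force_encoding(Encoding::BINARY)
  {
    chars:      s.length,        # counted according to the string's encoding
    bytes:      s.bytesize,      # counted as raw bytes
    encoding:   s.encoding.name,
    raw_chars:  raw.length,      # in BINARY, every byte is one "character"
    same_bytes: raw.bytes == s.bytes
  }
end

# "\u00E9" is an e with an acute accent; the \u escape makes the literal UTF-8.
p describe_string("caf\u00E9")
```

The same five bytes count as four characters under UTF-8 and as five under BINARY, which is roughly how 1.8 treated every string.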

In 1.8, all strings were just bytes, treated much like the BINARY/ASCII-8BIT encoding in 1.9. You had to use helper libraries for any m17n (multilingualization) support.

By default, when you open a file and read from it, the strings you get back have their encoding set to Encoding.default_external (which you can change).
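
As a rough illustration, the sketch below reads the same file twice: once with the default and once with an explicit encoding. The helper name read_with_encodings and the temp-file prefix are made up for this example, and it assumes Encoding.default_internal is left at its default of nil, so no transcoding happens on read.

```ruby
require "tempfile"

# Hypothetical helper: writes the given bytes to a temporary file, then
# returns the encodings of a default read and of an explicit read.
def read_with_encodings(bytes)
  Tempfile.create("encoding_demo") do |f|
    f.binmode
    f.write(bytes)
    f.flush
    default_read  = File.read(f.path)                          # tagged with Encoding.default_external
    explicit_read = File.read(f.path, encoding: "ISO-8859-1")  # tagged as requested
    [default_read.encoding, explicit_read.encoding]
  end
end

p Encoding.default_external
p read_with_encodings("abc")
```

Assigning to Encoding.default_external changes the default for later reads; passing encoding: per call, as above, avoids touching global state.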

This also means that strings are checked, to some degree, for correctness against that encoding on their way in.
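
How much checking happens during a read may depend on the Ruby version and the file mode, so treat the following as a sketch of the explicit check rather than of what the reader does internally. String#valid_encoding? is a standard method; the helper name check_bytes is hypothetical.

```ruby
# Hypothetical helper: labels the bytes with an encoding and asks Ruby
# whether they form a valid sequence in that encoding.
def check_bytes(bytes, encoding)
  bytes.dup.force_encoding(encoding).valid_encoding?
end

p check_bytes("caf\xC3\xA9", "UTF-8")       # 0xC3 0xA9 is a complete UTF-8 sequence
p check_bytes("caf\xE9",     "UTF-8")       # a lone 0xE9 is not valid UTF-8
p check_bytes("caf\xE9",     "ISO-8859-1")  # every byte is valid in ISO-8859-1
```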

On Windows, reading also has to convert "\r\n" to "\n". When you read a file in non-binary mode on Windows, Ruby first analyzes the incoming string for correctness and then, in a second pass, converts its line endings, so it is a bit slower.
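
The line-ending step can be sketched on an in-memory string with String#encode and its universal_newline option. This only illustrates the conversion itself, not the IO internals or their cost; the helper name normalize_newlines is hypothetical.

```ruby
# Hypothetical helper: converts "\r\n" (and a bare "\r") to "\n" while
# keeping the string's encoding unchanged.
def normalize_newlines(s)
  s.encode(s.encoding, universal_newline: true)
end

p normalize_newlines("one\r\ntwo\r\n")
```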

Ruby 1.9.2 is recommended for Windows users loading large files, as it isn't quite as slow. Alternatively, read them as binary (File.binread or File.open('name', 'rb')).
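
A minimal sketch of the two binary-read options follows. The helper name read_binary_both_ways and the temp-file prefix are assumptions made for the example; File.binread and the "rb" open mode are standard.

```ruby
require "tempfile"

# Hypothetical helper: writes bytes to a temporary file and reads them
# back with both binary approaches.
def read_binary_both_ways(bytes)
  Tempfile.create("binread_demo") do |f|
    f.binmode
    f.write(bytes)
    f.flush
    via_binread = File.binread(f.path)                     # whole file, ASCII-8BIT
    via_open    = File.open(f.path, "rb") { |io| io.read } # same result via an IO object
    [via_binread, via_open]
  end
end

a, b = read_binary_both_ways("one\r\ntwo\r\n")
p a.encoding
p a.bytes == b.bytes
```

In binary mode the "\r\n" pairs come back untouched and the strings are tagged ASCII-8BIT, so call force_encoding afterwards if you know what the bytes really are.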

Here is one good tutorial. Here is another. [1] is another.