Jump to content

Non-Programmer's Tutorial for Python 2.6/Revenge of the Strings

From Wikibooks, open books for an open world

And now presenting a cool trick that can be done with strings:

def shout(string):
    for character in string:
        print "Gimme a " + character
        print "'" + character + "'"
shout("Lose")

def middle(string):
    print "The middle character is:", string[len(string) / 2]

middle("abcdefg")
middle("The Python Programming Language")
middle("Atlanta")

And the output is:

Gimme a L
'L'
Gimme a o
'o'
Gimme a s
's'
Gimme a e
'e'
The middle character is: d
The middle character is: r
The middle character is: a

What these programs demonstrate is that strings are similar to lists in several ways. The shout() function shows that for loops can be used with strings just as they can be used with lists. The middle procedure shows that that strings can also use the len() function and array indexes and slices. Most list features work on strings as well.

The next feature demonstrates some string specific features:

def to_upper(string):
    ## Converts a string to upper case
    upper_case = ""
    for character in string:
        if 'a' <= character <= 'z':
            location = ord(character) - ord('a')
            new_ascii = location + ord('A')
            character = chr(new_ascii)
        upper_case = upper_case + character
    return upper_case

print to_upper("This is Text")

with the output being:

THIS IS TEXT

This works because the computer represents the characters of a string as numbers from 0 to 255. Python has a function called ord() (short for ordinal) that returns a character as a number. There is also a corresponding function called chr() that converts a number into a character. With this in mind the program should start to be clear. The first detail is the line: if 'a' <= character <= 'z': which checks to see if a letter is lower case. If it is then the next lines are used. First it is converted into a location so that a = 0, b = 1, c = 2 and so on with the line: location = ord(character) - ord('a'). Next the new value is found with new_ascii = location + ord('A'). This value is converted back to a character that is now upper case.

Now for some interactive typing exercise:

>>> # Integer to String
>>> 2
2
>>> repr(2)
'2'
>>> -123
-123
>>> repr(-123)
'-123'
>>> `123`
'123'
>>> # String to Integer
>>> "23"
'23'
>>> int("23")
23
>>> "23" * 2
'2323'
>>> int("23") * 2
46
>>> # Float to String
>>> 1.23
1.23
>>> repr(1.23)
'1.23'
>>> # Float to Integer
>>> 1.23
1.23
>>> int(1.23)
1
>>> int(-1.23)
-1
>>> # String to Float
>>> float("1.23")
1.23
>>> "1.23" 
'1.23'
>>> float("123")
123.0
>>> `float("1.23")`
'1.23'

If you haven't guessed already the function repr() can convert a integer to a string and the function int() can convert a string to an integer. The function float() can convert a string to a float. The repr() function returns a printable representation of something. `...` converts almost everything into a string, too. Here are some examples of this:

>>> repr(1)
'1'
>>> repr(234.14)
'234.14'
>>> repr([4, 42, 10])
'[4, 42, 10]'
>>> `[4, 42, 10]`
'[4, 42, 10]'

The int() function tries to convert a string (or a float) into a integer. There is also a similar function called float() that will convert a integer or a string into a float. Another function that Python has is the eval() function. The eval() function takes a string and returns data of the type that python thinks it found. For example:

>>> v = eval('123')
>>> print v, type(v)
123 <type 'int'>
>>> v = eval('645.123')
>>> print v, type(v)
645.123 <type 'float'>
>>> v = eval('[1, 2, 3]')
>>> print v, type(v)
[1, 2, 3] <type 'list'>

If you use the eval() function you should check that it returns the type that you expect.

One useful string function is the split() method. Here's an example:

>>> "This is a bunch of words".split()
['This', 'is', 'a', 'bunch', 'of', 'words']
>>> text = "First batch, second batch, third, fourth"
>>> text.split(",")
['First batch', ' second batch', ' third', ' fourth']

Notice how split() converts a string into a list of strings. The string is split by whitespace by default or by the optional argument (in this case a comma). You can also add another argument that tells split() how many times the separator will be used to split the text. For example:

>>> list = text.split(",")
>>> len(list)
4
>>> list[-1]
' fourth'
>>> list = text.split(",", 2)
>>> len(list)
3
>>> list[-1]
' third, fourth'

Slicing strings (and lists)

[edit | edit source]

Strings can be cut into pieces — in the same way as it was shown for lists in the previous chapter — by using the slicing "operator" [:]. The slicing operator works in the same way as before: text[first_index:last_index] (in very rare cases there can be another colon and a third argument, as in the example shown below).

In order not to get confused by the index numbers, it is easiest to see them as clipping places, possibilities to cut a string into parts. Here is an example, which shows the clipping places (in yellow) and their index numbers (red and blue) for a simple text string:

0 1 2 ... -2 -1
text = " S T R I N G "
[: :]

Note that the red indexes are counted from the beginning of the string and the blue ones from the end of the string backwards. (Note that there is no blue -0, which could seem to be logical at the end of the string. Because -0 == 0, (-0 means "beginning of the string" as well.) Now we are ready to use the indexes for slicing operations:

text[1:4] "TRI"
text[:5] "STRIN"
text[:-1] "STRIN"
text[-4:] "RING"
text[2] "R"
text[:] "STRING"
text[::-1] "GNIRTS"

text[1:4] gives us all of the text string between clipping places 1 and 4, "TRI". If you omit one of the [first_index:last_index] arguments, you get the beginning or end of the string as default: text[:5] gives "STRIN". For both first_index and last_index we can use both the red and the blue numbering schema: text[:-1] gives the same as text[:5], because the index -1 is at the same place as 5 in this case. If we do not use an argument containing a colon, the number is treated in a different way: text[2] gives us one character following the second clipping point, "R". The special slicing operation text[:] means "from the beginning to the end" and produces a copy of the entire string (or list, as shown in the previous chapter).

Last but not least, the slicing operation can have a second colon and a third argument, which is interpreted as the "step size": text[::-1] is text from beginning to the end, with a step size of -1. -1 means "every character, but in the other direction". "STRING" backwards is "GNIRTS" (test a step length of 2, if you have not got the point here).

All these slicing operations work with lists as well. In that sense strings are just a special case of lists, where the list elements are single characters. Just remember the concept of clipping places, and the indexes for slicing things will get a lot less confusing.

Examples

[edit | edit source]
# This program requires an excellent understanding of decimal numbers
def to_string(in_int):
    """Converts an integer to a string"""
    out_str = ""
    prefix = ""
    if in_int < 0:
        prefix = "-"
        in_int = -in_int
    while in_int / 10 != 0:
        out_str = chr(ord('0') + in_int % 10) + out_str
        in_int = in_int / 10
    out_str = chr(ord('0') + in_int % 10) + out_str
    return prefix + out_str

def to_int(in_str):
    """Converts a string to an integer"""
    out_num = 0
    if in_str[0] == "-":
        multiplier = -1
        in_str = in_str[1:]
    else:
        multiplier = 1
    for x in range(0, len(in_str)):
        out_num = out_num * 10 + ord(in_str[x]) - ord('0')
    return out_num * multiplier

print to_string(2)
print to_string(23445)
print to_string(-23445)
print to_int("14234")
print to_int("12345")
print to_int("-3512")

The output is:

2
23445
-23445
14234
12345
-3512