Python Programming/Idioms

From Wikibooks, open books for an open world

Python is a strongly idiomatic language: there is generally a single preferred way of doing something (a programming idiom), rather than many. “There’s more than one way to do it” is not a Python motto.

This section starts with some general principles, then goes through the language, highlighting how to idiomatically use operations, data types, and modules in the standard library.

Principles

Use exceptions for error-checking, following EAFP (It's Easier to Ask Forgiveness than Permission) instead of LBYL (Look Before You Leap): put an action that may fail inside a try...except block.
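
A minimal sketch of the EAFP style, assuming a hypothetical helper read_port and a plain dictionary standing in for a configuration (both are illustrative names, not part of any library):

```python
def read_port(config, default=8080):
    """Return config['port'] as an int, falling back to default.

    EAFP: attempt the lookup and the conversion, then handle failure,
    instead of checking for the key and the value format beforehand.
    """
    try:
        return int(config['port'])
    except (KeyError, ValueError):
        # KeyError: no 'port' entry; ValueError: the entry is not a number.
        return default


print(read_port({'port': '9000'}))   # 9000
print(read_port({}))                 # 8080
print(read_port({'port': 'http'}))   # 8080
```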

Use context managers for managing resources, like files. Use finally for ad hoc cleanup, but prefer to write a context manager to encapsulate this.
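
As an illustration, the sketch below wraps a try...finally cleanup in a context manager using contextlib.contextmanager. The helper name temporary_file is hypothetical and chosen only for this example:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_file(text):
    """Hypothetical helper: yield the path of a temp file containing text."""
    fd, path = tempfile.mkstemp(text=True)
    try:
        with os.fdopen(fd, 'w') as f:   # the file is closed when this block exits
            f.write(text)
        yield path
    finally:
        os.remove(path)                 # cleanup runs even if the body raises


with temporary_file('hello') as path:
    with open(path) as f:
        content = f.read()

print(content)               # hello
print(os.path.exists(path))  # False
```

Callers then get the cleanup for free by using with, rather than repeating the finally clause at every call site.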

Use properties, not getter/setter pairs.
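
A short sketch of what this can look like, assuming a hypothetical Temperature class: callers use plain attribute syntax, while validation and computed values live behind property.

```python
class Temperature:
    """Hypothetical class storing a temperature in degrees Celsius."""

    def __init__(self, celsius):
        self.celsius = celsius          # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError('below absolute zero')
        self._celsius = value

    @property
    def fahrenheit(self):               # read-only, computed on access
        return self._celsius * 9 / 5 + 32


t = Temperature(100)
t.celsius = 25                          # plain attribute syntax, still validated
print(t.fahrenheit)                     # 77.0
```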

Use dictionaries for dynamic records, classes for static records (for simple classes, use collections.namedtuple): if a record always has the same fields, make this explicit in a class; if the fields may vary (be present or not), use a dictionary.
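
For example, assuming a hypothetical Point record with fixed fields and a hypothetical user record whose fields vary:

```python
import collections

# Static record: every point has exactly these fields.
Point = collections.namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)
print(p.x + p.y)                 # 3

# Dynamic record: fields may be present or absent.
user = {'name': 'Ada'}
user['email'] = 'ada@example.org'
print('phone' in user)           # False
```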

Use _ for throwaway variables, like discarding a return value when a tuple is returned, or to indicate that a parameter is being ignored (when required for an interface, say). You can use *_, **__ to discard positional or keyword arguments passed to a function: these correspond to the usual *args, **kwargs parameters, but explicitly discarded. You can also use these in addition to positional or named parameters (following the ones you use), allowing you to use some and discard any excess ones.
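
Both uses can be sketched as follows; on_event is a hypothetical callback invented for this example:

```python
def on_event(event, *_, **__):
    """Hypothetical callback: uses event, discards any extra arguments."""
    return event.upper()


# Discard the second element of a returned tuple.
quotient, _ = divmod(17, 5)

print(quotient)                                  # 3
print(on_event('click', 10, 20, source='ui'))    # CLICK
```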

Use implicit True/False (truthy/falsy values), except when needing to distinguish between falsy values, like None, 0, and [], in which case use an explicit check like is None or == 0.
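
A minimal sketch, assuming a hypothetical function describe that must tell "no list given" apart from "empty list":

```python
def describe(items=None):
    """Hypothetical function distinguishing 'not given' from 'empty'."""
    if items is None:          # explicit check: None and [] mean different things
        return 'no list given'
    if not items:              # implicit check: any empty container is falsy
        return 'empty list'
    return '{} item(s)'.format(len(items))


print(describe())          # no list given
print(describe([]))        # empty list
print(describe([1, 2]))    # 2 item(s)
```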

Use the optional else clause after try, for, and while, not just if.
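
As an illustration of else on a loop, the following sketch uses a hypothetical helper first_negative; the else branch runs only when the loop ends without break:

```python
def first_negative(numbers):
    """Hypothetical helper showing for...else."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only if the loop finished without hitting break.
        result = None
    return result


print(first_negative([3, -1, 4]))   # -1
print(first_negative([3, 1, 4]))    # None
```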

Imports

For readable and robust code, import only modules, not names (like functions or classes): importing a name creates a new binding, which is not necessarily kept in sync with the existing binding.[1] For example, given a module m which defines a function f, importing the function with from m import f means that m.f and f can differ if either is later assigned to (creating a new binding).

In practice, this rule is frequently ignored, particularly in small-scale code: changing a module after import is rare, so the problem seldom arises, and both classes and functions are commonly imported from modules so that they can be referred to without a prefix. For robust, large-scale code, however, the rule is important, as ignoring it risks creating very subtle bugs.
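
The divergence between m.f and f can be reproduced in a few lines. To keep the sketch self-contained, it builds a throwaway module named m in memory with types.ModuleType and registers it in sys.modules; this setup is an assumption made for illustration, and real code would import an ordinary module file.

```python
import sys
import types

# Create a stand-in module "m" with a function f (illustrative setup only).
m = types.ModuleType('m')
m.f = lambda: 'original'
sys.modules['m'] = m

from m import f          # binds the name f to the current value of m.f

m.f = lambda: 'patched'  # rebinding the module attribute later...

print(m.f())             # patched
print(f())               # original: the imported name still has the old binding
```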

For robust code with less typing, one can use a renaming import to abbreviate a long module name:

import module_with_very_long_name as vl
vl.f()  # easier than module_with_very_long_name.f, but still robust

Note that importing submodules (or subpackages) from a package using from is completely fine:

from p import sm  # completely fine
sm.f()

Operations

Swap values

b, a = a, b

Attribute access on nullable value

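To access an attribute (especially to call a method) on a value that might be an object or might be None, use the short-circuiting behavior of and (the expression evaluates to a itself whenever a is falsy, so this suits values that are either None or a truthy object):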

a and a.x
a and a.f()

Particularly useful for regex matches:

match and match.group(0)

in

in can be used for substring checking.

Data types

All sequence types

Indexing during iteration

Use enumerate() if you need to keep track of the index while iterating over an iterable:

for i, x in enumerate(l):
    ...  # use i and x here

Anti-idiom:

for i in range(len(l)):
    x = l[i]  # why did you go from list to numbers back to the list?
    # ...

Finding first matching element

Python sequences do have an index method, but this returns the index of the first occurrence of a specific value in the sequence. To find the first occurrence of a value that satisfies a condition, instead, use next and a generator expression:

try:
    x = next(i for i, n in enumerate(l) if n > 0)
except StopIteration:
    print('No positive numbers')
else:
    print('The index of the first positive number is', x)

If you need the value, not the index of its occurrence, you can get it directly through:

try:
    x = next(n for n in l if n > 0)
except StopIteration:
    print('No positive numbers')
else:
    print('The first positive number is', x)

The reason for this construct is twofold:

  • Exceptions let you signal “no match found” (they solve the semipredicate problem): since you're returning the matching value itself (not an index), no value can be reserved to mean “no match”.
  • Generator expressions let you use an expression without needing a lambda or introducing new grammar.
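
As an alternative not covered above, the built-in next also accepts a default as its second argument, which is returned instead of raising StopIteration. This is only safe when the default can never be a real result; the sketch below uses None, which works here because a valid index is never None (the sample lists are made up for illustration):

```python
l = [-2, 0, 5, 7]

# Index of the first positive number, or None if there is none.
x = next((i for i, n in enumerate(l) if n > 0), None)
print(x)   # 2

y = next((i for i, n in enumerate([-2, 0]) if n > 0), None)
print(y)   # None
```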

Truncating

For mutable sequences, use del, instead of reassigning to a slice:

del l[j:]
del l[:i]

Anti-idiom:

l = l[:j]
l = l[i:]

The simplest reason is that del makes your intention clear: you're truncating.

More subtly, slicing builds a new list that holds the selected elements, and the assignment merely rebinds the variable to it. The original list is left untouched, so any other reference to it still sees the untruncated data, and its elements can only be freed once nothing refers to the original list any more. Deleting instead modifies the list in place, so no copy is made, every reference sees the truncation, and Python can release the deleted elements right away (provided nothing else refers to them).
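
A minimal sketch of how the two forms differ when a second name (here the hypothetical alias) refers to the same list:

```python
l = [0, 1, 2, 3, 4]
alias = l            # a second reference to the same list object

l = l[:2]            # builds a new list and rebinds l only
print(l)             # [0, 1]
print(alias)         # [0, 1, 2, 3, 4]

l = alias            # point l back at the original list
del l[2:]            # truncates the shared list in place
print(alias)         # [0, 1]
print(l is alias)    # True
```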

In some cases you do want two slices of the same list, though this is rare in basic programming other than iterating once over a slice in a for loop. It is rarer still to keep the whole list under another name and then rebind the original variable to a slice of it, as in the following funny-looking code:

m = l
l = l[i:j]  # why not m = l[i:j] ?

Sorted list from an iterable

You can create a sorted list directly from any iterable, without needing to first make a list and then sort it. This includes sets and dictionaries (which iterate over their keys):

s = {3, 1, 2}  # elements must be mutually comparable to be sorted
l = sorted(s)  # [1, 2, 3]
d = {'b': 2, 'a': 1}
l = sorted(d)  # ['a', 'b'] (the keys)

Tuples

Use tuples for constant sequences. This is rarely necessary (primarily when using them as keys in a dictionary), but makes the intention clear.

Strings

Substring

Use in for substring checking.

However, do not use in to check if a string is a single-character match, since it matches substrings and will return spurious matches – instead use a tuple of valid values. For example, the following is wrong:

def valid_sign(sign):
    return sign in '+-'  # wrong: also True for sign == '+-' and sign == ''

Instead, use a tuple:

def valid_sign(sign):
    return sign in ('+', '-')

Building a string

To make a long string incrementally, build a list and then join it with '' – or with newlines, if building a text file (don't forget the final newline in this case!). This is faster and clearer than appending to a string, which is often slow. (In principle, repeated appending can take O(nk) time, where n is the overall length of the string and k is the number of additions, which is O(n^2) if the pieces are of similar sizes.)

However, some versions of CPython include optimizations that make simple string appending fast: string appending in CPython 2.5+ and bytestring appending in CPython 3.0+ are fast, but for building Unicode strings (unicode in Python 2, str in Python 3), joining is faster. If doing extensive string manipulation, be aware of this and profile your code. See Performance Tips: String Concatenation and Concatenation Test Code for details.

Don't do this:

s = ''
for x in l:
    # this makes a new string every iteration, because strings are immutable
    s += x

Instead:

# ...
# l.append(x)
s = ''.join(l)

You can even pass a generator expression directly to join, without building an intermediate list yourself:

s = ''.join(f(x) for x in l)

If you do want a mutable string-like object, you can use io.StringIO.
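
A short sketch of building text with an in-memory buffer, assuming Python 3, where the class lives in the io module (the sample words are made up):

```python
import io

buf = io.StringIO()
for word in ['spam', 'eggs']:
    buf.write(word)        # appends to the in-memory text buffer
    buf.write('\n')
s = buf.getvalue()         # the accumulated text as an ordinary string
print(repr(s))             # 'spam\neggs\n'
```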

Dictionaries

To iterate through a dictionary by keys, values, or both:

# Iterate over keys
for k in d:
    ...

# Iterate over values, Python 3
for v in d.values():
    ...

# Iterate over values, Python 2
# In Python 2, dict.values() returns a copy
for v in d.itervalues():
    ...

# Iterate over keys and values, Python 3
for k, v in d.items():
    ...

# Iterate over keys and values, Python 2
# In Python 2, dict.items() returns a copy
for k, v in d.iteritems():
    ...

Anti-patterns:

for k, _ in d.items():  # instead: for k in d:
    ...
for _, v in d.items():  # instead: for v in d.values()
    ...

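To supply a value for a missing key, dict.setdefault(key, default) returns the value stored under key, inserting default first if the key is absent. When every missing key should get the same kind of default, it is usually better to use collections.defaultdict.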
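
For setdefault and collections.defaultdict, a minimal sketch that groups words by first letter both ways (the word list and variable names are made up for illustration):

```python
import collections

words = ['apple', 'avocado', 'banana']

# setdefault: insert an empty list the first time a key is seen.
by_letter = {}
for w in words:
    by_letter.setdefault(w[0], []).append(w)

# defaultdict: the factory (list) supplies the missing value automatically.
grouped = collections.defaultdict(list)
for w in words:
    grouped[w[0]].append(w)

print(by_letter)                    # {'a': ['apple', 'avocado'], 'b': ['banana']}
print(dict(grouped) == by_letter)   # True
```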

dict.get is useful, but using dict.get and then checking if it is None as a way of testing if the key is in the dictionary is an anti-idiom, as None is a potential value, and whether the key is in the dictionary can be checked directly. It's ok to use get and compare with None if this is not a potential value, however.

Simple:

if 'k' in d:
    ...  # use d['k']

Anti-idiom (unless None is not a potential value):

v = d.get('k')
if v is not None:
    ...  # use v

Dict from parallel sequences of keys and values

Use zip as: dict(zip(keys, values))
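
For example, with two made-up parallel lists. Note that zip stops at the shorter input, so a length mismatch silently drops the surplus items:

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]
d = dict(zip(keys, values))
print(d)                         # {'a': 1, 'b': 2, 'c': 3}

# zip stops at the shorter input, so the surplus key 'c' is dropped.
print(dict(zip(keys, [1, 2])))   # {'a': 1, 'b': 2}
```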

Modules

re

Match if found, else None:

match = re.match(r, s)
return match and match.group(0)

...returns None if no match, and the match contents if there is one.
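
A runnable version of the same idiom, wrapped in a hypothetical function leading_digits with an illustrative pattern (re.match only matches at the start of the string):

```python
import re


def leading_digits(s):
    """Hypothetical helper: the digits at the start of s, or None."""
    match = re.match(r'\d+', s)
    return match and match.group(0)


print(leading_digits('42 apples'))   # 42
print(leading_digits('apples'))      # None
```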

References

Further reading