Friday, March 15, 2013

Python Idioms

Source:
http://courses.cms.caltech.edu/cs11/material/python/misc/python_idioms.html

Every computer language has "idioms", that is, typical ways of accomplishing given tasks. Python is no exception. Some of the idioms are not that well known, so we thought we'd collect them here. We're also adding some material on other interesting features of the python language that you might miss when reading an introductory tutorial. These items are roughly in order of their difficulty and how commonly they're used.
WARNING! Some of this material is probably outdated. See the latest python documentation for, well, the latest python documentation.

See the documentation about...

These language features aren't really idioms, but you should know they exist:
  1. long integers
  2. optional arguments to functions
  3. keyword arguments to functions
  4. getattr and __getattr__ for classes
  5. operator overloading
  6. multiple inheritance
  7. docstrings
  8. the regular expression (re) library

Iterating through an array

Python's for statement is not like C's; it's more like a "foreach" statement in some other languages. If you need to use the loop index explicitly, the standard way is like this:
    array = [1, 2, 3, 4, 5]  # or whatever

    for i in range(len(array)):
        # Do something with 'i'.
This is quite clumsy. A somewhat cleaner way to do this is:
    array = [1, 2, 3, 4, 5]  # or whatever

    for i, e in enumerate(array):
        # Do something with index 'i' and its corresponding element 'e'.
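As a small illustration of the enumerate idiom, here is a minimal sketch. The list contents and variable names are made up for the example, and it is written so that it also runs under python 3. It also shows the optional start argument, which enumerate has accepted since python 2.6.

```python
# Hypothetical data, just for illustration.
names = ["alice", "bob", "carol"]

# enumerate yields (index, element) pairs, so no range(len(...)) is needed.
labelled = []
for i, name in enumerate(names):
    labelled.append("%d: %s" % (i, name))

# The optional 'start' argument changes the first index (here, count from 1).
numbered = list(enumerate(names, start=1))

print(labelled)
print(numbered)
```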

Breaking out of an infinite loop

Python has no "do/while" loop like C does; it only has a while loop and a for loop. Sometimes you don't know in advance when the loop is going to be finished or you need to break out of the interior of a loop; the classic example is when iterating through the lines in a file. The standard idiom is this:
    file = open("some_filename", "r")

    while True:   # infinite loop
        line = file.readline()
        if not line:  # 'readline()' returns an empty string at end of file.
            break

        # Process the line.
This is admittedly clumsy, but it's still pretty standard. For files there is a nicer way:
    file = open("some_filename", "r")

    for line in file:
        # Process the line.
Note that python also has a continue statement (like in C) to jump to the next iteration of the current loop. Note also that in python 2 the file() built-in function does the same thing as open(), but open() is the preferred way to open a file, and file() no longer exists in python 3.
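A sketch of the file-iteration idiom combined with continue, assuming python 2.6 or later (or python 3), where the with statement closes the file for you. The temporary file and its contents are invented so that the example is self-contained.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("first\n\nsecond\nthird\n")

# The 'with' statement closes the file automatically, even on errors.
lines = []
with open(path, "r") as f:
    for line in f:
        if not line.strip():
            continue          # skip blank lines and go to the next iteration
        lines.append(line.rstrip("\n"))

os.remove(path)
print(lines)
```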

Sequence multiplication

In python, lists and strings are both examples of sequences and many of the same operations (like len) work similarly for both of them. One non-obvious idiom is sequence multiplication; what this means is that to get a list of 100 zeroes, you can do this:
    zeroes = [0] * 100
Similarly, to get a string containing 100 spaces, you can do this:
    spaces = 100 * " "
This is often convenient.
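One thing to watch out for: sequence multiplication repeats references to the same elements rather than copying them. That is harmless for numbers and strings, but it can surprise you with nested lists. A minimal sketch (the variable names are ours):

```python
# Multiplying a list repeats references to the same elements.
row = [0] * 3            # fine: integers are immutable
grid_shared = [row] * 2  # both entries are the SAME list object
grid_shared[0][0] = 1    # so this change shows up in both rows

# Building each row separately avoids the sharing.
grid_separate = [[0] * 3 for _ in range(2)]
grid_separate[0][0] = 1

print(grid_shared)    # [[1, 0, 0], [1, 0, 0]]
print(grid_separate)  # [[1, 0, 0], [0, 0, 0]]
```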

xrange

Sometimes you want to generate a long list of numbers but you don't want to have to store all of them in memory at once. For instance, you might want to iterate from 0 to 1,000,000,000 but you don't want to store one billion integers in memory at once. Therefore, you don't want to use the range() built-in function. Instead, you can use the xrange function, which is a "lazy" version of range, meaning that it only generates the numbers on demand. So you could write:
    for i in xrange(1000000000):
        # do something with i...
and memory usage will be constant.
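In python 3 the xrange name no longer exists, and range itself behaves lazily. The sketch below assumes python 3 and shows that a huge range can be measured, indexed and partly iterated without ever building the full list.

```python
# In python 3, range() is lazy, like python 2's xrange().
big = range(1000000000)

# Length, indexing and membership are computed, not looked up in a list.
count = len(big)
last = big[-1]
has_last = 999999999 in big

# Iterate lazily and stop early; only one value exists at a time.
first_squares = []
for i in big:
    if i >= 5:
        break
    first_squares.append(i * i)

print(count, last, has_last, first_squares)
```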

"Print to" syntax

Since python 2.0, the "print" statement has accepted an extended ">>" form that sends its output to a file object of your choice:
    import sys

    print >> sys.stderr, "this is an error message"
The right-hand side of the ">>" operator is a file object. We personally consider this syntax to be a somewhat dubious addition to the language, but it's there, so you can use it if you want.
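For comparison, in python 3 print is an ordinary function and the target is passed with the file keyword argument. A minimal sketch, assuming python 3; the in-memory buffer is only there to show that any object with a write() method will do.

```python
import io
import sys

# Python 3 spelling of the "print to" idiom: pass the target as 'file'.
print("this is an error message", file=sys.stderr)

# Any object with a write() method works, e.g. an in-memory text buffer.
buffer = io.StringIO()
print("captured line", file=buffer)
captured = buffer.getvalue()
```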

Exception classes

Back in the Bad Old Days, exceptions in python were simply strings. However, representing exceptions as classes has many advantages. In particular, you can subclass exceptions and selectively catch either one particular exception or an exception together with all of its subclasses. As a rule, exception classes are not very complicated. A typical exception class might look like this:
    class MyException(Exception):  # derive from the built-in Exception class
        def __init__(self, value):
            self.value = value
        def __str__(self):
            return repr(self.value)  # same result as the old backtick syntax
and will be used like this:
    try:
        do_stuff()
        if something_bad_has_happened():
            raise MyException("something bad happened")
    except MyException as e:  # 'as' requires python 2.6 or later
        print "My exception occurred, value: ", e.value
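To make the point about subclassing concrete, here is a minimal sketch of a two-level exception hierarchy. The class and function names (AppError, ConfigError, classify, catch_base) are hypothetical, and the code is written so that it runs under python 2.6+ and python 3.

```python
class AppError(Exception):
    """Hypothetical base class for an application's errors."""

class ConfigError(AppError):
    """A more specific error; every ConfigError is also an AppError."""

def classify(exc):
    # Raise the given exception and report which handler caught it.
    try:
        raise exc
    except ConfigError:
        return "config handler"
    except AppError:
        return "general handler"

def catch_base(exc):
    # A handler for the base class also catches its subclasses.
    try:
        raise exc
    except AppError as e:
        return str(e)

print(classify(ConfigError("bad file")))
print(classify(AppError("oops")))
print(catch_base(ConfigError("bad file")))
```

The order of the except clauses matters: the more specific class has to come first, otherwise the handler for the base class would catch everything.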

List comprehensions

This is a fairly new addition to python, inspired by the functional programming language Haskell (which is a very cool language, by the way; you should check it out). The idea is this: sometimes you want to make a list of objects with some particular quality. For instance, you might want to make a list of the even integers between 0 and 20. Of course, you could do this:
    results = []
    for i in range(20):
        if i % 2 == 0:
            results.append(i)
and results would hold the list [0, 2, 4, 6, 8, 10, 12, 14, 16, 18] (20 is not included because range(20) goes from 0 to 19). But with list comprehensions, you can do the same thing much more concisely:
    results = [x for x in range(20) if x % 2 == 0]
Basically, the list comprehension is syntactic sugar for the explicit loop. You can also do more complex stuff like this:
    results = [(x, y)
               for x in range(10)
               for y in range(10)
               if x + y == 5
               if x > y]
and results will be set to [(3, 2), (4, 1), (5, 0)]. So you can put any combination of for and if clauses inside the square brackets, as long as the first one is a for (and maybe more; see the documentation for details). Using this, you can encode the quicksort algorithm very concisely as follows:
    def quicksort(lst):
        if len(lst) == 0:
            return []
        else:
            return quicksort([x for x in lst[1:] if x < lst[0]]) + [lst[0]] + \
                   quicksort([x for x in lst[1:] if x >= lst[0]])
Neat, huh? ;-)
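To see why a list comprehension is just syntactic sugar, it can help to unroll one by hand. The sketch below rewrites the two-variable (x, y) comprehension from earlier in this section as nested loops; each for and if clause becomes one level of nesting, in the same order. The names pairs_loop and pairs_comp are ours.

```python
# Explicit-loop equivalent of the (x, y) comprehension.
pairs_loop = []
for x in range(10):
    for y in range(10):
        if x + y == 5:
            if x > y:
                pairs_loop.append((x, y))

# The same thing as a comprehension, clause for clause.
pairs_comp = [(x, y)
              for x in range(10)
              for y in range(10)
              if x + y == 5
              if x > y]

print(pairs_loop == pairs_comp)
```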

Functional programming idioms

For some time now, python has possessed a few functions and features that are usually only found in functional programming languages like lisp or ML. These include:
  1. The map, reduce, and filter higher-order functions. The map function takes a function and one or more lists as arguments (usually just one) and applies the function to each element of the list, collecting the results into a new list. For instance, if you have a list of strings that represent integers (possibly from a command-line argument list) and want to convert them to a list of integers, you can do this:

        lst = ["1", "2", "3", "4", "5"]
        nums = map(int, lst)  # [1, 2, 3, 4, 5]
    
    You can use map with functions of two arguments as well if you provide two lists:

        def add(x, y):
            return x + y
    
        lst1 = [1, 2, 3, 4, 5]
        lst2 = [6, 7, 8, 9, 10]
        lst_sum = map(add, lst1, lst2)
    
        # lst_sum == [7, 9, 11, 13, 15]
    
    You can use reduce to reduce a list to a single value by applying a function to the first two elements, then applying the same function to the result of that call and the next element, and so on until all the elements have been processed. This is often a convenient way to do things like sum a list:
        lst = [1, 2, 3, 4, 5]
        sum_lst = reduce(add, lst)  # == 1 + 2 + 3 + 4 + 5 == 15
    
    where 'add' is as defined above. You can use filter to create a list which contains a subset of the elements of an input list. For example, to get all the odd integers between 0 and 100, you can do this:
        nums = range(0,101)  # [0, 1, ... 100]
    
        def is_odd(x):
            return x % 2 == 1
    
        odd_nums = filter(is_odd, nums)  # [1, 3, 5, ... 99]
    
  2. The lambda keyword. A lambda expression represents an anonymous function, i.e. a function with no name. If you look at the previous examples for map, reduce and filter, you'll see that they all use trivial one-line functions that are only used once. These can be more concisely expressed as lambda expressions:

        lst1 = [1, 2, 3, 4, 5]
        lst2 = [6, 7, 8, 9, 10]
        lst_elementwise_sum = map(lambda x, y: x + y, lst1, lst2)
        lst1_sum = reduce(lambda x, y: x + y, lst1)
        nums = range(101)
        odd_nums = filter(lambda x: x % 2 == 1, nums)
    
    Note that you can also use variables inside a lambda which were defined outside the lambda. This is called "lexical scoping" and was only introduced officially into the python language as of python 2.2. It works like this:

        a = 1
        add_a = lambda x: x + a
        b = add_a(10)  # b == 11
    
    The 'a' referred to in the lambda is the 'a' defined on the previous line. If this seems obvious, good! It turns out that getting this right has taken the python developers much longer than it should have.
    For more details on lambda, see any textbook on lisp or scheme.
  3. The apply function. Functions are objects in python; you can manipulate them just like you do numbers or strings (store them in variables, etc.). Sometimes you have a function value that you want to apply to an argument list which you have generated in the program; you can use the apply function for this (the extended call syntax f(*args), available since python 2.0, does the same thing, and apply has been deprecated since python 2.3):

        # Sorry about the long variable names ;-)
    
        args = function_returning_list_of_numbers()
        f    = function_returning_a_function_which_operates_on_a_list_of_numbers()
    
        # You want to do f(args[0], args[1], ...) but you don't know how many
        # arguments are in 'args'.  For this you can use 'apply':
    
        result = apply(f, args)
    
        # A trivial example:
        args = [1, 1]
        two = apply(lambda x, y: x + y, args)  # == 2
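The examples in this section are written for python 2. As a rough guide to how they translate, here is a sketch assuming python 3, where map and filter return lazy iterators instead of lists, reduce lives in the functools module, and apply is gone in favour of argument unpacking. Using operator.add in place of the hand-written add function is our choice, not something the text above requires.

```python
from functools import reduce  # reduce is no longer a built-in in python 3
import operator

lst1 = [1, 2, 3, 4, 5]
lst2 = [6, 7, 8, 9, 10]

# map and filter return lazy iterators in python 3; list() forces them.
nums = list(map(int, ["1", "2", "3"]))
lst_elementwise_sum = list(map(operator.add, lst1, lst2))
odd_nums = list(filter(lambda x: x % 2 == 1, range(101)))

# reduce folds the list from the left: ((((1 + 2) + 3) + 4) + 5).
lst1_sum = reduce(operator.add, lst1)

# Argument unpacking with '*' replaces apply(f, args).
args = [1, 1]
two = (lambda x, y: x + y)(*args)

print(nums, lst_elementwise_sum, lst1_sum, two)
```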
    

Generators and iterators

This is an advanced (but very cool) topic that we don't have the space to go into here. If you're curious, look it up in the python documentation.
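Just to give a taste, here is a minimal sketch of a generator; the countdown function is a made-up example. A function containing yield returns an iterator that produces its values one at a time, on demand, and raises StopIteration when it runs out.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, producing each value only when asked."""
    while n > 0:
        yield n
        n -= 1

# A generator is an iterator, so it works anywhere a for loop does...
collected = [value for value in countdown(3)]

# ...and it can be advanced by hand with next().
gen = countdown(2)
first = next(gen)
second = next(gen)
try:
    next(gen)
    exhausted = False
except StopIteration:   # raised once the generator has no more values
    exhausted = True

print(collected, first, second, exhausted)
```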

PEPs

The python community is very active; the newsgroup "comp.lang.python" is full of discussions of what features to add to the language. Occasionally somebody writes up a more detailed and formal suggestion of this sort as a "python enhancement proposal" or "PEP". These are archived in the PEP index on python.org. Note that not all proposed PEPs are accepted into the language. However, they give a good idea of what the top python programmers feel are promising future directions for the language.
