Python: Lists


The Python list object is the most general sequence provided by the language. Lists are positionally ordered collections of arbitrarily typed objects, and they have no fixed size. They are also mutable—unlike strings, lists can be modified in-place by assignment to offsets as well as a variety of list method calls.

Sequence Operations

Because they are sequences, lists support all the sequence operations we discussed for strings; the only difference is that the results are usually lists instead of strings. For instance, given a three-item list:

>>> l = [10,'hello',3.14]  # a list of three different-type objects
>>> l
[10, 'hello', 3.14]

>>> len(l)   # number of items in the list
3

>>> l[0]     # indexing by position
10

>>> l[:-1]   # slicing a list returns a new list
[10, 'hello']

>>> l + [1,2,3]  # concatenation makes a new list too
[10, 'hello', 3.14, 1, 2, 3]

>>> l    # we are not changing the original list
[10, 'hello', 3.14]
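
The other sequence operations that work on strings carry over as well. A minimal sketch, using an illustrative variable name (items) rather than anything from the session above:

```python
items = [10, 'hello', 3.14]

print(items * 2)         # repetition builds a new, longer list
print('hello' in items)  # membership test: True
print(items[-1])         # negative offsets count from the end: 3.14
print(items[1:])         # slice from offset 1 to the end: ['hello', 3.14]
print(items)             # the original is still [10, 'hello', 3.14]
```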


Type-Specific Operations
Python’s lists are related to arrays in other languages, but they tend to be more powerful. Unlike arrays in many languages, lists can hold objects of different types, and they have no fixed size: they can grow and shrink on demand, in response to list-specific operations.

The list append method expands the list’s size and inserts an item at the end; the
pop method (or an equivalent del statement) then removes an item at a given offset,
causing the list to shrink.

>>> l.append('magic')  # Growing: add the object at the end of list
>>> l
[10, 'hello', 3.14, 'magic']

>>> l.pop(2)  # Shrinking: delete an item in the middle
3.14

>>> l         # "del l[2]" deletes from a list too
[10, 'hello', 'magic']
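
The two shrinking operations differ in one respect that is easy to miss: pop hands back the item it removes, while del simply discards it. A small sketch, with illustrative variable names that are not part of the session above:

```python
items = [10, 'hello', 3.14, 'magic']

removed = items.pop(2)   # pop returns the item it removes
print(removed)           # 3.14
print(items)             # [10, 'hello', 'magic']

del items[0]             # del removes by offset and returns nothing
print(items)             # ['hello', 'magic']

last = items.pop()       # with no argument, pop removes the last item
print(last)              # magic
print(items)             # ['hello']
```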

Other list methods insert an item at an arbitrary position (insert) or remove a given item by value (remove).

>>> l.insert(0,'occult') # inserts at the given offset

>>> l
['occult', 10, 'hello', 'magic']

>>> l.remove(10)  # removes the first matching value; ValueError if absent

>>> l
['occult', 'hello', 'magic']
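
Two details of remove are worth seeing in action: it deletes only the first matching item, and it raises ValueError when the value is not in the list. A short sketch, assuming a hypothetical list named words:

```python
words = ['occult', 'hello', 'magic', 'hello']

words.remove('hello')    # only the first 'hello' is removed
print(words)             # ['occult', 'magic', 'hello']

# Option 1: test membership before removing
if 'spam' in words:
    words.remove('spam')

# Option 2: attempt the removal and handle the error
try:
    words.remove('spam')
except ValueError:
    print('spam is not in the list')
```

Either option works; the membership test scans the list once more, which rarely matters for short lists.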

Since lists are mutable, most list methods also change the list object in-place, instead of creating a new one:

>>> k = ['bb','aa','cc']

>>> k.sort()  # sorts the items in ascending order

>>> k         # the list has been modified
['aa', 'bb', 'cc']

>>> k.reverse() # reverses the position of items

>>> k           # the list has been modified
['cc', 'bb', 'aa']
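
Because sort works in-place, it returns None rather than the sorted list, which is a common source of bugs. When the original order has to be kept, the sorted built-in makes a new list instead. A brief sketch with illustrative names:

```python
names = ['bb', 'aa', 'cc']

ordered = sorted(names)     # sorted builds a new list
print(ordered)              # ['aa', 'bb', 'cc']
print(names)                # ['bb', 'aa', 'cc'], original untouched

result = names.sort()       # sort changes the list in-place
print(result)               # None, so do not write names = names.sort()
print(names)                # ['aa', 'bb', 'cc']

names.sort(reverse=True)    # descending order, also in-place
print(names)                # ['cc', 'bb', 'aa']
```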

And assigning a value at a specific index works as usual.

>>> l = ['occult','hello', 'magic']

>>> l[1] = 13 # updates the value at the index

>>> l
['occult', 13, 'magic']


Bounds Checking
Although lists have no fixed size, Python still doesn’t allow us to reference items that are not present. Indexing off the end of a list is always a mistake, as is assigning off the end:

>>> l[55]
... error text ...
IndexError: list index out of range

>>> l[55] = 1
... error text ...
IndexError: list assignment index out of range

This is intentional, as it’s usually an error to try to assign off the end of a list (and a particularly hard one to catch in C, which does no bounds checking on arrays). Rather than silently growing the list in response, Python reports an error. To grow a list, we call list methods such as append instead.
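
In a script, as opposed to the interactive prompt, an out-of-range offset can be caught or avoided. The following is only a sketch; the names values and offset are made up for illustration:

```python
values = ['occult', 13, 'magic']

try:
    values[55] = 1                  # assigning off the end fails
except IndexError as err:
    print('IndexError:', err)

values.append(1)                    # grow the list explicitly instead
print(values)                       # ['occult', 13, 'magic', 1]

# Check the length first when an offset may be out of range
offset = 55
item = values[offset] if 0 <= offset < len(values) else None
print(item)                         # None
```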

Nesting
One nice feature of Python’s core data types is that they support arbitrary nesting: we can nest them in any combination, and as deeply as we like (for example, we can have a list that contains a dictionary, which contains another list, and so on). One immediate application of this feature is to represent matrices, or “multidimensional arrays” in Python. A list with nested lists will do the job for basic applications:

>>> M = [[1,2,3],   # a 3x3 matrix, as nested lists
         [4,5,6],   # code can span lines if bracketed
         [7,8,9]]

>>> M
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Here, we’ve coded a list that contains three other lists. The effect is to represent a 3 × 3 matrix of numbers. Such a structure can be accessed in a variety of ways:

>>> M[1]     # get row-2
[4, 5, 6]
>>> M[1][2]  # get row-2, then fetch the third item within the row
6

Stringing together index operations takes us deeper and deeper into our nested-object structure.
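
Chained indexes work on the left side of an assignment too, and nested loops walk the same structure item by item. A minimal sketch; the records list at the end is hypothetical data, included only to show a list that holds a dictionary that holds another list:

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

M[1][2] = 60             # index twice to reach a nested item, then assign
print(M[1])              # [4, 5, 60]

total = 0
for row in M:            # outer loop visits each row
    for item in row:     # inner loop visits each item in that row
        total += item
print(total)             # 99

# A list holding a dictionary that holds another list (hypothetical data)
records = [{'name': 'row sums', 'values': [6, 69, 24]}]
print(records[0]['values'][1])   # 69
```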

List Comprehensions
In addition to sequence operations and list methods, Python includes a more advanced operation known as a list comprehension expression, which turns out to be a powerful way to process structures like our matrix. Suppose, for instance, that we need to extract the second column of our sample matrix. It’s easy to grab rows by simple indexing because the matrix is stored by rows, but it’s almost as easy to get a column with a list comprehension:

>>> col2 = [row[1] for row in M]  # traverse each row and fetch the second item 

>>> col2
[2, 5, 8]

>>> M    # the matrix is unchanged
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

List comprehensions are derived from set notation; they are a way to build a new list by running an expression on each item in a sequence, one at a time, from left to right. List comprehensions are coded in square brackets (to tip you off to the fact that they make a list) and are composed of an expression and a looping construct that share a variable name (row, here). The preceding list comprehension means basically what it says: “Give me row[1] for each row in matrix M, in a new list.” The result is a new list containing column 2 of the matrix.
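
One way to read a comprehension is as shorthand for an explicit loop that appends to a list. A sketch of the equivalent loop form, reusing the names M and col2 from the example:

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

col2 = []                  # start with an empty result list
for row in M:              # same loop as in the comprehension
    col2.append(row[1])    # same expression, collected one at a time

print(col2)                # [2, 5, 8]
```

The comprehension says the same thing in one line, which is most of its appeal.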

List comprehensions can be more complex in practice:

>>> [row[1] + 1 for row in M]    # Add 1 to each element in column-2
[3, 6, 9]

>>> [row[1] for row in M if row[1] % 2 == 0]  # Filter out odd ones
[2, 8]

The first operation here, for instance, adds 1 to each item as it is collected, and the second uses an if clause to filter odd numbers out of the result using the % modulus expression (remainder of division). List comprehensions make new lists of results, but they can be used to iterate over any iterable object. Here, for instance, we use list comprehensions to step over a hard-coded list of coordinates and a string:

>>> diag = [M[i][i] for i in [0,1,2]]  # collect diagonal items

>>> diag
[1, 5, 9]

>>> doubles = [c * 2 for c in 'magic'] # repeat characters in a string

>>> doubles
['mm', 'aa', 'gg', 'ii', 'cc']
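
The hard-coded offsets [0,1,2] only fit a 3 × 3 matrix. One possible generalization, sketched here, is to derive the offsets from the matrix size with range and len; the name anti is made up for this example:

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
n = len(M)                                    # number of rows

diag = [M[i][i] for i in range(n)]            # main diagonal
anti = [M[i][n - 1 - i] for i in range(n)]    # opposite diagonal

print(diag)   # [1, 5, 9]
print(anti)   # [3, 5, 7]
```

This assumes a square matrix; a row shorter than n would raise IndexError.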

List comprehensions are an optional feature, but they tend to be handy in practice and often provide a substantial processing speed advantage. They also work on any type that is a sequence in Python, as well as some types that are not. (I’ll write more about comprehensions pretty soon.)

The comprehension syntax in parentheses can also be used to create generators that produce results on demand (the sum built-in, for instance, sums items in a sequence):

>>> G = (sum(row) for row in M) # create a generator of row sums

>>> type(G)
<type 'generator'>

>>> next(G)  # run the iteration protocol
6
>>> next(G)
15
>>> next(G)
24
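
A generator can be consumed only once. After the last item, a further next call raises StopIteration unless a default value is supplied. A short sketch, reusing the names G and M from the example:

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
G = (sum(row) for row in M)

first = next(G)             # computes only the first row sum
print(first)                # 6

rest = list(G)              # list() drains whatever is left
print(rest)                 # [15, 24]

print(next(G, 'done'))      # exhausted, so the default is returned: done

try:
    next(G)                 # no default this time
except StopIteration:
    print('generator is exhausted')
```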

The map built-in can do similar work, by generating the results of running items through a function.

>>> map(sum,M)  # map sum over items in M
[6, 15, 24]
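
The session above reflects Python 2, where map returns a list. In Python 3, map returns an iterator that produces results on demand, so it has to be wrapped in list to show them all at once. The following form works in both versions; the name row_sums is illustrative:

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

row_sums = list(map(sum, M))   # force all results into a list
print(row_sums)                # [6, 15, 24]
```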

Furthermore, the comprehension syntax can be used to create sets and dictionaries.

>>> {sum(row) for row in M}         # create a set of row sums
set([24, 6, 15])

>>> {i:sum(M[i]) for i in [0,1,2]}  # create a key/value table of row sums
{0: 6, 1: 15, 2: 24}

In fact, lists, sets, and dictionaries can all be built with comprehensions.

>>> [ord(c) for c in 'deepak']     # List of character ordinals (ascii values)
[100, 101, 101, 112, 97, 107]

>>> {ord(c) for c in 'deepak'}     # Sets remove duplicates
set([112, 97, 107, 100, 101])

>>> {c:ord(c) for c in 'deepak'}   # Dictionary keys are unique
{'a': 97, 'p': 112, 'k': 107, 'e': 101, 'd': 100}
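
The three results differ in size because the string has a repeated letter. A small sketch that makes the difference explicit; the display order of sets and dictionaries varies across Python versions, so the set is sorted before printing:

```python
name = 'deepak'

ordinals = [ord(c) for c in name]    # list keeps every item, in order
unique = {ord(c) for c in name}      # set drops the duplicate 101
table = {c: ord(c) for c in name}    # dict keeps one entry per key

print(len(ordinals), len(unique), len(table))   # 6 5 5
print(sorted(unique))                # [97, 100, 101, 107, 112]
print(table['e'])                    # 101
```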


About Deepak Devanand

Seeker of knowledge