CH 17-21
Parameter Details
b Represents signed integer of size 1 byte
B Represents unsigned integer of size 1 byte
c Represents character of size 1 byte
u Represents unicode character of size 2 bytes
h Represents signed integer of size 2 bytes
H Represents unsigned integer of size 2 bytes
i Represents signed integer of size 2 bytes
I Represents unsigned integer of size 2 bytes
w Represents unicode character of size 4 bytes
l Represents signed integer of size 4 bytes
L Represents unsigned integer of size 4 bytes
f Represents floating point of size 4 bytes
d Represents floating point of size 8 bytes
"Arrays" in Python are not the arrays in conventional programming languages like C and Java, but closer to lists. A
list can be a collection of either homogeneous or heterogeneous elements, and may contain ints, strings or other
lists.
While Python lists can contain values of different data types, arrays in Python can only contain values of the same data type. In this tutorial, we will look at Python arrays with a few examples.
If you are new to Python, get started with the Python Introduction article.
To use arrays in Python, you need to import the standard array module, because array is not a fundamental data type like strings and integers. Here is how you can import the array module in Python:
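A minimal import that matches the usage below:
from array import array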
Once you have imported the array module, you can declare an array. Here is how you do it:
Typecodes are the codes that are used to define the type of the array values, or the type of the array. The table in the parameters section shows the possible values you can use when declaring an array and its type.
my_array = array('i',[1,2,3,4])
In the example above, the typecode used is i, which represents a signed integer of size 2 bytes.
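For example, appending a value (a sketch; the starting values are assumed):
my_array = array('i', [1, 2, 3, 4, 5])
my_array.append(6)
# my_array: array('i', [1, 2, 3, 4, 5, 6])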
Note that the value 6 was appended to the existing array values.
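Inserting works similarly (a sketch; the starting values are assumed):
my_array = array('i', [1, 2, 3, 4, 5])
my_array.insert(0, 0)
# my_array: array('i', [0, 1, 2, 3, 4, 5])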
In the above example, the value 0 was inserted at index 0. Note that the first argument is the index while second
argument is the value.
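Extending with another array (a sketch; the extension values are assumed):
my_array = array('i', [1, 2, 3, 4, 5])
my_extnd_array = array('i', [7, 8, 9, 10])
my_array.extend(my_extnd_array)
# my_array: array('i', [1, 2, 3, 4, 5, 7, 8, 9, 10])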
We see that the array my_array was extended with values from my_extnd_array.
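Adding items from an ordinary list with fromlist() (a sketch):
my_array = array('i', [1, 2, 3, 4, 5])
c = [11, 12, 13]
my_array.fromlist(c)
# my_array: array('i', [1, 2, 3, 4, 5, 11, 12, 13])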
So we see that the values 11,12 and 13 were added from list c to my_array.
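Removing the last element with pop() (a sketch; the starting values are assumed):
my_array = array('i', [1, 2, 3, 4, 5])
my_array.pop()
# Returns: 5
# my_array: array('i', [1, 2, 3, 4])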
So we see that the last element (5) was popped out of array.
Section 17.9: Fetch any element through its index using index() method
index() returns first index of the matching value. Remember that arrays are zero-indexed.
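For example (a sketch with made-up values):
my_array = array('i', [5, 1, 2, 1])
my_array.index(5)   # 0
my_array.index(1)   # 1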
Note in the second example that only one index was returned, even though the value exists twice in the array.
lst=[[1,2,3],[4,5,6],[7,8,9]]
Here the outer list lst has three things in it, each of which is another list: the first one is [1,2,3], the second one is [4,5,6] and the third one is [7,8,9]. You can access these lists the same way you would access any other element of a list, like this:
print (lst[0])
#output: [1, 2, 3]
print (lst[1])
#output: [4, 5, 6]
print (lst[2])
#output: [7, 8, 9]
You can then access the different elements in each of those lists the same way:
print (lst[0][0])
#output: 1
print (lst[0][1])
#output: 2
Here the first number inside the [] brackets means get the list in that position. In the above example we used the number 0 to mean get the list in the 0th position, which is [1,2,3]. The second set of [] brackets means get the item in that position from the inner list. In this case we used both 0 and 1: the 0th position in the list we got holds the number 1, and the 1st position holds 2.
You can also set values inside these lists the same way:
lst[0]=[10,11,12]
Now the list is [[10,11,12],[4,5,6],[7,8,9]]. In this example we changed the whole first list to be a completely
new list.
lst[1][2]=15
Now the list is [[10,11,12],[4,5,15],[7,8,9]]. In this example we changed a single element inside one of the inner lists: first we went into the list at position 1 and changed the element within it at position 2, which was 6 and is now 15.
[[[111,112,113],[121,122,123],[131,132,133]],[[211,212,213],[221,222,223],[231,232,233]],[[311,312,313],[321,322,323],[331,332,333]]]
The same list is easier to read when split across lines with continuation characters:
[[[111,112,113],[121,122,123],[131,132,133]],\
[[211,212,213],[221,222,223],[231,232,233]],\
[[311,312,313],[321,322,323],[331,332,333]]]
By nesting the lists like this, you can extend to arbitrarily high dimensions.
print(myarray)          # the whole 3-D list (assuming myarray is bound to the list shown above)
print(myarray[1])       # [[211,212,213],[221,222,223],[231,232,233]]
print(myarray[2][1])    # [321,322,323]
print(myarray[1][0][2]) # 213
etc.
myarray[1]=new_n-1_d_list
myarray[2][1]=new_n-2_d_list
myarray[1][0][2]=new_n-3_d_list #or a single number if you're dealing with 3D arrays
etc.
creating a dict
literal syntax
d = {} # empty dict
d = {'key': 'value'} # dict with initial values
# Also unpacking one or multiple dictionaries with the literal syntax is possible
dict comprehension
d = {k:v for k,v in [('key', 'value',)]}
modifying a dict
d['newkey'] = 42
d['new_list'] = [1, 2, 3]
d['new_dict'] = {'nested_dict': 1}
del d['newkey']
mydict = {}
mydict['not there']  # raises KeyError: 'not there'
One way to avoid key errors is to use the dict.get method, which allows you to specify a default value to return in
the case of an absent key.
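For example (a sketch; key and default_value are placeholder names):
value = mydict.get(key, default_value)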
Which returns mydict[key] if it exists, but otherwise returns default_value. Note that this doesn't add key to
mydict. So if you want to retain that key value pair, you should use mydict.setdefault(key, default_value),
which does store the key value pair.
mydict = {}
print(mydict)
# {}
print(mydict.get("foo", "bar"))
# bar
print(mydict)
# {}
print(mydict.setdefault("foo", "bar"))
# bar
print(mydict)
# {'foo': 'bar'}
try:
    value = mydict[key]
except KeyError:
    value = default_value

if key in mydict:
    value = mydict[key]
else:
    value = default_value
Do note, however, that in multi-threaded environments it is possible for the key to be removed from the dictionary
after you check, creating a race condition where the exception can still be thrown.
Another option is to use a subclass of dict, collections.defaultdict, that has a default_factory to create new entries in
the dict when given a new_key.
The items() method can be used to loop over both the key and value simultaneously:
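For example (a sketch, with a small example dict):
d = {'a': 1, 'b': 2}
for key, value in d.items():
    print(key, value)
# a 1
# b 2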
While the values() method can be used to iterate over only the values, as would be expected:
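For example (d as above):
for value in d.values():
    print(value)
# 1
# 2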
In Python 2, the methods keys(), values() and items() return lists, and there are three extra methods, iterkeys(), itervalues() and iteritems(), that return iterators.
from collections import defaultdict

d = defaultdict(int)
d['key'] # 0
d['key'] = 5
d['key'] # 5
d = defaultdict(lambda: 'empty')
d['key'] # 'empty'
d['key'] = 'full'
d['key'] # 'full'
[*] Alternatively, if you must use the built-in dict class, using dict.setdefault() will allow you to create a default
whenever you access a key that did not exist before:
>>> d = {}
>>> d.setdefault('Another_key', []).append("This worked!")
>>> d
{'Another_key': ['This worked!']}
Keep in mind that if you have many values to add, dict.setdefault() will create a new instance of the initial value
(in this example a []) every time it's called - which may create unnecessary workloads.
[*] Python Cookbook, 3rd edition, by David Beazley and Brian K. Jones (O’Reilly). Copyright 2013 David Beazley and Brian
Jones, 978-1-449-34037-7.
Python 3.5+
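The fish and dog dictionaries are not shown here; reconstructed from the outputs below, they would look roughly like this:
>>> fish = {'name': 'Nemo', 'hands': 'fins', 'special': 'gills'}
>>> dog = {'name': 'Clifford', 'hands': 'paws', 'color': 'red'}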
>>> fishdog = {**fish, **dog}
>>> fishdog
{'hands': 'paws', 'color': 'red', 'name': 'Clifford', 'special': 'gills'}
As this example demonstrates, duplicate keys map to their lattermost value (for example "Clifford" overrides
"Nemo").
Python 3.3+
>>> from collections import ChainMap
>>> dict(ChainMap(fish, dog))
{'hands': 'fins', 'color': 'red', 'special': 'gills', 'name': 'Nemo'}
With this technique the foremost value takes precedence for a given key rather than the last ("Clifford" is thrown
out in favor of "Nemo").
This uses the lattermost value, as with the **-based technique for merging ("Clifford" overrides "Nemo").
>>> fish.update(dog)
>>> fish
{'color': 'red', 'hands': 'paws', 'name': 'Clifford', 'special': 'gills'}
mydict = {
    'a': '1',
    'b': '2'
}
print(mydict.keys())
# Python2: ['a', 'b']
# Python3: dict_keys(['b', 'a'])
print(mydict.values())
# Python2: ['1', '2']
# Python3: dict_values(['2', '1'])
If you want to work with both the key and its corresponding value, you can use the items() method:
print(mydict.items())
# Python2: [('a', '1'), ('b', '2')]
# Python3: dict_items([('b', '2'), ('a', '1')])
NOTE: Because a dict is unordered, keys(), values(), and items() have no guaranteed order. Use sort(), sorted(), or an OrderedDict if you care about the order that these methods return.
Python 2/3 Difference: In Python 3, these methods return special iterable objects, not lists, and are the equivalent
of the Python 2 iterkeys(), itervalues(), and iteritems() methods. These objects can be used like lists for the
most part, though there are some differences. See PEP 3106 for more details.
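The next paragraphs refer to a small example dictionary, something along these lines (only the "Hello": 1234 pair is given in the text; the rest is assumed):
dictionary = {"Hello": 1234, "World": 5678}
print(dictionary["Hello"])
# 1234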
The string "Hello" in this example is called a key. It is used to lookup a value in the dict by placing the key in
square brackets.
The number 1234 is seen after the respective colon in the dict definition. This is called the value that "Hello" maps
to in this dict.
Looking up a value like this with a key that does not exist will raise a KeyError exception, halting execution if
uncaught. If we want to access a value without risking a KeyError, we can use the dictionary.get method. By
default if the key does not exist, the method will return None. We can pass it a second value to return instead of
None in the event of a failed lookup.
w = dictionary.get("whatever")
x = dictionary.get("whatever", "nuh-uh")
In this example w will get the value None and x will get the value "nuh-uh".
Use OrderedDict from the collections module. This will always return the dictionary elements in the original
insertion order when iterated over.
from collections import OrderedDict

d = OrderedDict()
d['first'] = 1
d['second'] = 2
d['third'] = 3
d['last'] = 4
As of Python 3.5 you can also use this syntax to merge an arbitrary number of dict objects.
As this example demonstrates, duplicate keys map to their lattermost value (for example "Clifford" overrides
"Nemo").
PEP 8 dictates that you should leave a space between the trailing comma and the closing brace.
car = {}
car["wheels"] = 4
car["color"] = "Red"
car["model"] = "Corvette"
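A loop like the following (a sketch) would produce the output shown below:
for key in car:
    print("{}: {}".format(key, car[key]))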
# wheels: 4
# color: Red
# model: Corvette
Given a dictionary such as the one shown below, where each key has a list representing a set of values to explore for that key, suppose you want to explore "x"="a" with "y"=10, then "x"="a" with "y"=20, and so on until you have explored all possible combinations.
You can create a list that returns all such combinations of values using the following code.
import itertools
options = {
"x": ["a", "b"],
"y": [10, 20, 30]}
keys = options.keys()
values = (options[key] for key in keys)
combinations = [dict(zip(keys, combination)) for combination in itertools.product(*values)]
print(combinations)
# [{'x': 'a', 'y': 10}, {'x': 'a', 'y': 20}, {'x': 'a', 'y': 30},
#  {'x': 'b', 'y': 10}, {'x': 'b', 'y': 20}, {'x': 'b', 'y': 30}]
a = [1, 2, 3, 4, 5]
a.append(6)
a.append(7)
a.append(7)
a.append([8, 9])  # a nested list becomes a single element
# Append an element of a different type, as list elements do not need to have the same type
my_string = "hello world"
a.append(my_string)
# a: [1, 2, 3, 4, 5, 6, 7, 7, [8, 9], "hello world"]
Note that the append() method only appends one new element to the end of the list. If you append a list to
another list, the list that you append becomes a single element at the end of the first list.
a = [1, 2, 3, 4, 5, 6, 7, 7]
b = [8, 9, 10]
a.extend(b)
# a: [1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10]
Lists can also be concatenated with the + operator. Note that this does not modify any of the original lists:
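A sketch, reconstructed so that a keeps the value used by the index() examples below:
a = [1, 2, 3, 4, 5, 6] + [7, 7] + b
# a: [1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10], while the operands are left unchanged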
3. index(value, [startIndex]) – gets the index of the first occurrence of the input value. If the input value is
not in the list a ValueError exception is raised. If a second argument is provided, the search is started at that
specified index.
a.index(7)
# Returns: 6
a.index(7, 7)
# Returns: 7
4. insert(index, value) – inserts value just before the specified index. Thus after the insertion the new
element occupies position index.
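The insert() example itself is missing here; reconstructed so that it is consistent with the pop() results shown below, it would be:
a.insert(0, 0)
# a: [0, 1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10]
a.insert(2, 5)
# a: [0, 1, 5, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10]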
5. pop([index]) – removes and returns the item at index. With no argument it removes and returns the last
element of the list.
a.pop(2)
# Returns: 5
# a: [0, 1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10]
a.pop(8)
# Returns: 7
# a: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# With no argument:
a.pop()
# Returns: 10
# a: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
6. remove(value) – removes the first occurrence of the specified value. If the provided value cannot be found, a
ValueError is raised.
a.remove(0)
a.remove(9)
# a: [1, 2, 3, 4, 5, 6, 7, 8]
a.remove(10)
# ValueError, because 10 is not in a
a.reverse()
# a: [8, 7, 6, 5, 4, 3, 2, 1]
a.count(7)
# Returns: 2
9. sort() – sorts the list in numerical and lexicographical order and returns None.
a.sort()
# a = [1, 2, 3, 4, 5, 6, 7, 8]
# Sorts the list in numerical order
Lists can also be reversed when sorted using the reverse=True flag in the sort() method.
a.sort(reverse=True)
# a = [8, 7, 6, 5, 4, 3, 2, 1]
If you want to sort by attributes of items, you can use the key keyword argument:
import datetime
class Person(object):
    def __init__(self, name, birthday, height):
        self.name = name
        self.birthday = birthday
        self.height = height

    def __repr__(self):
        return self.name
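A minimal sketch of sorting such objects by an attribute (the people and dates are made up):
people = [Person("John Cena", datetime.date(1992, 9, 12), 175),
          Person("Chuck Norris", datetime.date(1990, 8, 28), 180),
          Person("Jon Skeet", datetime.date(1991, 7, 6), 185)]
people.sort(key=lambda item: item.name)
# people: [Chuck Norris, John Cena, Jon Skeet]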
Lists can also be sorted using attrgetter and itemgetter functions from the operator module. These can help
improve readability and reusability. Here are some examples,
from operator import itemgetter

people = [{'name':'chandan','age':20,'salary':2000},
          {'name':'chetan','age':18,'salary':5000},
          {'name':'guru','age':30,'salary':3000}]
by_age = itemgetter('age')
by_salary = itemgetter('salary')
people.sort(key=by_age)   # in-place sort by the 'age' key
itemgetter can also be given an index. This is helpful if you want to sort based on indices of a tuple.
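For instance (a sketch with made-up tuples; itemgetter as imported above):
pairs = [(1, 2), (3, 0), (2, 1)]
pairs.sort(key=itemgetter(1))
# pairs: [(3, 0), (2, 1), (1, 2)]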
a.clear()
# a = []
11. Replication – multiplying an existing list by an integer will produce a larger list consisting of that many copies
of the original. This can be useful for example for list initialization:
b = ["blah"] * 3
# b = ["blah", "blah", "blah"]
Take care doing this if your list contains references to objects (eg a list of lists), see Common Pitfalls - List
multiplication and common references.
12. Element deletion – it is possible to delete multiple elements in the list using the del keyword and slice
notation:
a = list(range(10))
del a[::2]
# a = [1, 3, 5, 7, 9]
del a[-1]
# a = [1, 3, 5, 7]
del a[:]
# a = []
13. Copying
The default assignment "=" assigns a reference of the original list to the new name. That is, the original name and the new name both point to the same list object. Changes made through either of them will be reflected in the other. This is often not what you intended.
a = [1, 2, 3, 4, 5]
b = a
a.append(6)
# a: [1, 2, 3, 4, 5, 6]
# b: [1, 2, 3, 4, 5, 6], because b refers to the same list as a
If you want to create a copy of the list, you have the options below.
new_list = old_list[:]
new_list = list(old_list)
import copy
new_list = copy.copy(old_list) #inserts references to the objects found in the original.
This is a little slower than list() because it has to find out the datatype of old_list first.
If the list contains objects and you want to copy them as well, use generic copy.deepcopy():
import copy
new_list = copy.deepcopy(old_list) #inserts copies of the objects found in the original.
Obviously the slowest and most memory-needing method, but sometimes unavoidable.
aa = a.copy()  # Python 3 only: list.copy() also makes a shallow copy (here assuming a = [1, 2, 3, 4, 5])
# aa = [1, 2, 3, 4, 5]
lst = [1, 2, 3, 4]
lst[0] # 1
lst[1] # 2
Attempting to access an index outside the bounds of the list will raise an IndexError.
Negative indices are interpreted as counting from the end of the list.
lst[-1] # 4
lst[-2] # 3
lst[-5] # IndexError: list index out of range
lst[len(lst)-1] # 4
Lists allow the use of slice notation as lst[start:end:step]. The output of the slice notation is a new list containing elements from index start to end-1. If omitted, start defaults to the beginning of the list, end to the end of the list, and step to 1:
lst[1:] # [2, 3, 4]
lst[:3] # [1, 2, 3]
lst[::2] # [1, 3]
lst[::-1] # [4, 3, 2, 1]
lst[-1:0:-1] # [4, 3, 2]
lst[5:8] # [] since starting index is greater than length of lst, returns empty list
lst[1:10] # [2, 3, 4] same as omitting ending index
With this in mind, you can print a reversed version of the list by calling
lst[::-1] # [4, 3, 2, 1]
When using step lengths of negative amounts, the starting index has to be greater than the ending index otherwise
the result will be an empty list.
lst[3:1:-1] # [4, 3]
list(reversed(lst))[0:2] # [4, 3]
# 0 = 1 - 1
# 2 = 3 - 1
The indices used are 1 less than those used in negative indexing and are reversed.
When lists are sliced, the __getitem__() method of the list object is called with a slice object. Python has a built-in slice type to generate slice objects. We can use this to store a slice and reuse it later, like so:
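A sketch (the names are made up):
data = list(range(10))
evens = slice(0, None, 2)     # a reusable slice object
data[evens]                   # [0, 2, 4, 6, 8]
data.__getitem__(evens)       # the call the slice syntax makes under the hood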
This can be of great use by providing slicing functionality to our objects by overriding __getitem__ in our class.
lst = []
if not lst:
    print("list is empty")
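The iteration examples below assume a list such as the following (a sketch):
my_list = ['foo', 'bar', 'baz']
for item in my_list:
    print(item)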
# Output: foo
# Output: bar
# Output: baz
You can also get the position of each item at the same time:
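enumerate() yields each index together with its item (a sketch):
for (index, item) in enumerate(my_list):
    print(index, item)
# 0 foo
# 1 bar
# 2 baz
Another option is to loop over the indexes explicitly: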
for i in range(0, len(my_list)):
    print(my_list[i])
# Output:
# foo
# bar
# baz
Note that changing items in a list while iterating on it may have unexpected results:
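For example (a sketch, reusing my_list from above):
my_list = ['foo', 'bar', 'baz']
for item in my_list:
    if item == 'foo':
        del my_list[0]
    print(item)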
# Output: foo
# Output: baz
In this last example, we deleted the first item at the first iteration, but that caused bar to be skipped.
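The membership examples below assume a list along these lines (a sketch):
lst = ['test', 'twest', 'tweast', 'treast']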
'test' in lst
# Out: True
'toast' in lst
# Out: False
Note: the in operator on sets is asymptotically faster than on lists. If you need to use it many times on
potentially large lists, you may want to convert your list to a set, and test the presence of elements on
the set.
slst = set(lst)
'test' in slst
# Out: True
nums = [1, 1, 0, 1]
all(nums)
# False
chars = ['a', 'b', 'c', 'd']
all(chars)
# True
nums = [1, 1, 0, 1]
any(nums)
# True
vals = [None, None, None, False]
any(vals)
# False
While this example uses a list, it is important to note these built-ins work with any iterable, including generators.
In [1]: numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
In [2]: rev = list(reversed(numbers))
In [3]: rev
Out[3]: [9, 8, 7, 6, 5, 4, 3, 2, 1]
Note that the list "numbers" remains unchanged by this operation, and remains in the same order it was originally.
You can also reverse a list (actually obtaining a copy, the original list is unaffected) by using the slicing syntax,
setting the third argument (the step) as -1:
In [2]: numbers[::-1]
Out[2]: [9, 8, 7, 6, 5, 4, 3, 2, 1]
2. zip returns a list of tuples (an iterator of tuples in Python 3), where the i-th tuple contains the i-th element from each of the argument sequences or iterables:
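For example (a sketch; the list contents are assumed from the output below):
alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2', 'b3']
for a, b in zip(alist, blist):
    print(a, b)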
# Output:
# a1 b1
# a2 b2
# a3 b3
If the lists have different lengths then the result will include only as many elements as the shortest one:
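For example (a sketch):
alist = ['a1', 'a2', 'a3']
blist = ['b1']
for a, b in zip(alist, blist):
    print(a, b)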
# Output:
# a1 b1
alist = []
len(list(zip(alist, blist)))
# Output:
# 0
For padding lists of unequal length to the longest one with Nones use itertools.zip_longest
(itertools.izip_longest in Python 2)
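A sketch (the list contents are reconstructed from the output below):
import itertools
alist = ['a1', 'a2', 'a3']
blist = ['b1']
clist = ['c1', 'c2', 'c3', 'c4']
for a, b, c in itertools.zip_longest(alist, blist, clist):
    print(a, b, c)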
# Output:
# a1 b1 c1
# a2 None c2
# a3 None c3
# None None c4
len() also works on strings, dictionaries, and other data structures similar to lists.
Also note that the cost of len() is O(1), meaning it will take the same amount of time to get the length of a list
regardless of its length.
import collections
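names is not shown here; reconstructed from the output, it would be something like:
>>> names = ["aixk", "duke", "edik", "tofp", "duke"]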
>>> collections.OrderedDict.fromkeys(names).keys()
# Out: ['aixk', 'duke', 'edik', 'tofp']
If one of the lists is contained at the start of the other, the shortest list wins.
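The nested-list examples below assume a list reconstructed from the printed results:
alist = [[[1, 2], [3, 4]], [[5, 6, 7], [8, 9, 10], [12, 13, 14]]]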
print(alist[0][0][1])
#2
#Accesses second element in the first list in the first list
print(alist[1][1][2])
#10
#Accesses the third element in the second list in the second list
alist[0][0].append(11)
print(alist[0][0][2])
#11
#Appends 11 to the end of the first list in the first list
Note that this operation can be used in a list comprehension or even as a generator to produce efficiencies, e.g.:
alist[1].insert(2, 15)
#Inserts 15 into the third position in the second list
Another way is to use nested for loops over the indexes. The other way is better, but I've needed to use this on occasion:
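A sketch of such a loop (alist as above):
for row in range(len(alist)):            # indexes of the outer list
    for col in range(len(alist[row])):   # indexes of each inner list
        print(alist[row][col])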
#[1, 2, 11]
#[3, 4]
#[5, 6, 7]
#[8, 9, 10]
#15
#[12, 13, 14]
print(alist[1][1:])
#[[8, 9, 10], 15, [12, 13, 14]]
#Slices still work
print(alist)
#[[[1, 2, 11], [3, 4]], [[5, 6, 7], [8, 9, 10], 15, [12, 13, 14]]]
my_list = [None] * 10
my_list = ['test'] * 10
For mutable elements, the same construct will result in all elements of the list referring to the same object, for
example, for a set:
>>> my_list=[{1}] * 10
Instead, to initialize the list with a fixed number of different mutable objects, use:
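A sketch of the usual idiom:
>>> my_list = [{1} for _ in range(10)]   # ten distinct sets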
Each <element> in the <iterable> is plugged into the <expression> if the (optional) <condition> evaluates to True. All results are returned at once in the new list. Generator expressions are evaluated lazily, but list comprehensions evaluate the entire iterable immediately, consuming memory proportional to its length.
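The snippet described next would look like this (a sketch):
squares = [x * x for x in (1, 2, 3, 4)]
# squares: [1, 4, 9, 16]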
The for expression sets x to each value in turn from (1, 2, 3, 4). The result of the expression x * x is appended
to an internal list. The internal list is assigned to the variable squares when completed.
Besides a speed increase (as explained here), a list comprehension is roughly equivalent to the following for-loop:
squares = []
for x in (1, 2, 3, 4):
squares.append(x * x)
# squares: [1, 4, 9, 16]
else
else can be used in list comprehension constructs, but be careful regarding the syntax. The if/else clauses should be used before the for loop, not after:
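A sketch of the difference:
# [x for x in 'apple' if x in 'aeiou' else '*']    # SyntaxError: invalid syntax
[x if x in 'aeiou' else '*' for x in 'apple']
# Out: ['a', '*', '*', '*', 'e']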
Note this uses a different language construct, a conditional expression, which itself is not part of the
comprehension syntax. Whereas the if after the for…in is a part of list comprehensions and used to filter
elements from the source iterable.
Double Iteration
Order of double iteration [... for x in ... for y in ...] is either natural or counter-intuitive. The rule of
thumb is to follow an equivalent for loop:
def foo(i):
    return i, i + 0.5

for i in range(3):
    for x in foo(i):
        yield str(x)
This becomes:
[str(x)
for i in range(3)
for x in foo(i)
]
This can be compressed into one line as [str(x) for i in range(3) for x in foo(i)]
Before using list comprehension, understand the difference between functions called for their side effects
(mutating, or in-place functions) which usually return None, and functions that return an interesting value.
Many functions (especially pure functions) simply take an object and return some object. An in-place function
modifies the existing object, which is called a side effect. Other examples include input and output operations such
as printing.
list.sort() sorts a list in-place (meaning that it modifies the original list) and returns the value None. Therefore, it
won't work as expected in a list comprehension:
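For example (a sketch):
[x.sort() for x in [[2, 1], [4, 3], [0, 1]]]
# [None, None, None]  (the inner lists are sorted in place, but the comprehension collects the None return values)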
Using comprehensions for side-effects is possible, such as I/O or in-place functions. Yet a for loop is usually more
readable. While this works in Python 3:
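A sketch of the discouraged form:
[print(x) for x in range(10)]   # prints the numbers but also builds a throwaway list of ten None values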
Instead use:
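A plain loop does the same without building a useless list:
for x in range(10):
    print(x)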
In some situations, side effect functions are suitable for list comprehension. random.randrange() has the side
effect of changing the state of the random number generator, but it also returns an interesting value. Additionally,
next() can be called on an iterator.
The following random value generator is not pure, yet makes sense as the random generator is reset every time the
expression is evaluated:
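A sketch (the printed values are random, so the output is only an example):
from random import randrange
[randrange(1, 7) for _ in range(10)]
# Out: [2, 3, 2, 1, 1, 5, 2, 4, 3, 5]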
More complicated list comprehensions can reach an undesired length, or become less readable. Although less
common in examples, it is possible to break a list comprehension into multiple lines like so:
[
x for x
in 'foo'
if x not in 'bar'
]
For each <element> in <iterable>; if <condition> evaluates to True, add <expression> (usually a function of
<element>) to the returned list.
For example, this can be used to extract only even numbers from a sequence of integers:
[x for x in range(10) if x % 2 == 0]
# Out: [0, 2, 4, 6, 8]
even_numbers = []
for x in range(10):
    if x % 2 == 0:
        even_numbers.append(x)
print(even_numbers)
# Out: [0, 2, 4, 6, 8]
Also, a conditional list comprehension of the form [e for x in y if c] (where e and c are expressions in terms of
x) is equivalent to list(filter(lambda x: c, map(lambda x: e, y))).
Despite providing the same result, pay attention to the fact that the former example is almost 2x faster than the
latter one. For those who are curious, this is a nice explanation of the reason why.
Note that this is quite different from the ... if ... else ... conditional expression (sometimes known as a
ternary expression) that you can use for the <expression> part of the list comprehension. Consider the following
example:
Here the conditional expression isn't a filter, but rather an operator determining the value to be used for the list
items:
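For example (a sketch):
[x if x % 2 == 0 else None for x in range(10)]
# Out: [0, None, 2, None, 4, None, 6, None, 8, None]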
If you are using Python 2.7, xrange may be better than range for several reasons as described in the xrange
documentation.
numbers = []
for x in range(10):
    if x % 2 == 0:
        temp = x
    else:
        temp = -1
    numbers.append(2 * temp + 1)
print(numbers)
# Out: [1, -1, 5, -1, 9, -1, 13, -1, 17, -1]
One can combine ternary expressions and if conditions. The ternary operator works on the filtered result:
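For example (a sketch):
[x if x > 2 else '*' for x in range(10) if x % 2 == 0]
# Out: ['*', '*', 4, 6, 8]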
See also: Filters, which often provide a sufficient alternative to conditional list comprehensions.
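The problematic pattern discussed next looks roughly like this (f is a hypothetical expensive function):
def f(x):
    import time
    time.sleep(.1)   # simulate an expensive computation
    return x**2

[f(x) for x in range(1000) if f(x) > 10]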
This results in two calls to f(x) for 1,000 values of x: one call for generating the value and the other for checking the
if condition. If f(x) is a particularly expensive operation, this can have significant performance implications.
Worse, if calling f() has side effects, it can have surprising results.
Instead, you should evaluate the expensive operation only once for each value of x by generating an intermediate
iterable (generator expression) as follows:
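A sketch (f as in the example above):
[v for v in (f(x) for x in range(1000)) if v > 10]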
Another way that could result in a more readable code is to put the partial result (v in the previous example) in an
iterable (such as a list or a tuple) and then iterate over it. Since v will be the only element in the iterable, the result is
that we now have a reference to the output of our slow function computed only once:
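A sketch (f as above):
[v for x in range(1000) for v in [f(x)] if v > 10]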
However, in practice, the logic of code can be more complicated and it's important to keep it readable. In general, a
separate generator function is recommended over a complex one-liner:
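A sketch of such a generator function (reusing the hypothetical f from above):
def process_values(iterable):
    for x in iterable:
        v = f(x)        # computed only once per element
        if v > 10:
            yield v

results = list(process_values(range(1000)))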
Another way to prevent computing f(x) multiple times is to use the @functools.lru_cache() (Python 3.2+) decorator on f(x). This way, since the output of f for the input x has already been computed once, the second invocation in the original list comprehension will be as fast as a dictionary lookup.
from functools import reduce   # reduce lives in functools in Python 3
import itertools
l = [[1, 2], [3, 4], [5, 6]]   # example list of lists to flatten
reduce(lambda x, y: x + y, l)
sum(l, [])
list(itertools.chain(*l))
The shortcuts based on + (including the implied use in sum) are, of necessity, O(L^2) when there are L sublists -- as
the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated,
and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the
end). So (for simplicity and without actual loss of generality) say you have L sublists of I items each: the first I items
are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the
sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence
to the result list) also exactly once.
A basic example:
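A sketch:
{x: x * x for x in (1, 2, 3, 4)}
# Out: {1: 1, 2: 4, 3: 9, 4: 16}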
As with a list comprehension, we can use a conditional statement inside the dict comprehension to produce only
the dict elements meeting some criterion.
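For example, keeping only entries for even numbers (a sketch):
{x: x * x for x in range(10) if x % 2 == 0}
# Out: {0: 0, 2: 4, 4: 16, 6: 36, 8: 64}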
Starting with a dictionary, a dictionary comprehension can also be used as a key-value pair filter:
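For example (a sketch):
initial_dict = {'x': 1, 'y': 2}
{key: value for key, value in initial_dict.items() if key == 'x'}
# Out: {'x': 1}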
If you have a dict containing simple hashable values (duplicate values may have unexpected results):
and you wanted to swap the keys and values you can take several approaches depending on your coding style:
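A sketch (the dictionary contents are assumed):
my_dict = {'a': 1, 'b': 2, 'c': 3}
swapped = {v: k for k, v in my_dict.items()}
# or: swapped = dict((v, k) for k, v in my_dict.items())
# or: swapped = dict(zip(my_dict.values(), my_dict.keys()))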
print(swapped)
# Out: {1: 'a', 2: 'b', 3: 'c'}
If your dictionary is large and you are on Python 2, consider importing itertools and utilizing izip or imap (in Python 3, zip and map are already lazy).
Merging Dictionaries
Combine dictionaries and optionally override old values with a nested dictionary comprehension.
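A sketch with two small dictionaries (reconstructed from the output below):
dict1 = {'w': 1, 'x': 1}
dict2 = {'x': 2, 'y': 2, 'z': 2}
{k: v for d in [dict1, dict2] for k, v in d.items()}
# Out: {'w': 1, 'x': 2, 'y': 2, 'z': 2}
Dictionary unpacking (Python 3.5+) gives the same result: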
{**dict1, **dict2}
# Out: {'w': 1, 'x': 2, 'y': 2, 'z': 2}
Note: dictionary comprehensions were added in Python 3.0 and backported to 2.7+, unlike list comprehensions,
which were added in 2.0. Versions < 2.7 can use generator expressions and the dict() builtin to simulate the
behavior of dictionary comprehensions.
For example, the following code flattening a list of lists using multiple for statements:
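A sketch (the data is made up):
data = [[1, 2], [3, 4], [5, 6]]
output = []
for each_list in data:
    for element in each_list:
        output.append(element)
# output: [1, 2, 3, 4, 5, 6]
# equivalent comprehension, with the for clauses in the same order:
output = [element for each_list in data for element in each_list]
# output: [1, 2, 3, 4, 5, 6]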
In both the expanded form and the list comprehension, the outer loop (first for statement) comes first.
In addition to being more compact, the nested comprehension is also significantly faster.
Inline ifs are nested similarly, and may occur in any position after the first for:
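For example (a sketch; the data is made up):
data = [[1], [2, 3], [4, 5]]
output = [element for each_list in data
          if len(each_list) == 2
          for element in each_list
          if element != 5]
# output: [2, 3, 4]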
For the sake of readability, however, you should consider using traditional for-loops. This is especially true when nesting is more than 2 levels deep, and/or the logic of the comprehension is too complex. A comprehension with multiple nested loops can easily become error-prone or give unexpected results.
# list comprehension
[x**2 for x in range(10)]
# Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
# generator comprehension
(x**2 for x in xrange(10))
# Output: <generator object <genexpr> at 0x11b4b7c80>
the list comprehension returns a list object whereas the generator comprehension returns a generator.
generator objects cannot be indexed and make use of the next function to get items in order.
Note: We use xrange since it also creates a lazy object rather than a list. If we used range, a list would be created. Also, xrange exists only in Python 2. In Python 3, range returns a lazy sequence object instead of a list. For more information, see the Differences between range and xrange functions example.
g = (x**2 for x in xrange(10))
g.next() # 0
g.next() # 1
g.next() # 4
...
g.next() # 81
NOTE: The function g.next() should be substituted by next(g) and xrange with range since
Iterator.next() and xrange() do not exist in Python 3.
"""
Out:
0
1
4
...
81
"""
"""
Out:
0
1
4
.
.
.
81
"""
Use cases
Generator expressions are lazily evaluated, which means that they generate and return each value only when the
generator is iterated. This is often useful when iterating through large datasets, avoiding the need to create a
duplicate of the dataset in memory:
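A sketch of the difference:
# the list comprehension builds the whole result in memory up front
squares_list = [x * x for x in range(1000000)]
# the generator expression produces each value only as it is consumed
squares_gen = (x * x for x in range(1000000))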
Another common use case is to avoid iterating over an entire iterable if doing so is not necessary. In this example, an item is retrieved from a remote API with each iteration of get_objects(). Thousands of objects may exist and must be retrieved one by one, and we only need to know whether an object matching a pattern exists. By using a generator expression, we can stop iterating as soon as we encounter an object matching the pattern.
def get_objects():
    """Gets objects from an API one by one"""
    while True:
        yield get_next_item()

def object_matches_pattern(obj):
    # perform potentially complex calculation
    return matches_pattern

def right_item_exists():
    items = (object_matches_pattern(each) for each in get_objects())
    for item in items:
        if item:   # a matching object was found
            return True
    return False
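A basic set comprehension looks like this (a sketch):
{x * x for x in range(5)}
# Out: {0, 1, 4, 9, 16}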
Keep in mind that sets are unordered. This means that the order of the results in the set may differ from the one
presented in the above examples.
Note: Set comprehension is available since Python 2.7, unlike list comprehensions, which were added in 2.0. In Python 2.2 to Python 2.6, the set() function can be used with a generator expression to produce the same result:
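For example (a sketch):
set(x * x for x in range(5))
# set([0, 1, 4, 9, 16])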
filter(P, S) is almost always written clearer as [x for x in S if P(x)], and this has the huge
advantage that the most common usages involve predicates that are comparisons, e.g. x==42, and
defining a lambda for that just requires much more effort for the reader (plus the lambda is slower than
the list comprehension). Even more so for map(F, S) which becomes [F(x) for x in S]. Of course, in
many cases you'd be able to use generator expressions instead.
The following lines of code are considered "not pythonic" and will raise errors in many python linters.
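The lines in question look something like this (a sketch):
filter(lambda x: x % 2 == 0, range(10))
map(lambda x: 2 * x, range(10))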
Taking what we have learned from the previous quote, we can break down these filter and map expressions into
their equivalent list comprehensions; also removing the lambda functions from each - making the code more
readable in the process.
# Map
# F(x) = 2*x
# S = range(10)
[2*x for x in range(10)]
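The Filter case, in the same style (a sketch):
# Filter
# P(x) = x % 2 == 0
# S = range(10)
[x for x in range(10) if x % 2 == 0]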
Readability becomes even more apparent when dealing with chaining functions. Where due to readability, the
results of one map or filter function should be passed as a result to the next; with simple cases, these can be
replaced with a single list comprehension. Further, we can easily tell from the list comprehension what the outcome
of our process is, where there is more cognitive load when reasoning about the chained Map & Filter process.
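For illustration, a chained version and its comprehension equivalent might look like this (a sketch):
# Map & Filter chained
results = map(lambda x: 2 * x, filter(lambda x: x % 2 == 0, range(10)))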
# List comprehension
results = [2*x for x in range(10) if x % 2 == 0]
Map: map(F, S) == [F(x) for x in S]
Filter: filter(P, S) == [x for x in S if P(x)]
where F and P are functions which respectively transform input values and return a bool
Note however, if the expression that begins the comprehension is a tuple then it must be parenthesized:
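A sketch of the distinction:
# [x, y for x in [1, 2, 3] for y in [3, 1, 4]]    # SyntaxError: invalid syntax
[(x, y) for x in [1, 2, 3] for y in [3, 1, 4]]
# Out: [(1, 3), (1, 1), (1, 4), (2, 3), (2, 1), (2, 4), (3, 3), (3, 1), (3, 4)]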
# Count the numbers in `range(1000)` that are even and contain the digit `9`:
print (sum(
1 for x in range(1000)
if x % 2 == 0 and
'9' in str(x)
))
# Out: 95
Note: Here we are not collecting the 1s in a list (note the absence of square brackets), but we are passing them directly to the sum function, which sums them up. This is called a generator expression, which is similar to a comprehension.
l = []
for y in [3, 4, 5]:
    temp = []
    for x in [1, 2, 3]:
        temp.append(x + y)
    l.append(temp)
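The equivalent nested comprehension would be (a sketch):
l = [[x + y for x in [1, 2, 3]] for y in [3, 4, 5]]
# l: [[4, 5, 6], [5, 6, 7], [6, 7, 8]]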
matrix = [[1,2,3],
[4,5,6],
[7,8,9]]
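For example, a nested comprehension can transpose such a matrix (a sketch):
[[row[i] for row in matrix] for i in range(len(matrix[0]))]
# Out: [[1, 4, 7], [2, 5, 8], [3, 6, 9]]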
Like nested for loops, there is no limit to how deep comprehensions can be nested.
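The next examples assume three lists reconstructed from the outputs shown:
>>> list_1 = [1, 2, 3, 4]
>>> list_2 = ['a', 'b', 'c', 'd']
>>> list_3 = ['6', '7', '8', '9']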
# Two lists
>>> [(i, j) for i, j in zip(list_1, list_2)]
[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
# Three lists
>>> [(i, j, k) for i, j, k in zip(list_1, list_2, list_3)]
[(1, 'a', '6'), (2, 'b', '7'), (3, 'c', '8'), (4, 'd', '9')]
# so on ...