Graphics
Chapter 15
GUI Programming with Tkinter
Up until now, the only way our programs have been able to interact with the user is through
keyboard input via the input statement. But most real programs use windows, buttons, scrollbars,
and various other things. These widgets are part of what is called a Graphical User Interface or GUI.
This chapter is about GUI programming in Python with Tkinter.
All of the widgets we will be looking at have far more options than we could possibly cover here.
An excellent reference is Fredrik Lundh’s Introduction to Tkinter [2].
15.1 Basics
Nearly every GUI program we will write will contain the following three lines:
from tkinter import *
root = Tk()
mainloop()
The first line imports all of the GUI stuff from the tkinter module. The second line creates a
window on the screen, which we call root. The third line puts the program into what is essentially
a long-running while loop called the event loop. This loop runs, waiting for keypresses, button
clicks, etc., and it exits when the user closes the window.
Here is a working GUI program that converts temperatures from Celsius to Fahrenheit.
from tkinter import *

def calculate():
    temp = int(entry.get())
    temp = 9/5*temp+32
    output_label.configure(text='Converted: {:.1f}'.format(temp))
    entry.delete(0,END)

root = Tk()
message_label = Label(text='Enter a temperature',
                      font=('Verdana', 16))
output_label = Label(font=('Verdana', 16))
entry = Entry(font=('Verdana', 16), width=4)
calc_button = Button(text='Ok', font=('Verdana', 16),
                     command=calculate)
message_label.grid(row=0, column=0)
entry.grid(row=0, column=1)
calc_button.grid(row=0, column=2)
output_label.grid(row=1, column=0, columnspan=3)
mainloop()
15.2 Labels
A label is a place for your program to place some text on the screen. The following code creates a
label and places it on the screen.
hello_label = Label(text='hello')
hello_label.grid(row=0, column=0)
We call Label to create a new label. The capital L is required. Our label’s name is hello_label.
Once created, use the grid method to place the label on the screen. We will explain grid in the
next section.
Options There are a number of options you can change including font size and color. Here are
some examples:
hello_label = Label(text='hello', font=('Verdana', 24, 'bold'),
                    bg='blue', fg='white')
Note the use of keyword arguments. Here are a few common options:
• font — The basic structure is font= (font name, font size, style). You can leave out the font
size or the style. The choices for style are 'bold', 'italic', 'underline', 'overstrike',
'roman', and 'normal' (which is the default). You can combine multiple styles like this:
'bold italic'.
• fg and bg — These stand for foreground and background. Many common color names
can be used, like 'blue', 'green', etc. Section 16.2 describes how to get essentially any
color.
• width — This is how many characters long the label should be. If you leave this out, Tkinter
will base the width off of the text you put in the label. This can make for unpredictable results,
so it is good to decide ahead of time how long you want your label to be and set the width
accordingly.
• height — This is how many rows high the label should be. You can use this for multi-
line labels. Use newline characters in the text to get it to span multiple lines. For example,
text='hi\nthere'.
There are dozens more options. The aforementioned Introduction to Tkinter [2] has a nice list of the
others and what they do.
Changing label properties Later in your program, after you’ve created a label, you may want to
change something about it. To do that, use its configure method. Here are two examples that
change the properties of a label called label:
label.configure(text='Bye')
label.configure(bg='white', fg='black')
Setting text to something using the configure method is kind of like the GUI equivalent of a
print statement. However, in calls to configure we cannot use commas to separate multiple
things to print. We instead need to use string formatting. Here is a print statement and its
equivalent using the configure method.
print('a =', a, 'and b =', b)
label.configure(text='a = {} and b = {}'.format(a,b))
The configure method works with most of the other widgets we will see.
15.3 grid
The grid method is used to place things on the screen. It lays out the screen as a rectangular grid
of rows and columns. The first few rows and columns are shown below.
(row=0, column=0) (row=0, column=1) (row=0, column=2)
(row=1, column=0) (row=1, column=1) (row=1, column=2)
(row=2, column=0) (row=2, column=1) (row=2, column=2)
Spanning multiple rows or columns There are optional arguments, rowspan and columnspan,
that allow a widget to take up more than one row or column. Here is an example of several grid
statements followed by what the layout will look like:
label1.grid(row=0, column=0)
label2.grid(row=0, column=1)
label3.grid(row=1, column=0, columnspan=2)
label4.grid(row=1, column=2)
label5.grid(row=2, column=2)
label1              label2
label3 (spans both columns)             label4
                                        label5
Spacing To add extra space between widgets, there are optional arguments padx and pady.
Important note Any time you create a widget, to place it on the screen you need to use grid (or
one of its cousins, like pack, which we will talk about later). Otherwise it will not be visible.
15.4 Entry boxes
Entry boxes are a way for your GUI to get text input. The following example creates a simple entry
box and places it on the screen.
entry = Entry()
entry.grid(row=0, column=0)
Most of the same options that work with labels work with entry boxes (and most of the other
widgets we will talk about). The width option is particularly helpful because the entry box will
often be wider than you need.
• Getting text To get the text from an entry box, use its get method. This will return a string.
If you need numerical data, use eval (or int or float) on the string. Here is a simple
example that gets text from an entry box named entry.
string_value = entry.get()
num_value = eval(entry.get())
• Inserting text To insert text into an entry box, use the following:
entry.insert(0, 'hello')
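Because eval runs whatever expression the user types, a stricter conversion is sometimes preferable. The following is a minimal sketch of one alternative, not part of the original program. The helper name parse_number is hypothetical; in a GUI program it would be called on the string returned by entry.get().

```python
def parse_number(text):
    """Convert the string from an entry box to a number, or return None.

    parse_number is a hypothetical helper, not part of Tkinter.
    """
    text = text.strip()          # ignore stray spaces around the input
    try:
        return int(text)         # whole numbers stay ints
    except ValueError:
        pass
    try:
        return float(text)       # otherwise try a decimal number
    except ValueError:
        return None              # the text is not a number at all


# In a GUI program this would be: num_value = parse_number(entry.get())
print(parse_number(' 42 '))
print(parse_number('98.6'))
print(parse_number('hello'))
```

Returning None lets the callback decide what to do with bad input, for example showing a message in a label instead of crashing.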
15.5 Buttons
To get the button to do something when clicked, use the command argument. It is set to the name
of a function, called a callback function. When the button is clicked, the callback function is called.
Here is an example:
from tkinter import *

def callback():
    label.configure(text='Button clicked')

root = Tk()
label = Label(text='Not clicked')
button = Button(text='Click me', command=callback)
label.grid(row=0, column=0)
button.grid(row=1, column=0)
mainloop()
When the program starts, the label says Not clicked. When the button is clicked, the callback
function callback is called, which changes the label to say Button clicked.
lambda trick Sometimes we will want to pass information to the callback function, like if we
have several buttons that use the same callback function and we want to give the function infor-
mation about which button is being clicked. Here is an example where we create 26 buttons, one
for each letter of the alphabet. Rather than use 26 separate Button() statements and 26 different
functions, we use a list and one function.
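from tkinter import *

alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def callback(x):
    label.configure(text='Button {} clicked'.format(alphabet[x]))

root = Tk()
label = Label()
label.grid(row=1, column=0, columnspan=26)
buttons = [0]*26    # placeholders, replaced by buttons below
for i in range(26):
    buttons[i] = Button(text=alphabet[i],
                        command=lambda x=i: callback(x))
    buttons[i].grid(row=0, column=i)
mainloop()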
We note a few things about this program. First, we set buttons=[0]*26. This creates a list with 26
things in it. We don’t really care what those things are because they will be replaced with buttons.
An alternate way to create the list would be to set buttons=[] and use the append method.
We only use one callback function and it has one argument, which indicates which button was
clicked. As far as the lambda trick goes, without getting into the details, command=callback(i)
does not work, and that is why we resort to the lambda trick. You can read more about lambda in
Section 23.2. An alternate approach is to use classes.
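For readers who do want a glimpse of the details, here is a small sketch that runs without any GUI. Writing command=callback(i) calls callback immediately and hands its return value to the button, so nothing useful happens on a click. A plain lambda: callback(i) avoids that, but it looks up i only when it is finally called, after the loop has finished. The default argument x=i stores the current value of i when the lambda is created. The list names late and bound are made up for this illustration.

```python
def callback(x):
    return 'Button {} clicked'.format(x)

# Without a default argument, every lambda looks up i when it is
# finally called, by which time the loop has finished and i is 2.
late = [lambda: callback(i) for i in range(3)]

# With x=i, the current value of i is stored in each lambda
# at the moment the lambda is created.
bound = [lambda x=i: callback(x) for i in range(3)]

print([f() for f in late])
print([f() for f in bound])
```

The first list reports button 2 three times, while the second reports buttons 0, 1, and 2, which is the behavior we want from the 26 letter buttons.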
15.6 Global variables
Let’s say we want to keep track of how many times a button is clicked. An easy way to do this is to
use a global variable as shown below.
from tkinter import *

def callback():
    global num_clicks
    num_clicks = num_clicks + 1
    label.configure(text='Clicked {} times.'.format(num_clicks))

num_clicks = 0
root = Tk()
label = Label(text='Not clicked')
button = Button(text='Click me', command=callback)
label.grid(row=0, column=0)
button.grid(row=1, column=0)
mainloop()
We will be using a few global variables in our GUI programs. Using global variables unnecessarily,
especially in long programs, can cause difficult-to-find errors that make programs hard to maintain,
but in the short programs that we will be writing, we should be okay. Object-oriented programming
provides an alternative to global variables.
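As a rough illustration of the object-oriented alternative, the sketch below keeps the click count inside an object instead of a global variable. The names ClickCounter and main are hypothetical. tkinter is imported inside main only so that the counting logic can be tried on its own, without opening a window.

```python
class ClickCounter:
    """Keeps the click count as object state instead of a global variable."""

    def __init__(self):
        self.num_clicks = 0

    def click(self):
        """Record one click and return the text to show in the label."""
        self.num_clicks += 1
        return 'Clicked {} times.'.format(self.num_clicks)


def main():
    # Imported here so ClickCounter can be used without a display.
    import tkinter as tk

    counter = ClickCounter()
    root = tk.Tk()
    label = tk.Label(root, text='Not clicked')
    button = tk.Button(root, text='Click me',
                       command=lambda: label.configure(text=counter.click()))
    label.grid(row=0, column=0)
    button.grid(row=1, column=0)
    root.mainloop()

# Call main() to open the window.
```

No global statement is needed because the callback changes an attribute of counter rather than rebinding a module-level name.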
15.7 Tic-tac-toe
Using Tkinter, in only about 20 lines we can make a working tic-tac-toe program:
from tkinter import *

def callback(r,c):
    global player
    if player == 'X':
        b[r][c].configure(text='X')
        player = 'O'
    else:
        b[r][c].configure(text='O')
        player = 'X'

root = Tk()
b = [[0,0,0],
     [0,0,0],
     [0,0,0]]
for i in range(3):
    for j in range(3):
        b[i][j] = Button(font=('Verdana', 56), width=3, bg='yellow',
                         command=lambda r=i,c=j: callback(r,c))
        b[i][j].grid(row=i, column=j)
player = 'X'
mainloop()
The program works, though it does have a few problems, like letting you change a cell that already
has something in it. We will fix this shortly. First, let’s look at how the program does what it does.
Starting at the bottom, we have a variable player that keeps track of whose turn it is. Above that
we create the board, which consists of nine buttons stored in a two-dimensional list. We use the
lambda trick to pass the row and column of the clicked button to the callback function. In the
callback function we write an X or an O into the button that was clicked and change the value of
the global variable player.
Correcting the problems To correct the problem about being able to change a cell that already
has something in it, we need to have a way of knowing which cells have X’s, which have O’s, and
which are empty. One way is to use a Button method to ask the button what its text is. Another
way, which we will use here, is to create a new two-dimensional list, which we will call states, that
will keep track of things. Here is the code.
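from tkinter import *

def callback(r,c):
    global player
    # only allow a move into a cell that is still empty (0 in states)
    if player == 'X' and states[r][c] == 0:
        b[r][c].configure(text='X')
        states[r][c] = 'X'
        player = 'O'
    elif player == 'O' and states[r][c] == 0:
        b[r][c].configure(text='O')
        states[r][c] = 'O'
        player = 'X'

root = Tk()
states = [[0,0,0],
          [0,0,0],
          [0,0,0]]
b = [[0,0,0],
     [0,0,0],
     [0,0,0]]
for i in range(3):
    for j in range(3):
        b[i][j] = Button(font=('Verdana', 56), width=3, bg='yellow',
                         command=lambda r=i,c=j: callback(r,c))
        b[i][j].grid(row=i, column=j)
player = 'X'
mainloop()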
We have not added much to the program. Most of the new action happens in the callback function.
Every time someone clicks on a cell, we first check to see if it is empty (that the corresponding index
in states is 0), and if it is, we display an X or O on the screen and record the new value in states.
Many games have a variable like states that keeps track of what is on the board.
Checking for a winner We have a winner when there are three X’s or three O’s in a row, either
vertically, horizontally, or diagonally. To check if there are three in a row across the top row, we can
use the following if statement:
if states[0][0]==states[0][1]==states[0][2]!=0:
    stop_game = True
    b[0][0].configure(bg='grey')
    b[0][1].configure(bg='grey')
    b[0][2].configure(bg='grey')
This checks to see if each of the cells has the same nonzero entry. We are using the shortcut from
Section 10.3 here in the if statement. There are more verbose if statements that would work. If we
do find a winner, we highlight the winning cells and then set a global variable stop_game equal to
True. This variable will be used in the callback function. Whenever the variable is True we should
not allow any moves to take place.
Next, to check if there are three in a row across the middle row, change the first coordinate from 0
to 1 in all three references, and to check if there are three in a row across the bottom, change the 0’s
to 2’s. Since we will have three very similar if statements that only differ in one location, a for loop
can be used to keep the code short:
for i in range(3):
    if states[i][0]==states[i][1]==states[i][2]!=0:
        b[i][0].configure(bg='grey')
        b[i][1].configure(bg='grey')
        b[i][2].configure(bg='grey')
        stop_game = True
Next, checking for vertical winners is pretty much the same except we vary the second coordinate
instead of the first. Finally, we have two further if statements to take care of the diagonals. The full
program is at the end of this chapter. We have also added a few color options to the configure
statements to make the game look a little nicer.
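The eight checks (three rows, three columns, two diagonals) can also be collected into one function that works on the states list alone. The sketch below is one possible arrangement, not the version used in the full program. The name find_winning_line is hypothetical, and the function returns the winning cells so that the caller can grey out the matching buttons.

```python
def find_winning_line(states):
    """Return the three (row, column) cells of a winning line, or None.

    states is a 3x3 list holding 0 for an empty cell, or 'X' or 'O'.
    """
    lines = []
    for i in range(3):
        lines.append([(i, 0), (i, 1), (i, 2)])    # row i
        lines.append([(0, i), (1, i), (2, i)])    # column i
    lines.append([(0, 0), (1, 1), (2, 2)])        # main diagonal
    lines.append([(2, 0), (1, 1), (0, 2)])        # other diagonal
    for line in lines:
        a, b, c = [states[r][col] for r, col in line]
        if a == b == c != 0:                      # same nonzero entry
            return line
    return None


board = [['X', 'O', 0],
         ['X', 'O', 0],
         [0, 'O', 'X']]
print(find_winning_line(board))
```

In the game, the caller would loop over the returned cells, configure each button with bg='grey', and set stop_game to True.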
Further improvements From here it would be easy to add a restart button. The callback function
for that button should set stop_game back to False, set states back to all zeroes, and
configure all the buttons back to text='' and bg='yellow'.
To add a computer player would also not be too difficult, if you don’t mind it being a simple
computer player that moves randomly. That would take about 10 lines of code. Making an intelligent
computer player is also not too difficult. Such a computer player should look for two O’s or X’s in a
row in order to try to win or block, as well as avoid getting put into a no-win situation.
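As a sketch of the simple random player, the function below picks one of the empty cells in states. The name random_move is hypothetical, and how the move is then played (for instance by calling callback with the chosen row and column) is left as an assumption about the surrounding program.

```python
import random

def random_move(states):
    """Pick a random empty cell as (row, column), or None if the board is full."""
    empty = [(r, c) for r in range(3) for c in range(3) if states[r][c] == 0]
    if not empty:
        return None
    return random.choice(empty)


board = [['X', 'O', 'X'],
         ['O', 'X', 'O'],
         ['O', 0, 'X']]
print(random_move(board))   # only one empty cell is left, so this is (2, 1)
```

Because the function only reads states, it can be tried out without creating any buttons.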
from tkinter import *

def callback(r,c):
    global player
    # a move is allowed only in an empty cell while the game is still going
    # (the fg and bg colors below are illustrative choices)
    if player == 'X' and states[r][c] == 0 and not stop_game:
        b[r][c].configure(text='X', fg='blue', bg='white')
        states[r][c] = 'X'
        player = 'O'
    elif player == 'O' and states[r][c] == 0 and not stop_game:
        b[r][c].configure(text='O', fg='orange', bg='black')
        states[r][c] = 'O'
        player = 'X'
    check_for_winner()

def check_for_winner():
    global stop_game
    for i in range(3):
        if states[i][0]==states[i][1]==states[i][2]!=0:
            b[i][0].configure(bg='grey')
            b[i][1].configure(bg='grey')
            b[i][2].configure(bg='grey')
            stop_game = True
    for i in range(3):
        if states[0][i]==states[1][i]==states[2][i]!=0:
            b[0][i].configure(bg='grey')
            b[1][i].configure(bg='grey')
            b[2][i].configure(bg='grey')
            stop_game = True
    if states[0][0]==states[1][1]==states[2][2]!=0:
        b[0][0].configure(bg='grey')
        b[1][1].configure(bg='grey')
        b[2][2].configure(bg='grey')
        stop_game = True
    if states[2][0]==states[1][1]==states[0][2]!=0:
        b[2][0].configure(bg='grey')
        b[1][1].configure(bg='grey')
        b[0][2].configure(bg='grey')
        stop_game = True

root = Tk()
b = [[0,0,0],
     [0,0,0],
     [0,0,0]]
states = [[0,0,0],
          [0,0,0],
          [0,0,0]]
for i in range(3):
    for j in range(3):
        b[i][j] = Button(font=('Verdana', 56), width=3, bg='yellow',
                         command=lambda r=i,c=j: callback(r,c))
        b[i][j].grid(row=i, column=j)
player = 'X'
stop_game = False
mainloop()
Chapter 16
GUI Programming II
16.1 Frames
Let’s say we want 26 small buttons across the top of the screen, and a big Ok button below them,
like below:
root = Tk()
mainloop()
The problem is with column 0. There are two widgets there, the A button and the Ok button, and
Tkinter will make that column big enough to handle the larger widget, the Ok button. One solution
to this problem is shown below:
ok_button.grid(row=1, column=0, columnspan=26)
Another solution to this problem is to use what is called a frame. The frame’s job is to hold other
widgets and essentially combine them into one large widget. In this case, we will create a frame to
group all of the letter buttons into one large widget. The code is shown below:
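from tkinter import *

alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
root = Tk()
button_frame = Frame()
buttons = [0]*26
for i in range(26):
    buttons[i] = Button(button_frame, text=alphabet[i])
    buttons[i].grid(row=0, column=i)
# a large font makes the Ok button big (the size is an illustrative choice)
ok_button = Button(text='Ok', font=('Verdana', 24))
button_frame.grid(row=0, column=0)
ok_button.grid(row=1, column=0)
mainloop()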
To create a frame, we use Frame() and give it a name. Then, for any widgets we want to include in
the frame, we include the name of the frame as the first argument in the widget’s declaration. We
still have to grid the widgets, but now the rows and columns will be relative to the frame. Finally,
we have to grid the frame itself.
16.2 Colors
Tkinter defines many common color names, like 'yellow' and 'red'. It also provides a way to
get access to millions more colors. We first have to understand how colors are displayed on the
screen.
Each color is broken into three components: a red, a green, and a blue component. Each
component can have a value from 0 to 255, with 255 being the full amount of that color. Equal parts of red
and green create shades of yellow, equal parts of red and blue create shades of purple, and equal
parts of blue and green create shades of turquoise. Equal parts of all three create shades of gray.
Black is when all three components have values of 0 and white is when all three components have
values of 255. Varying the values of the components can produce up to 256^3 ≈ 16.8 million colors.
There are a number of resources on the web that allow you to vary the amounts of the components
and see what color is produced.
To use colors in Tkinter is easy, but with one catch—component values are given in hexadecimal.
Hexadecimal is a base 16 number system, where the letters A-F are used to represent the digits 10
through 15. It was widely used in the early days of computing, and it is still used here and there.
Here is a table comparing the two number bases:
Decimal  Hex    Decimal  Hex    Decimal  Hex    Decimal  Hex
0        0      8        8      16       10     80       50
1        1      9        9      17       11     100      64
2        2      10       A      18       12     128      80
3        3      11       B      31       1F     160      A0
4        4      12       C      32       20     200      C8
5        5      13       D      33       21     254      FE
6        6      14       E      48       30     255      FF
7        7      15       F      64       40     256      100
Because the color component values run from 0 to 255, they will run from 0 to FF in hexadeci-
mal, and thus are described by two hex digits. A typical color in Tkinter is specified like this:
'#A202FF'. The color name is prefaced with a pound sign. Then the first two digits are the red
component (in this case A2, which is 162 in decimal). The next two digits specify the green compo-
nent (here 02, which is 2 in decimal), and the last two digits specify the blue component (here FF,
which is 255 in decimal). This color turns out to be a bluish violet. Here is an example of it in use:
label = Label(text='Hi', bg='#A202FF')
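To check the arithmetic above, a color string can be split back into its decimal components using only built-in Python. This is a small sketch, and the function name hex_to_rgb is made up for the illustration.

```python
def hex_to_rgb(color):
    """Split a string like '#A202FF' into decimal (red, green, blue)."""
    digits = color.lstrip('#')
    # int(..., 16) interprets its string argument as a base 16 number
    return tuple(int(digits[i:i+2], 16) for i in (0, 2, 4))

print(hex_to_rgb('#A202FF'))   # (162, 2, 255)
```

The result matches the values worked out above: A2 is 10*16 + 2 = 162, 02 is 2, and FF is 15*16 + 15 = 255.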
If you would rather not bother with hexadecimal, you can use the following function which will
convert percentages into the hex string that Tkinter uses.
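def color_convert(r, g, b):
    # round rather than int: 100*2.55 is slightly below 255 in floating
    # point, so int would turn 100 percent into 254 (FE) instead of 255 (FF)
    return '#{:02x}{:02x}{:02x}'.format(round(r*2.55), round(g*2.55),
                                        round(b*2.55))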
Here is an example of it to create a background color that has 100% of the red component, 85% of
green and 80% of blue.
label = Label(text='Hi', bg=color_convert(100, 85, 80))
16.3 Images
Labels and buttons (and other widgets) can display images instead of text.
To use an image requires a little set-up work. We first have to create a PhotoImage object and give
it a name. Here is an example:
cheetah_image = PhotoImage(file='cheetahs.gif')
File types One unfortunate limitation of Tkinter is that it can read only a few image file types
on its own: GIF, and also PNG when Tkinter is built on Tk 8.6 or later. If you would like to use
other types of files, one solution is to use the Python Imaging Library,
which will be covered in Section 18.2.
16.4 Canvases
A canvas is a widget on which you can draw things like lines, circles, and rectangles. You can also draw
text, images, and other widgets on it. It is a very versatile widget, though we will only describe the
basics here.
Creating canvases The following line creates a canvas with a white background that is 200 × 200
pixels in size:
canvas = Canvas(width=200, height=200, bg='white')
Rectangles The following line draws a red rectangle on the canvas:
canvas.create_rectangle(20,100,30,150, fill='red')
See the image below on the left. The first four arguments specify the coordinates of where to place
the rectangle on the canvas. The upper left corner of the canvas is the origin, (0, 0). The upper left
of the rectangle is at (20, 100), and the lower right is at (30, 150). If we were to leave off fill='red',
the result would be a rectangle with a black outline.
Ovals and lines Drawing ovals and lines is similar. The image above on the right is created with
the following code:
canvas.create_rectangle(20,100,70,180)
canvas.create_oval(20,100,70,180, fill='blue')
canvas.create_line(20,100,70,180, fill='green')
The rectangle is here to show that lines and ovals work similarly to rectangles. The first two coor-
dinates are the upper left and the second two are the lower right.
To get a circle with radius r and center (x,y), we can create the following function:
def create_circle(x,y,r):
    canvas.create_oval(x-r,y-r,x+r,y+r)
The two coordinates are where the center of the image should be.
Naming things, changing them, moving them, and deleting them We can give names to the
things we put on the canvas. We can then use the name to refer to the object in case we want to
move it or remove it from the canvas. Here is an example where we create a rectangle, change its
color, move it, and then delete it:
rect = canvas.create_rectangle(0,0,20,20)
canvas.itemconfigure(rect, fill='red')
canvas.coords(rect,40,40,60,60)
canvas.delete(rect)
The coords method is used to move or resize an object and the delete method is used to delete
it. If you want to delete everything from the canvas, use the following:
canvas.delete(ALL)
16.5 Check buttons and radio buttons
In the image below, the top line shows a check button and the bottom line shows a radio button.
Check buttons The code for the above check button is:
show_totals = IntVar()
check = Checkbutton(text='Show totals', var=show_totals)
The one thing to note here is that we have to tie the check button to a variable, and it can’t be
just any variable, it has to be a special kind of Tkinter variable, called an IntVar. This variable,
show_totals, will be 0 when the check button is unchecked and 1 when it is checked. To access
the value of the variable, you need to use its get method, like this:
show_totals.get()
You can also set the value of the variable using its set method. This will automatically check or
uncheck the check button on the screen. For instance, if you want the above check button checked
at the start of the program, do the following:
show_totals = IntVar()
show_totals.set(1)
check = Checkbutton(text='Show totals', var=show_totals)
Radio buttons Radio buttons work similarly. The code for the radio buttons shown at the start of
the section is:
color = IntVar()
redbutton = Radiobutton(text='Red', var=color, value=1)
greenbutton = Radiobutton(text='Green', var=color, value=2)
bluebutton = Radiobutton(text='Blue', var=color, value=3)
The value of the IntVar object color will be 1, 2, or 3, depending on whether the left, middle, or
right button is selected. These values are controlled by the value option, specified when we create
the radio buttons.
Commands Both check buttons and radio buttons have a command option, where you can set a
callback function to run whenever the button is selected or unselected.
16.6 Text widget
The Text widget is a bigger, more powerful version of the Entry widget. Here is an example of
creating one:
textbox = Text(font=('Verdana', 16), height=6, width=40)
The widget will be 40 characters wide and 6 rows tall. You can still type past the sixth row; the
widget will just display only six rows at a time, and you can use the arrow keys to scroll.
If you want a scrollbar associated with the text box you can use the ScrolledText widget. Other
than the scrollbar, ScrolledText works more or less the same as Text. An example of what it
looks like is shown below. To use the ScrolledText widget, you will need the following import:
from tkinter.scrolledtext import ScrolledText
Statement                      Description
textbox.get(1.0,END)           returns the contents of the text box
textbox.delete(1.0,END)        deletes everything in the text box
textbox.insert(END,'Hello')    inserts text at the end of the text box
One nice option when declaring the Text widget is undo=True, which allows Ctrl+Z and Ctrl+Y
to undo and redo edits. There are a ton of other things you can do with the Text widget. It is
almost like a miniature word processor.
16.7 Scale widget
A Scale is a widget that you can slide back and forth to select different values. An example is
shown below, followed by the code that creates it.
Option            Description
from_             minimum value possible by dragging the scale
to                maximum value possible by dragging the scale
length            how many pixels long the scale is
label             specify a label for the scale
showvalue='NO'    gets rid of the number that displays above the scale
tickinterval=1    displays tickmarks at every unit (1 can be changed)
The trailing underscore in from_ is needed because from is a reserved word in Python; to has no underscore.
There are several ways for your program to interact with the scale. One way is to link it with
an IntVar just like with check buttons and radio buttons, using the variable option. Another
option is to use the scale’s get and set methods. A third way is to use the command option, which
Often we will want our programs to do something if the user presses a certain key, drags something
on a canvas, uses the mouse wheel, etc. These things are called events.
A simple example The first GUI program we looked at back in Section 15.1 was a simple temper-
ature converter. Anytime we wanted to convert a temperature we would type in the temperature
in the entry box and click the Ok button. It would be nice if the user could just press the
enter key after they type the temperature instead of having to click the Ok button. We can
accomplish this by adding one line to the program:
entry.bind('<Return>', lambda event: calculate())
This line should go right after you declare the entry box. What it does is it takes the event that the
enter (return) key is pressed and binds it to the calculate function.
Well, sort of. The function you bind the event to is supposed to be able to receive a copy of an Event
object, but the calculate function that we had previously written takes no arguments. Rather than
rewrite the function, the line above uses the lambda trick to essentially throw away the Event object.
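The idea of throwing away the Event object can be tried without a window. In the sketch below, calculate is a stand-in for the converter's callback and FakeEvent is a hypothetical placeholder for the object Tkinter would pass. In the real program the wrapper would be handed to bind, as in entry.bind('<Return>', handler).

```python
def calculate():
    """Stand-in for the converter's callback; it takes no arguments."""
    return 'calculated'

# A bound function is called with one argument, the Event object.
# This wrapper accepts that argument and ignores it.
handler = lambda event: calculate()


class FakeEvent:
    """Hypothetical placeholder for the Event object Tkinter would pass."""
    keysym = 'Return'


print(handler(FakeEvent()))
```

An equivalent option is to give calculate an optional parameter, such as def calculate(event=None), so the same function can serve both the button and the key binding.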
Event                Description
<Button-1>           The left mouse button is clicked.
<Double-Button-1>    The left mouse button is double-clicked.
<ButtonRelease-1>    The left mouse button is released.
<B1-Motion>          A click-and-drag with the left mouse button.
<MouseWheel>         The mouse wheel is moved.
<Motion>             The mouse is moved.
<Enter>              The mouse is now over the widget.
<Leave>              The mouse has now left the widget.
<Key>                A key is pressed.
<key name>           The key with the given name is pressed (for example, <Return>).
For all of the mouse button examples, the number 1 can be replaced with other numbers. Button 2
is the middle button and button 3 is the right button.
Attribute    Description
keysym       The name of the key that was pressed
x, y         The coordinates of the mouse pointer
delta        The value of the mouse wheel
Key events For key events, you can either have specific callbacks for different keys or catch all
keypresses and deal with them in the same callback. Here is an example of the latter:
from tkinter import *

def callback(event):
    print(event.keysym)

root = Tk()
root.bind('<Key>', callback)
mainloop()
The above program prints out the names of the keys that were pressed. You can use those names
in if statements to handle several different keypresses in the callback function, like below:
if event.keysym == 'percent':
    # percent (shift+5) was pressed, do something about it...
    pass
elif event.keysym == 'a':
    # lowercase a was pressed, do something about it...
    pass
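When the chain of if statements grows long, one alternative is a dictionary that maps key names to handler functions. The sketch below is only an illustration: the handler names are made up, and SimpleNamespace stands in for the Event object so the code can run without a window.

```python
from types import SimpleNamespace

def on_percent():
    return 'percent pressed'

def on_a():
    return 'lowercase a pressed'

# Map each key name to the function that handles it.
handlers = {'percent': on_percent, 'a': on_a}

def callback(event):
    handler = handlers.get(event.keysym)
    if handler is None:
        return None          # a key we do not care about
    return handler()


# SimpleNamespace stands in for the Event object Tkinter would pass.
print(callback(SimpleNamespace(keysym='percent')))
print(callback(SimpleNamespace(keysym='F1')))
```

In a real program this callback would still be attached with root.bind('<Key>', callback).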
Use the single callback method if you are catching a lot of keypresses and are doing something
similar with all of them. On the other hand, if you just want to catch a couple of specific keypresses
or if certain keys have very long and specific callbacks, you can catch keypresses separately like
below:
from tkinter import *

def callback1(event):
    print('You pressed the enter key.')

def callback2(event):
    print('You pressed the up arrow.')

root = Tk()
root.bind('<Return>', callback1)
root.bind('<Up>', callback2)
mainloop()
The key names are the same as the names stored in the keysym attribute. You can use the program
from earlier in this section to find the names of all the keys. Here are the names for a few common
keys:
The exceptions are the spacebar (<space>) and the less than sign (<less>). You can also catch key
combinations, such as <Shift-F5>, <Control-Next>, <Alt-2>, or <Control-Shift-F1>.
Note These examples all bind keypresses to root, which is our name for the main window. You
can also bind keypresses to specific widgets. For instance, if you only want the left arrow key to
work on a Canvas called canvas, you could use the following:
canvas.bind('<Left>', callback)
One trick here, though, is that the canvas won’t recognize the keypress unless it has the GUI’s focus.
This can be done as below:
canvas.focus_set()
16.9 Event examples
Example 1 Here is an example where the user can move a rectangle with the left or right arrow
keys.
from tkinter import *

def callback(event):
    global move
    if event.keysym=='Right':
        move += 1
    elif event.keysym=='Left':
        move -= 1
    canvas.coords(rect,50+move,50,100+move,100)

root = Tk()
root.bind('<Key>', callback)
canvas = Canvas(width=200,height=200)
canvas.grid(row=0,column=0)
rect = canvas.create_rectangle(50,50,100,100,fill='blue')
move = 0
mainloop()
Example 2 Here is an example program demonstrating mouse events. The program starts by
drawing a rectangle to the screen. The user can do the following:
from tkinter import *

def mouse_motion_event(event):
    label.configure(text='({}, {})'.format(event.x, event.y))

def wheel_event(event):
    global x1, x2, y1, y2
    if event.delta>0:
        diff = 1
    elif event.delta<0:
        diff = -1
    else:
        diff = 0
    x1+=diff
    x2-=diff
    y1+=diff
    y2-=diff
    canvas.coords(rect,x1,y1,x2,y2)

def b1_event(event):
    global color
    if not b1_drag:
        color = 'Red' if color=='Blue' else 'Blue'
        canvas.itemconfigure(rect, fill=color)

def b1_motion_event(event):
    global b1_drag, x1, x2, y1, y2, mouse_x, mouse_y
    x = event.x
    y = event.y
    if not b1_drag:
        mouse_x = x
        mouse_y = y
        b1_drag = True
        return
    x1+=(x-mouse_x)
    x2+=(x-mouse_x)
    y1+=(y-mouse_y)
    y2+=(y-mouse_y)
    canvas.coords(rect,x1,y1,x2,y2)
    mouse_x = x
    mouse_y = y

def b1_release_event(event):
    global b1_drag
    b1_drag = False

root=Tk()
label = Label()
canvas = Canvas(width=200, height=200)
# connect each event to the function that handles it
canvas.bind('<Motion>', mouse_motion_event)
canvas.bind('<Button-1>', b1_event)
canvas.bind('<B1-Motion>', b1_motion_event)
canvas.bind('<ButtonRelease-1>', b1_release_event)
canvas.bind('<MouseWheel>', wheel_event)
canvas.focus_set()
canvas.grid(row=0, column=0)
label.grid(row=1, column=0)
mouse_x = 0
mouse_y = 0
b1_drag = False
x1 = y1 = 50
x2 = y2 = 100
color = 'Blue'
rect = canvas.create_rectangle(x1,y1,x2,y2,fill=color)
mainloop()
1. First, every time the mouse is moved over the canvas, the mouse_motion_event function is
called. This function displays the mouse’s current coordinates, which are contained in the Event
attributes x and y.
2. The wheel_event function is called whenever the user uses the mouse (scrolling) wheel.
The Event attribute delta contains information about how quickly and in what direction
the wheel was moved. We just stretch or shrink the rectangle based on whether the wheel
was moved forward or backward.
3. The b1_event function is called whenever the user presses the left mouse button. The func-
tion changes the color of the rectangle whenever the rectangle is clicked. There is a global
variable here called b1_drag that is important. It is set to True whenever the user is dragging
the rectangle. When dragging is going on, the left mouse button is down and the b1_event
function is continuously being called. We don’t want to keep changing the color of the rect-
angle in that case, hence the if statement.
5. The focus_set method is needed because the canvas will not recognize the mouse wheel
events unless the focus is on the canvas.
6. One problem with this program is that the user can modify the rectangle by clicking anywhere
on the canvas, not just on the rectangle itself. If we only want the changes to happen when the
mouse is over the rectangle, we could specifically bind the rectangle instead of the whole
canvas, like below:
canvas.tag_bind(rect, '<B1-Motion>', b1_motion_event)
7. Finally, the use of global variables here is a little messy. If this were part of a larger project, it
might make sense to wrap all of this up into a class.
Chapter 17
GUI Programming III
The GUI window that Tkinter creates says Tk by default. Here is how to change it:
root.title('Your title')
Sometimes you want to disable a button so it can’t be clicked. Buttons have an attribute state that
allows you to disable the widget. Use state=DISABLED to disable the button and state=NORMAL
to enable it. Here is an example that creates a button that starts out disabled and then enables it:
button = Button(text='Hi', state=DISABLED, command=function)
button.configure(state=NORMAL)
You can use the state attribute to disable many other types of widgets, too.
Sometimes, you need to know things about a widget, like exactly what text is in it or what its
background color is. The cget method is used for this. For example, the following gets the text of
a label called label:
label.cget('text')
This can be used with buttons, canvases, etc., and it can be used with any of their properties, like
bg, fg, state, etc. As a shortcut, Tkinter overrides the [] operators, so that label['text']
accomplishes the same thing as the example above.
Message boxes are windows that pop up to ask you a question or say something and then go away.
To use them, we need an import statement:
from tkinter.messagebox import *
There are a variety of different types of message boxes. For each of them you can specify the
message the user will see as well as the title of the message box. Here are three types of message
boxes, followed by the code that generates them:
Below is a list of all the types of message boxes. Each displays a message in its own way.
To get rid of a widget, use its destroy method. For instance, to get rid of a button called button,
do the following:
button.destroy()
Stopping a window from being closed When your user tries to close the main window, you may
want to do something, like ask them if they really want to quit. Here is a way to do that:
from tkinter import *
from tkinter.messagebox import askquestion
def quitter_function():
    answer = askquestion(title='Quit?', message='Really quit?')
    if answer=='yes':
        root.destroy()
root = Tk()
root.protocol('WM_DELETE_WINDOW', quitter_function)
mainloop()
The key is the following line, which causes quitter_function to be called whenever the user tries
to close the window.
root.protocol('WM_DELETE_WINDOW', quitter_function)
17.6 Updating
Tkinter updates the screen every so often, but sometimes that is not often enough. For instance, in
a function triggered by a button press, Tkinter will not update the screen until the function is done.
If, in that function, you want to change something on the screen, pause for a short while, and then
change something else, you will need to tell Tkinter to update the screen before the pause. To do
that, just use this:
root.update()
If you only want to update a certain widget, and nothing else, you can use the update method of
that widget. For example,
canvas.update()
A related thing that is occasionally useful is to have something happen after a scheduled time
interval. For instance, you might have a timer in your program. For this, you can use the after
method. Its first argument is the time in milliseconds to wait before updating and the second
argument is the function to call when the time is right. Here is an example that implements a timer:
from tkinter import *
from time import time

def update_timer():
    # seconds remaining in a 90-second countdown
    time_left = int(90 - (time()-start))
    minutes = time_left // 60
    seconds = time_left % 60
    time_label.configure(text='{}:{:02d}'.format(minutes, seconds))
    # schedule the next update 100 milliseconds from now
    root.after(100, update_timer)

root = Tk()
time_label = Label()
time_label.grid(row=0, column=0)
start = time()
update_timer()
mainloop()
This example uses the time module, which is covered in Section 20.2.
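The minutes-and-seconds arithmetic inside update_timer can be tried on its own, without opening a window. The sketch below isolates it in a helper called format_time_left, which is a hypothetical name and not part of the program above; it assumes the same 90-second countdown.

```python
def format_time_left(time_left):
    """Format a whole number of seconds as minutes:seconds."""
    minutes = time_left // 60
    seconds = time_left % 60
    # {:02d} pads the seconds with a leading zero when needed
    return '{}:{:02d}'.format(minutes, seconds)

# a 90-second countdown, sampled at a few elapsed times
for elapsed in (0, 25, 85):
    print(format_time_left(90 - elapsed))
```

Note that the timer above keeps rescheduling itself after the countdown passes zero; a real program would probably stop calling after once time_left reaches 0.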
17.7 Dialogs
Many programs have dialog boxes that allow the user to pick a file to open or to save a file. To use
them in Tkinter, we need the following import statement:
from tkinter.filedialog import *
Tkinter dialogs usually look like the ones that are native to the operating system.
Dialog              Description
askopenfilename     Opens a typical file chooser dialog
askopenfilenames    Like previous, but user can pick more than one file
asksaveasfilename   Opens a typical file save dialog
askdirectory        Opens a directory chooser dialog
The return value of askopenfilename and asksaveasfilename is the name of the file selected.
If the user cancels the dialog without picking a file, the return value is empty (typically an
empty string). The return value of askopenfilenames is a sequence of file names, which is empty
if no files are selected. The askdirectory function returns the name of the directory chosen.
There are some options you can pass to these functions. You can set initialdir to the directory
you want the dialog to start in. You can also specify the file types. Here is an example:
filename=askopenfilename(initialdir='c:\\python31\\',
                         filetypes=[('Image files', '.jpg .png .gif'),
                                    ('All files', '*')])
A short example Here is an example that opens a file dialog that allows you to select a text file.
The program then displays the contents of the file in a textbox.
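from tkinter import *
from tkinter.filedialog import *
from tkinter.scrolledtext import ScrolledText
root = Tk()
textbox = ScrolledText()
textbox.grid()
# ask for a text file and show its contents in the textbox
filename = askopenfilename(filetypes=[('Text files', '.txt'),
                                      ('All files', '*')])
if filename:    # empty if the user cancels the dialog
    with open(filename) as f:
        textbox.insert(END, f.read())
mainloop()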
We can create a menu bar, like the one below, across the top of a window.
Here is an example that uses some of the dialogs from the previous section:
from tkinter import *
from tkinter.filedialog import *
def open_callback():
    filename = askopenfilename()
    # add code here to do something with filename
def saveas_callback():
    filename = asksaveasfilename()
    # add code here to do something with filename
root = Tk()
menu = Menu()
root.config(menu=menu)
file_menu = Menu(menu, tearoff=0)
file_menu.add_command(label='Open', command=open_callback)
file_menu.add_command(label='Save as', command=saveas_callback)
file_menu.add_separator()
file_menu.add_command(label='Exit', command=root.destroy)
menu.add_cascade(label='File', menu=file_menu)
mainloop()
You can add widgets to the new window. The first argument when you create the widget needs to
be the name of the window, like below:
new_window = Toplevel()
label = Label(new_window, text='Hi')
label.grid(row=0, column=0)
17.10 pack
There is an alternative to grid called pack. It is not as versatile as grid, but there are some places
where it is useful. It uses an argument called side, which allows you to specify four locations for
your widgets: TOP, BOTTOM, LEFT, and RIGHT. There are two useful optional arguments, fill and
expand. Here is an example.
button1=Button(text='Hi')
button1.pack(side=TOP, fill=X)
button2=Button(text='Hi')
button2.pack(side=BOTTOM)
The fill option causes the widget to fill up the available space given to it. It can be either X, Y or
BOTH. The expand option is used to allow the widget to expand when its window is resized. To
enable it, use expand=YES.
Note You can use pack for some frames, and grid for others; just don’t mix pack and grid
within the same frame, or Tkinter won’t know quite what to do.
17.11 StringVar
In Section 16.5 we saw how to tie a Tkinter variable, called an IntVar, to a check button or a radio
button. Tkinter has another type of variable called a StringVar that holds strings. This type of
variable can be used to change the text in a label or a button or in some other widgets. We already
know how to change text using the configure method, and a StringVar provides another way
to do it.
To tie a widget to a StringVar, use the textvariable option of the widget. A StringVar has
get and set methods, just like an IntVar, and whenever you set the variable, any widgets that
are tied to it are automatically updated.
Here is a simple example that ties two labels to the same StringVar. There is also a button that
when clicked will alternate the value of the StringVar (and hence the text in the labels).
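from tkinter import *

def callback():
    global count
    s.set('Goodbye' if count%2==0 else 'Hello')
    count +=1

root = Tk()
count = 0
s = StringVar()
s.set('Hello')
# both labels are tied to s, so setting s updates both at once
label1 = Label(textvariable=s, width=10)
label2 = Label(textvariable=s, width=10)
button = Button(text='Click me', command=callback)
label1.grid(row=0, column=0)
label2.grid(row=0, column=1)
button.grid(row=1, column=0)
mainloop()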
We have left out quite a lot about Tkinter. See Lundh’s Introduction to Tkinter [2] for more. Tkinter
is versatile and simple to work with, but if you need something more powerful, there are other
third-party GUIs for Python.
Chapter 18
As of this writing, the most recent version of Python is 3.2, and all the code in this book is designed
to run in Python 3.2. The tricky thing is that as of version 3.0, Python broke compatibility with
older versions of Python. Code written in those older versions will not always work in Python 3.
The problem with this is there were a number of useful libraries written for Python 2 that, as of this
writing, have not yet been ported to Python 3. We want to use these libraries, so we will have to
learn a little about Python 2. Fortunately, there are only a few big differences that we have to worry
about.
Division The division operator, /, in Python 2, when used with integers, behaves like //. For
instance, 5/4 in Python 2 evaluates to 1, whereas 5/4 in Python 3 evaluates to 1.25. Integer
division of this kind is how the division operator behaves in a number of other programming
languages. In Python 3, the decision was made to make the division operator behave the way we
are used to from math.
In Python 2, if you want to get 1.25 by dividing 5 and 4, you need to do 5/4.0. At least one of the
arguments has to be a float in order for the result to be a float. If you are dividing two variables,
then instead of x/y, you may need to do x/float(y).
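As a quick check of the Python 3 side of this, the following lines can be run as they are. This is only an illustrative sketch; the negative example is an addition not discussed above, included because floor division rounds down rather than toward zero.

```python
# Python 3 behavior
print(5 / 4)          # true division gives a float
print(5 // 4)         # floor division, which is how / treats integers in Python 2
print(-7 // 2)        # floor division rounds down, not toward zero
print(5 / float(4))   # the Python 2 workaround is harmless in Python 3
```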
print The print function in Python 3 was actually the print statement in Python 2. So in
Python 2, you would write something like
print 'Hello'
without any parentheses. This code will no longer work in Python 3 because the print statement
is now the print function, and functions need parentheses. Also, the current print function has
those useful optional arguments, sep and end, that are not available in Python 2.
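For reference, here is a small Python 3 sketch of the sep and end arguments. The file argument used in the second half is also a standard print argument, though it is not discussed in the text; it is included only to show that print is now an ordinary function.

```python
import io

print('a', 'b', 'c', sep='-')       # items are joined with - instead of spaces
print('no newline here', end='')    # the next print continues on the same line
print(' ...continued')

# because print is a function, its output can be sent to another stream
buffer = io.StringIO()
print(1, 2, 3, sep=', ', end='!\n', file=buffer)
captured = buffer.getvalue()
print(repr(captured))
```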
range The range function can be inefficient with very large ranges in Python 2. The reason is
that in Python 2, if you use range(10000000), Python will create a list of 10 million numbers.
The range function in Python 3 is more efficient and instead of generating all 10 million things
at once, it only generates the numbers as it needs them. The Python 2 function that acts like the
Python 3 range is xrange.
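The following Python 3 sketch shows that a large range can be created, measured, and indexed without building a list. The size threshold of 1000 bytes is an arbitrary figure chosen for illustration.

```python
import sys

r = range(10000000)              # no list of 10 million numbers is built here
print(len(r), r[5], r[-1])       # length and indexing still work
print(sys.getsizeof(r) < 1000)   # the range object itself stays small

# slicing a range gives another range, so this is cheap too
first_three = list(range(10000000)[:3])
print(first_three)
```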
String formatting String formatting in Python 2 is a little different than in Python 3. When using
the formatting codes inside curly braces, in Python 2, you need to specify an argument number.
Compare the examples below:
Python 2: 'x={0:3d},y={1:3d},z={2:3d}'.format(x,y,z)
Python 3: 'x={:3d},y={:3d},z={:3d}'.format(x,y,z)
There is also an older style of formatting that you may see from time to time that uses the % operator.
An example is shown below along with the corresponding new style.
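The original example is not reproduced here, so the following is a stand-in sketch with made-up values for x, y, and z. It formats the same three integers with the % operator and with format.

```python
x, y, z = 1, 22, 333

# older style: formatting codes start with %, values follow the % operator
old_style = 'x=%3d,y=%3d,z=%3d' % (x, y, z)

# newer style: formatting codes go inside curly braces
new_style = 'x={:3d},y={:3d},z={:3d}'.format(x, y, z)

print(old_style)
print(new_style)
```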
Module names Some modules were renamed and reorganized. Here are a few Tkinter name
changes:
Python 2        Python 3
Tkinter         tkinter
ScrolledText    tkinter.scrolledtext
tkMessageBox    tkinter.messagebox
tkFileDialog    tkinter.filedialog
There are a number of other modules we’ll see later that were renamed, mostly just changed to
lowercase. For instance, Queue in Python 2 is now queue in Python 3.
Other changes There are quite a few other changes in the language, but most of them are with
features more advanced than we consider here.
Importing future behavior The following import allows us to use Python 3’s division behavior
in Python 2.
from __future__ import division
There are many other things you can import from the future.
The Python Imaging Library (PIL) contains useful tools for working with images. As of this writing,
the PIL is only available for Python 2.7 or earlier. The PIL is not part of the standard Python
distribution, so you’ll have to download and install it separately. It’s easy to install, though.
PIL hasn’t been maintained since 2009, but there is a project called Pillow that is nearly compatible
with PIL and works in Python 3.0 and later.
We will cover just a few features of the PIL here. A good reference is The Python Imaging Library
Handbook.
Using images other than GIFs with Tkinter Tkinter, as we’ve seen, can’t use JPEGs and PNGs.
But it can if we use it in conjunction with the PIL. Here is a simple example:
from Tkinter import *
from PIL import Image, ImageTk
root = Tk()
cheetah_image = ImageTk.PhotoImage(Image.open('cheetah.jpg'))
button = Button(image=cheetah_image)
button.grid(row=0, column=0)
mainloop()
The first line imports Tkinter. Remember that in Python 2 it’s an uppercase Tkinter. The next
line imports a few things from the PIL. Next, where we would have used Tkinter’s PhotoImage to
load an image, we instead use a combination of two PIL functions. We can then use the image like
normal in our widgets.
Images PIL is the Python Imaging Library, and so it contains a lot of facilities for working with
images. We will just show a simple example here. The program below displays a photo on a canvas
and when the user clicks a button, the image is converted to grayscale.
from Tkinter import *              # in Python 3 with Pillow: from tkinter import *
from PIL import Image, ImageTk

def change():
    global image, photo
    pix = image.load()
    # replace each pixel by the average of its red, green, and blue values
    for i in range(photo.width()):
        for j in range(photo.height()):
            red,green,blue = pix[i,j]
            avg = (red+green+blue)//3
            pix[i,j] = (avg, avg, avg)
    photo=ImageTk.PhotoImage(image)
    canvas.create_image(0,0,image=photo,anchor=NW)

def load_file(filename):
    global image, photo
    image=Image.open(filename).convert('RGB')
    photo=ImageTk.PhotoImage(image)
    canvas.configure(width=photo.width(), height=photo.height())
    canvas.create_image(0,0,image=photo,anchor=NW)
    root.title(filename)

root = Tk()
button = Button(text='Change', font=('Verdana', 18), command=change)
canvas = Canvas()
canvas.grid(row=0)
button.grid(row=1)
load_file('pic.jpg')
mainloop()
Let’s first look at the load_file function. Many of the image utilities are in the Image module. We
give a name, image, to the object created by the Image.open statement. We also use the convert
method to convert the image into RGB (Red-Green-Blue) format. We will see why in a minute. The
next line creates an ImageTk object called photo that gets drawn to the Tkinter canvas. The photo
object has methods that allow us to get its width and height so we can size the canvas appropriately.
Now look at the change function. The image object has a method called load that gives access to
the individual pixels that make up the image. This returns a two-dimensional array of RGB values.
For instance, if the pixel in the upper left corner of the image is pure white, then pix[0,0] will be
(255,255,255). If the next pixel to the right is pure black, pix[1,0] will be (0,0,0). To convert
the image to grayscale, for each pixel we take the average of its red, green, and blue components,
and reset the red, green, and blue components to all equal that average. Remember that if the red,
green, and blue are all the same, then the color is a shade of gray. After modifying all the pixels, we
create a new ImageTk object from the modified pixel data and display it on the canvas.
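The averaging step can be tried without PIL or a window by working on a plain list of RGB tuples. The helper to_gray below is a hypothetical stand-in for the loop in change, not something PIL provides.

```python
def to_gray(pixels):
    """Return a new list with each (red, green, blue) tuple replaced by its gray average."""
    result = []
    for red, green, blue in pixels:
        avg = (red + green + blue) // 3
        result.append((avg, avg, avg))
    return result

# pure white, pure black, and an orange-ish pixel
sample = [(255, 255, 255), (0, 0, 0), (200, 100, 0)]
print(to_gray(sample))
```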
You can have a lot of fun with this. Try modifying the change function. For instance, if we use the
following line in the change function, we get an effect that looks like a photo negative:
pix[i,j] = (255-red, 255-green, 255-blue)
Note, though, that this way of manipulating images is the slow, manual way. PIL has a number of
much faster functions for modifying images. You can very easily change the brightness, hue, and
contrast of images, resize them, rotate them, and much more. See the PIL reference materials for
more on this.
putdata If you are interested in drawing mathematical objects like fractals, plotting points
pixel-by-pixel can be very slow in Python. One way to speed things up is to use the putdata method.
The way it works is you supply it with a list of RGB pixel values, and it will copy it into your image.
Here is a program that plots a 300 × 300 grid of random colors.
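from random import randint
from Tkinter import *              # tkinter in Python 3
from PIL import Image, ImageTk
root = Tk()
canvas = Canvas(width=300, height=300)
canvas.grid()
image=Image.new(mode='RGB',size=(300,300))
# one random (red, green, blue) tuple for each of the 300*300 pixels
L = [(randint(0,255), randint(0,255), randint(0,255))
     for i in range(300*300)]
image.putdata(L)
photo=ImageTk.PhotoImage(image)
canvas.create_image(0,0,image=photo,anchor=NW)
mainloop()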
ImageDraw The ImageDraw module gives another way to draw onto images. It can be used
to draw rectangles, circles, points, and more, just like Tkinter canvases, but it is faster. Here is a
short example that fills the image with a dark blue color and then 100 randomly distributed yellow
points.
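from random import randint
from Tkinter import *              # tkinter in Python 3
from PIL import Image, ImageTk, ImageDraw
root = Tk()
canvas = Canvas(width=300, height=300)
canvas.grid()
image=Image.new(mode='RGB',size=(300,300))
draw = ImageDraw.Draw(image)
draw.rectangle([(0,0),(300,300)], fill='#000030')   # dark blue background
L = [(randint(0,299), randint(0,299)) for i in range(100)]
draw.point(L, fill='yellow')                        # 100 random yellow points
photo=ImageTk.PhotoImage(image)
canvas.create_image(0,0,image=photo,anchor=NW)
mainloop()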
To use ImageDraw, we have to first create an ImageDraw object and tie it to the Image object. The
draw.rectangle method works similarly to the create_rectangle method of canvases, except
for a few differences with parentheses. The draw.point method is used to plot individual pixels.
A nice feature of it is we can pass a list of points instead of having to plot each thing in the list
separately. Passing a list is also much faster.
18.3 Pygame
Pygame is a library for creating two-dimensional games in Python. It can be used to make
games at the level of old arcade or Nintendo games. It can be downloaded and easily installed
from www.pygame.org. There are a number of tutorials there to help you get started. I don’t know
a whole lot about Pygame, so I won’t cover it here, though perhaps in a later edition I will.
Part III
Intermediate Topics
Chapter 19
If L is a list and s is a string, then L[0] gives the first element of the list and s[0] the first element
of the string. If we want to change the first element of the list to 3, L[0]=3 will do it. But we cannot
change a string this way. The reason has to do with how Python treats lists and strings. Lists
(and dictionaries) are said to be mutable, which means their contents can change. Strings, on the
other hand, are immutable, which means they cannot be changed. The reason strings are immutable
is partly for performance (immutable objects are faster) and partly because strings are thought of
as fundamental in the same way that numbers are. It makes some other aspects of the language
easier as well.
Making copies Another place that lists and strings differ is when we try to make copies. Consider
the following code:
s = 'Hello'
copy = s
s = s + '!!!'
print('s is now:', s, 'Copy:', copy)
In the code above we make a copy of s and then change s. Everything works as we would
intuitively expect. Now look at similar code with lists:
L = [1,2,3]
copy = L
L[0]=9
print('L is now:', L, 'Copy:', copy)
We can see that the list code did not work as we might have expected. When we changed L, the
copy got changed as well. As mentioned in Chapter 7, the proper way to make a copy of L is
copy=L[:]. The key to understanding these examples is references.
References Everything in Python is an object. This includes numbers, strings, and lists. When we
do a simple variable assignment, like x=487, what actually happens is Python creates an integer
object with the value 487, and the variable x acts as a reference to that object. It’s not that the value
487 is stored in a memory location named x, but rather that 487 is stored somewhere in memory,
and x points to that location. If we come along and declare y=487, then y can point to that same
memory location as well.
On the other hand, if we then come along and say x=721, what happens is we create a new integer
object with value 721 somewhere in memory and x now points at that. The 487 still exists in
memory where it was and it will stay there at least until there is nothing left pointing at it, at which
point its memory location will be free to use for something else.
All objects are treated the same way. When we set s='Hello', the string object 'Hello' is
somewhere in memory and s is a reference to it. When we then say copy=s, we are actually saying
that copy is another reference to 'Hello'. If we then do s=s+'!!!', what happens is a new
object 'Hello!!!' is created and because we assigned s to it, s is now a reference to that new
object, 'Hello!!!'. Remember that strings are immutable, so there is no changing 'Hello' into
something else. Rather, Python creates a new object and points the variable s to it.
When we set L=[1,2,3], we create a list object [1,2,3] and a reference, L, to it. When we say
copy=L, we are making another reference to the object [1,2,3]. When we do L[0]=9, because
lists are mutable, the list [1,2,3] is changed, in place, to [9,2,3]. No new object is created.
The list [1,2,3] is now gone, and since copy is still pointing to the same location, its value is
[9,2,3].
On the other hand, if we instead use copy=L[:], we are actually creating a new list object
somewhere else in memory so that there are two copies of [1,2,3] in memory. Then when we do
L[0]=9, we are only changing the thing that L points to, and copy still points to [1,2,3].
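The difference between a second reference and a real copy can be checked with the is operator, which tests whether two names refer to the same object. The names alias and clone below are chosen just for this sketch.

```python
L = [1, 2, 3]
alias = L          # a second reference to the same list object
clone = L[:]       # a new list object with the same contents

L[0] = 9
print(alias is L, alias)   # same object, so the change shows up here
print(clone is L, clone)   # different object, so it is unchanged

s = 'Hello'
t = s
s = s + '!!!'      # builds a new string; t still refers to the old one
print(s, t)
```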
Just one further note to drive the point home. If we set x=487 and then set x=721, we are first
creating an integer object 487 and pointing x to it. When we then set x=721, we are creating a
new integer object 721 and pointing x to it. The net effect is that it seems like the “value” of x is
changing, but what is in fact changing is what x is pointing to.
Garbage collection Internally Python maintains a count of how many references there are to each
object. When the reference count of an object drops to 0, then the object is no longer needed, and
the memory it had been using becomes available again.
19.2 Tuples
A tuple is essentially an immutable list. Below is a list with three elements and a tuple with three
elements:
L = [1,2,3]
t = (1,2,3)
Tuples are enclosed in parentheses, though the parentheses are actually optional. Indexing and
slicing work the same as with lists. As with lists, you can get the length of the tuple by using the
len function, and, like lists, tuples have count and index methods. However, since a tuple is
immutable, it does not have any of the other methods that lists have, like sort or reverse, as
those change the list.
We have seen tuples in a few places already. For instance, fonts in Tkinter are specified as pairs, like
('Verdana',14), and sometimes as triples. The dictionary method items returns a list of tuples.
Also, when we use the following shortcut for exchanging the value of two or more variables, we
are actually using tuples:
a,b = b,a
One reason why there are both lists and tuples is that in some situations, you might want an
immutable type of list. For instance, lists cannot serve as keys in dictionaries because the values of
lists can change and it would be a nightmare for Python dictionaries to have to keep track of.
Tuples, however, can serve as keys in dictionaries. Here is an example assigning grades to teams of
students:
grades = {('John', 'Ann'): 95, ('Mike', 'Tazz'): 87}
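As a small sketch of the point about keys, the lines below look up a grade by its tuple key and then show that a list used as a key is rejected with a TypeError. The variable bad is only there to record the outcome.

```python
grades = {('John', 'Ann'): 95, ('Mike', 'Tazz'): 87}
print(grades[('John', 'Ann')])

try:
    bad = {['John', 'Ann']: 95}   # a list is mutable, so it cannot be a key
except TypeError as error:
    bad = None
    print('TypeError:', error)
```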
Also, in situations where speed really matters, tuples are generally faster than lists. The flexibility
of lists comes with a corresponding cost in speed.
tuple To convert an object into a tuple, use tuple. The following example converts a list and
a string into tuples:
t1 = tuple([1,2,3])
t2 = tuple('abcde')
Note The empty tuple is (). The way to get a tuple with one element is like this: (1,). Something
like (1) will not work because that just evaluates to 1 as in an ordinary calculation. For instance,
in the expression 2+(3*4), we don’t want the (3*4) to be a tuple, we want it to evaluate to a
number.
19.3 Sets
Python has a data type called a set. Sets work like mathematical sets. They are a lot like lists with
no repeats. Sets are denoted by curly braces, like below:
S = {1,2,3,4,5}
Recall that curly braces are also used to denote dictionaries, and {} is the empty dictionary. To get
the empty set, use the set function with no arguments, like this:
S = set()
This set function can also be used to convert things to sets. Here are two examples:
set([1,4,4,4,5,1,2,1,3])
set('this is a test')
{1, 2, 3, 4, 5}
{'a', ' ', 'e', 'i', 'h', 's', 't'}
Notice that Python will store the data in a set in whatever order it wants to, not necessarily the
order you specify. It’s the data in the set that matters, not the order of the data. This means that
indexing has no meaning for sets. You can’t do s[0], for instance.
Working with sets There are a few operators that work with sets.
Method            Description
S.add(x)          Add x to the set
S.remove(x)       Remove x from the set
S.issubset(A)     Returns True if S ⊂ A and False otherwise.
S.issuperset(A)   Returns True if A ⊂ S and False otherwise.
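Here is a brief sketch that exercises the methods in the table together with Python's standard set operators. The table of operators is not reproduced above, so the particular operators shown (|, &, -, ^, and in) are a selection made for this example.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

print(A | B)    # union
print(A & B)    # intersection
print(A - B)    # difference: in A but not in B
print(A ^ B)    # symmetric difference: in exactly one of A and B
print(3 in A)   # membership test

S = {1, 2}
S.add(3)
S.remove(1)
print(S, S.issubset(A), A.issuperset(S))
```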
Finally, we can do set comprehensions just like list comprehensions:
s = {i**2 for i in range(12)}
Example: removing repeated elements from lists We can use the fact that a set can have no
repeats to remove all repeats from a list. Here is an example:
L = [1,4,4,4,5,1,2,1,3]
L = list(set(L))
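One thing to keep in mind is that a set does not remember the order of the original list. If the order matters, one common alternative (an addition here, not from the text above) is dict.fromkeys, which relies on dictionaries keeping insertion order in Python 3.7 and later.

```python
L = [1, 4, 4, 4, 5, 1, 2, 1, 3]

unordered = list(set(L))           # repeats removed, order not guaranteed
ordered = list(dict.fromkeys(L))   # repeats removed, first occurrences kept in order

print(sorted(unordered))
print(ordered)
```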
Example: wordplay Here is an example of an if statement that uses a set to see if every letter in a
word is either an a, b, c, d, or e:
if set(word).issubset('abcde'):
19.4 Unicode
It used to be that computers could only display 256 different characters, loosely called ASCII
characters (strictly speaking, ASCII defines the first 128 and various extensions supply the rest).
In this system, characters are allotted one byte of memory each, which gives 256 possible
characters, each with a corresponding numerical value. Characters 0 through 31 include various
control characters, including '\n' and '\t'. After that come some special symbols, then numbers,
capital letters, lowercase letters, and a few more symbols. Beyond that are a variety of other
symbols, including some international characters.
However, 256 characters is not nearly enough to represent all of the symbols used throughout the
alphabets of the world. The new standard is Unicode, which uses more than one byte to store
character data. Unicode currently supports well over 65,000 characters. “Standard” isn’t quite the
right word here, as there are actually several standards in use, and this can cause some trouble. If
you need to work with Unicode data, do some research into it first to learn about all the craziness.
In Unicode, characters still have numerical equivalents. If you would like to go back and forth
between a character and its numerical equivalent, use the chr and ord built-in functions. For
example, use ord('A') to get the numerical value of 'A', and use chr(65) to get the character
with numerical value 65. Here is a short example that prints out the first 1000 Unicode characters.
print(''.join([chr(i) for i in range(1000)]))
Python supports Unicode, both in strings and in the names of variables, functions, etc. There are
some differences between Python 2 and Python 3 in support for Unicode.
19.5 sorted
First a definition: an iterable is an object that allows you to loop through its contents. Iterables
include lists, tuples, strings, and dictionaries. There are a number of Python methods that work on
any iterable.
The sorted function is a built-in function that can be used to sort an iterable. As a first example,
we can use it to sort a list. Say L=[3,1,2]. The following sets M to [1,2,3].
M = sorted(L)
The difference between sorted and L.sort is that L.sort() changes the original list L, but
sorted(L) does not.
The sorted function can be used on other iterables. The result is a sorted list. For instance,
sorted('xyzab') returns the list ['a','b','x','y','z']. If we really want the result to be
a string, we can use the join method:
s = ''.join(sorted('xyzab'))
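The sketch below applies sorted to a tuple and a dictionary. It also uses the key and reverse arguments, which are standard arguments of sorted but are not covered in the text above; the sample data is made up.

```python
words = ('pear', 'fig', 'banana')

print(sorted(words))                 # alphabetical; the result is a list
print(sorted(words, key=len))        # shortest first
print(sorted(words, reverse=True))   # reverse alphabetical

d = {'b': 2, 'a': 1}
print(sorted(d))                     # looping over a dictionary gives its keys
print(words)                         # the original tuple is unchanged
```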
This is a convenient operator that can be used to combine an if/else statement into a single line.
Here is an example:
x = 'a' if y==4 else 'b'
This is equivalent to
if y==4:
    x='a'
else:
    x='b'
He scored 5 points.
He scored 1 point.
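The code that produced the two lines above is not shown, so the following is only a guess at how they might be generated. The function score_message is a hypothetical name; the point is that the conditional expression chooses between a plural and a singular ending inside a single statement.

```python
def score_message(score):
    # the conditional expression picks the ending of the sentence
    return 'He scored {} point{}'.format(score, 's.' if score != 1 else '.')

print(score_message(5))
print(score_message(1))
```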
19.7 continue
The continue statement is a cousin of the break statement for loops. When a continue statement
is encountered in a for loop, the program ignores all the code in the loop beyond the continue
statement and jumps back up to the start of the loop, advancing the loop counter as necessary.
Here is an example. The code on the right accomplishes the same thing as the code on the left.
for s in L:                      for s in L:
    if s not in found:               if s in found: continue
        count+=1                     count+=1
        if s[0]=='a':                if s[0]=='a':
            count2+=1                    count2+=1
The continue statement is something you can certainly do without, but you may see it from time
to time and it occasionally can make for simpler code.
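To see that the two versions really agree, here is a runnable sketch with made-up data. The second loop stores its results in count_c and count2_c (names invented for this example) so that the two sets of counters can be compared.

```python
L = ['apple', 'bat', 'avocado', 'cat', 'ant']
found = ['bat', 'ant']

# version with nested if statements
count = count2 = 0
for s in L:
    if s not in found:
        count += 1
        if s[0] == 'a':
            count2 += 1

# version with continue: skip the rest of the loop body for found items
count_c = count2_c = 0
for s in L:
    if s in found: continue
    count_c += 1
    if s[0] == 'a':
        count2_c += 1

print(count, count2, count_c, count2_c)
```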
The eval and exec functions allow a program to execute Python code while the program is run-
ning. The eval function is used for simple expressions, while exec can execute arbitrarily long
blocks of code.
eval We have seen eval many times before with input statements. One nice thing about using
eval with an input statement is that the user need not just enter a number. They can enter an
expression and Python will compute it. For instance, say we have the following:
num = eval(input('Enter a number: '))
The user can enter 3*(4+5) and Python will compute that expression. You can even use variables
in the expression.
A countif function built on eval behaves like a spreadsheet COUNTIF function. It counts how many
items in a list satisfy a certain condition. What eval does for us here is allow the condition to be
specified by the user as a string. For instance, countif(L,'i>5') will return how many items in L
are greater than 5.
Here is another common spreadsheet function:
def sumif(L, condition):
    return sum([i for i in L if eval(condition)])
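The listing for countif is not shown above, so the version below is reconstructed by analogy with sumif and should be read as an assumption about how it was written. Both functions expect the condition to be a string that refers to the loop variable i.

```python
def countif(L, condition):
    # condition is a string such as 'i>5'; eval sees i from the comprehension
    return len([i for i in L if eval(condition)])

def sumif(L, condition):
    return sum([i for i in L if eval(condition)])

data = [3, 8, 1, 9, 6]
print(countif(data, 'i>5'))
print(sumif(data, 'i>5'))
print(countif(data, 'i%2==1'))
```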
exec The exec function takes a string consisting of Python code and executes it. Here is an exam-
ple:
s = """x=3
for i in range(4):
    print(i*x)"""
exec(s)
One nice use of the exec function is to let a program’s user define math functions to use while the
program is running. Here is the code to do that:
s = input('Enter function: ')
exec('def f(x): return ' + s)
I have used this code in a graph plotting program that allows users to enter equations to be graphed,
and I have used it in a program where the user can enter a function and the program will
numerically approximate its roots.
You can use exec to have your program generate all sorts of Python code while it is running. This
allows your program to essentially modify itself while it is running.
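Here is a sketch of the same idea that runs unattended: a fixed string stands in for the user's input, and the generated function is collected in a dictionary passed to exec. The helper make_function and the use of a separate namespace dictionary are choices made for this example, not something the code above requires.

```python
def make_function(expression):
    """Build f(x) from a string such as 'x**2 + 1' (hypothetical helper)."""
    namespace = {}
    # exec stores the names it defines in the dictionary it is given
    exec('def f(x): return ' + expression, namespace)
    return namespace['f']

f = make_function('x**2 + 1')
print(f(3))
print([f(x) for x in range(4)])
```

Keeping the generated function in its own dictionary avoids overwriting an existing f elsewhere in the program.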
Note In Python 2 exec is a statement, not a function, so you may see it used without parentheses
in older code.
Security issue The eval and exec functions can be dangerous. There is always the chance that
your users might input some code that could do something dangerous to the machine. They could
also use it to inspect the values of your variables (which could be bad if, for some reason, you were
storing passwords in a variable). So, you will want to be careful about using these functions in
code where security is important. One option for getting input without eval is to do something
like this:
num = int(input('Enter a number: '))
This assumes num is an integer. Use float or list or whatever is appropriate to the data you are
expecting.
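For input that is a list or another literal value, the standard library's ast.literal_eval is one further option, though it is not mentioned in the text above. It accepts only Python literals (numbers, strings, lists, tuples, dictionaries, and so on) and refuses anything that would run code. The strings below are fixed samples standing in for user input.

```python
from ast import literal_eval

print(literal_eval('[1, 2, 3]'))   # a list literal is accepted
print(literal_eval('3.5'))         # so is a number

try:
    literal_eval('__import__("os").getcwd()')   # code, not a literal
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
print(rejected)
```

The trade-off is that expressions such as 3*(4+5), which eval would compute, are rejected as well.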
The built-in enumerate function takes an iterable and returns a new iterable consisting of pairs
(i,x) where i is an index and x is the corresponding element from the iterable. For example:
s = 'abcde'
for (i,x) in enumerate(s):
    print(i+1, x)
1 a
2 b
3 c
4 d
5 e
The object returned is something that is like a list of pairs, but not exactly. The following will give
a list of pairs:
list(enumerate(s))
The enumerate code can be shorter or clearer in some situations. Here is an example that returns
a list of the indices of all the ones in a string:
[j for (j,c) in enumerate(s) if c=='1']
zip The zip function takes two iterables and “zips” them up into a single iterable that contains
pairs (x,y), where x is from the first iterable, and y is from the second. Here is an example:
s = 'abc'
L = [10, 20, 30]
z = zip(s,L)
print(list(z))
Just like with enumerate, the result of zip is not quite a list, but if we do list(zip(s,L)), we
can get a list from it.
Here is an example that uses zip to create a dictionary from two lists.
L = ['one', 'two', 'three']
M = [4, 9, 15]
d = dict(zip(L,M))
This technique can be used to create a dictionary while your program is running.
19.10 copy
The copy module has a couple of useful functions, copy and deepcopy. The copy function can be
used, for instance, to make a copy of an object from a user-defined class. As an example, suppose
we have a class called Users and we want to make a copy of a specific user u. We could do the
following:
from copy import copy
u_copy = copy(u)
But the copy function has certain limitations, as do other copying techniques, like M=L[:] for lists.
For example, suppose L = [[1,2,3],[4,5,6]]. If we make a copy of this by doing M=L[:], and
then set L[0][0]=100, this will affect M[0][0] as well. This is because the copy is only a shallow
copy: the references to the sublists that make up L were copied, instead of copies of those sublists.
This sort of thing can be a problem anytime we are copying an object that itself consists of other
objects.
The deepcopy function is used in this type of situation to copy the values themselves rather than
the references. Here is how it would work:
from copy import deepcopy
M = deepcopy(L)
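To see the difference between the two kinds of copy, here is a small sketch using the nested list from above. The names shallow and deep are just illustrative.

```python
from copy import copy, deepcopy

L = [[1, 2, 3], [4, 5, 6]]
shallow = copy(L)      # new outer list, but it shares the two sublists with L
deep = deepcopy(L)     # new outer list and new copies of the sublists

L[0][0] = 100
print(shallow[0][0])   # the shallow copy sees the change
print(deep[0][0])      # the deep copy still has the old value
```

The shallow copy prints 100, while the deep copy prints 1.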
There are a few more facts about strings that we haven’t yet talked about.
Here is an example where we use translate to implement a simple substitution cipher. A sub-
stitution cipher is a simple way to encrypt a message, where each letter is replaced by a different
letter. For instance, maybe every a is replaced by a g, and every b by an x, etc. Here is the code:
abcdefghijklmnopqrstuvwxyz
qjdpaztxghuflicornkesyvmwb
exgk gk q kadnae
this is a secret
The way it works is we first create the encryption key, which says which letter a gets replaced
with, b gets replaced with, etc. This is done by shuffling the alphabet. We then create a translation
table for both encoding and decoding, using the zip trick of Section 19.9 for creating dictionaries.
Finally, we use the translate method to do the actual substituting.
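The steps just described can be sketched roughly as follows. This is only one way to write it, not necessarily the original listing, and the names encode_table and decode_table are my own. Since the key is shuffled randomly, the output will differ from run to run.

```python
from random import shuffle

alphabet = 'abcdefghijklmnopqrstuvwxyz'
letters = list(alphabet)
shuffle(letters)                 # shuffle the alphabet to get the key
key = ''.join(letters)

# translate expects a dictionary keyed by character codes, hence ord
encode_table = dict(zip((ord(c) for c in alphabet), key))
decode_table = dict(zip((ord(c) for c in key), alphabet))

message = 'this is a secret'
secret = message.translate(encode_table)
print(alphabet)
print(key)
print(secret)
print(secret.translate(decode_table))
```

Characters that are not in the table, like spaces, are left alone by translate.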
partition The partition method is similar to the string split method. The difference is
illustrated below:
'3.14159'.partition('.')
'3.14159'.split('.')
('3', '.', '14159')
['3', '14159']
The difference is that the argument to the method is returned as part of the output. The partition
method also returns a tuple instead of a list. Here is an example that calculates the derivative
of a simple monomial entered as a string. The rule for derivatives is that the derivative of $ax^n$ is
$nax^{n-1}$.
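A minimal sketch of the derivative idea is shown here. It assumes the monomial is typed in the form 3x^4 with whole-number coefficient and power, and the function name derivative is my own choice.

```python
def derivative(s):
    # Split 'ax^n' into the part before 'x^', the separator, and the part after.
    coeff, sep, power = s.partition('x^')
    a, n = int(coeff), int(power)
    return '{}x^{}'.format(a*n, n-1)

print(derivative('3x^4'))
```

For the input 3x^4 this prints 12x^3.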
Note These methods, and many others, could be done directly just using the basic tools of the
language like for loops, if statements, etc. The idea, though, is that those things that are commonly
done are made into methods or classes that are part of the standard Python distribution. This can
save you from having to reinvent the wheel, and it can also make your programs more reliable
and easier to read.
Comparing strings Comparison of strings is done alphabetically. For example, the following will
print Yes.
if 'that' < 'this':
    print('Yes')
Beyond that, if the string contains characters other than letters, the comparison is based on the ord
value of the characters.
Statements on the same line You can write an if statement and the statement that goes with it on
the same line.
if x==3: print('Hello')
You can also combine several statements on a line if you separate them by semicolons. For exam-
ple:
a=3; b=4; c=5
Don’t overuse either of these, as they can make your code harder to read. Sometimes, though, they
can make it easier to read.
Calling multiple methods You can call several methods in a row, like below:
s = open('file.txt').read().upper()
This example reads the contents of a file, then converts everything to uppercase, and stores the
result in s. Again, be careful not to overdo it with too many methods in a row or your code may be
difficult to read.
None In addition to int, float, str, list, etc., Python has a data type called None. It basically
is the Python version of nothing. It indicates that there is nothing when you might have expected
there to be something, such as the return value of a function. You may see it show up here and
there.
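One common place where None shows up is as the return value of a function that has no return statement. Here is a small illustration, with a made-up function called greet.

```python
def greet(name):
    print('Hello,', name)   # prints something, but has no return statement

result = greet('world')
print(result)               # the function returned None
print(result is None)
```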
Documentation strings When defining a function, you can specify a string that contains infor-
mation about how the function works. Then anyone who uses the function can use Python’s help
function to get information about your function. Here is an example:
def square(x):
    """ Returns x squared. """
    return x**2
>>> help(square)
Help on function square in module __main__:
square(x)
    Returns x squared.
You can also use documentation strings right after a class statement to provide information about
your class.
19.13 Running your Python programs on other computers
Your Python programs can be run on other computers that have Python installed. Macs and Linux
machines usually have Python installed, though the version may not be up to date with the one
you are using, and those machines may not have additional libraries you are using.
An option on Windows is py2exe. This is a third-party module that converts Python programs to
executables. As of now, it is only available for Python 2. It can be a little tricky to use. Here is a
script that you can use once you have py2exe installed.
import os
program_name = raw_input('Enter name of program: ')
if program_name[-3:]!='.py':
    program_name+='.py'
If everything works, a window should pop up and you’ll see a bunch of stuff happening quickly.
The resulting executable file will show up in a new subdirectory of the directory your Python file
is in, called dist. There will be a few other files in that subdirectory that you will need to include
with your executable.
Chapter 20
Useful modules
Python comes with hundreds of modules that do all sorts of things. There are also many third-
party modules available for download from the internet. This chapter discusses a few modules
that I have found useful.
There are a couple of different ways to import modules. Here are several ways to import some
functions from the random module.
from random import randint, choice
from random import *
import random
1. The first way imports just two functions from the module.
2. The second way imports every function from the module. You should usually avoid do-
ing this, as the module may contain some names that will interfere with your own variable
names. For instance if your program uses a variable called total and you import a module
that contains a function called total, there can be problems. Some modules, however, like
tkinter, are fairly safe to import this way.
3. The third way imports an entire module in a way that will not interfere with your variable
names. To use a function from the module, preface it with random followed by a dot. For
instance: random.randint(1,10).
Changing module names The as keyword can be used to change the name that your program
uses to refer to a module or things from a module. Here is an example:
import numpy as np
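The same keyword works with from-style imports and with standard modules. The short names rnd and ln below are just illustrative choices.

```python
import random as rnd          # refer to the random module as rnd
from math import log as ln    # refer to math's log function as ln

print(rnd.randint(1, 10))
print(ln(1))
```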
Location Usually, import statements go at the beginning of the program, but there is no restric-
tion. They can go anywhere as long as they come before the code that uses the module.
Getting help To get help on a module (say the random module) at the Python shell, import it
using the third way above. Then dir(random) gives a list of the functions and variables in the
module, and help(random) will give you a rather long description of what everything does. To
get help on a specific function, like randint, type help(random.randint).
The time module has some useful functions for dealing with time.
sleep The sleep function pauses your program for a specified amount of time (in seconds).
For instance, to pause your program for 2 seconds or for 50 milliseconds, use the following:
from time import sleep
sleep(2)
sleep(.05)
Timing things The time function can be used to time things. Here is an example:
from time import time
start = time()
# do some stuff
print('It took', round(time()-start, 3), 'seconds.')
For another example, see Section 17.6, which shows how to put a countdown timer into a GUI.
The resolution of the time() function is milliseconds on Windows and microseconds on Linux.
The above example uses whole seconds. If you want millisecond resolution, use the following
print statement:
print('{:.3f} seconds'.format(time()-start))
You can use a little math on this to get minutes and hours. Here is an example:
t = time()-start
secs = t%60
mins = t//60%60
hours = t//3600
By the way, when you call time(), you get a rather strange value like 1306372108.045. It is the
number of seconds elapsed since January 1, 1970.
Dates The module datetime allows us to work with dates and times together. The following
line creates a datetime object that contains the current date and time:
from datetime import datetime
d = datetime.now()
The datetime object has attributes year, month, day, hour, minute, second, and microsecond.
Here is a short example:
d = datetime.now()
print('{}:{:02d} {}/{}/{}'.format(d.hour,d.minute,d.month,d.day,d.year))
7:33 2/1/2011
The hour is in 24-hour format. To get 12-hour format, you can do the following:
am_pm = 'am' if d.hour<12 else 'pm'
print('{}:{:02d}{}'.format(d.hour%12, d.minute, am_pm))
An alternative way to display the date and time is to use the strftime method. It uses a variety
of formatting codes that allow you to display the date and time, including information about the
day of the week, am/pm, etc.
Code Description
%c date and time formatted according to local conventions
%x, %X %x is the date, and %X is the time, both formatted as with %c
%d day of the month
%j day of the year
%a, %A weekday name (%a is the abbreviated weekday name)
%m month (01-12)
%b, %B month name (%b is the abbreviated month name)
%y, %Y year (%y is 2-digit, %Y is 4-digit)
%H, %I hour (%H is 24-hour, %I is 12-hour)
%p am or pm
%M minute
%S second
Here is an example:
print(d.strftime('%A %x'))
print(d.strftime('%c'))
print(d.strftime('%I%p on %B %d'))
Tuesday 02/01/11
02/01/11 07:33:14
07AM on February 01
The leading zeros are a little annoying. You could combine strftime with the first way we learned
to get nicer output:
print(d.strftime('{}%p on %B {}').format(d.hour%12, d.day))
7AM on February 1
You can also create a datetime object. When doing so, you must specify the year, month, and day.
The other attributes are optional. Here is an example:
d = datetime(2011, 2, 1, 7, 33)
e = datetime(2011, 2, 1)
You can compare datetime objects using the <, >, ==, and != operators. You can also do arithmetic
on datetime objects, though we won’t cover it here. In fact, there is a lot more you can do with
dates and times.
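As a quick illustration of comparing datetime objects, here is a sketch that reuses the two objects created above and adds a third, hypothetical one called f.

```python
from datetime import datetime

d = datetime(2011, 2, 1, 7, 33)
e = datetime(2011, 2, 1)          # time defaults to midnight
f = datetime(2010, 12, 25)

print(e < d)                      # midnight comes before 7:33
print(d == e)
earliest = min(d, e, f)           # comparisons also make min, max and sorting work
print(earliest)
```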
Another nice module is calendar which you can use to print out calendars and do more sophisti-
cated calculations with dates.
The os module and the submodule os.path contain functions for working with files and directo-
ries.
Changing the directory When your program opens a file, the file is assumed to be in the same
directory as your program itself. If not, you have to specify the directory, like below:
s = open('c:/users/heinold/desktop/file.txt').read()
If you have a lot of files that you need to read, all in the same directory, you can use os.chdir to
change the directory. Here is an example:
import os
os.chdir('c:/users/heinold/desktop/')
s = open('file.txt').read()
Getting the current directory The function getcwd returns the path of the current directory. It will
be the directory your program is in or the directory you changed it to with chdir.
Getting the files in a directory The function listdir returns a list of the entries in a directory,
including all files and subdirectories. If you just want the files and not the subdirectories or vice-
versa, the os.path module contains the functions isfile and isdir to tell if an entry is a file or a
directory. Here is an example that searches through all the files in a directory and prints the names
of those files that contain the word 'hello'.
import os
directory = 'c:/users/heinold/desktop/'
files = os.listdir(directory)
for f in files:
    if os.path.isfile(directory+f):
        s = open(directory+f).read()
        if 'hello' in s:
            print(f)
Changing and deleting files Here are a few useful functions. Just be careful here.
Function Description
mkdir create a directory
rmdir remove a directory
remove delete a file
rename rename a file
The first two functions take a directory path as their only argument. The remove function takes a
single file name. The first argument of rename is the old name and the second argument is the new
name.
Copying files There is no function in the os module to copy files. Instead, use the copy function
in the shutil module. Here is an example that takes all the files in a directory and makes a copy
of each, with each copied file’s name starting with Copy of :
import os
import shutil
directory = 'c:/users/heinold/desktop/'
files = os.listdir(directory)
for f in files:
    if os.path.isfile(directory+f):
        shutil.copy(directory+f, directory+'Copy of '+f)
More with os.path The os.path module contains several more functions that are helpful for
working with files and directories. Different operating systems have different conventions for how
they handle paths, and the functions in os.path allow your program to work with different op-
erating systems without having to worry about the specifics of each one. Here are some examples
(on my Windows system):
print(os.path.split('c:/users/heinold/desktop/file.txt'))
print(os.path.basename('c:/users/heinold/desktop/file.txt'))
print(os.path.dirname('c:/users/heinold/desktop/file.txt'))
print(os.path.join('directory', 'file.txt'))
('c:/users/heinold/desktop', 'file.txt')
file.txt
c:/users/heinold/desktop
directory\\file.txt
Note that the standard separator in Windows is the backslash. The forward slash also works.
Finally, two other functions you might find helpful are the exists function, which tests if a file
or directory exists, and getsize, which gets the size of a file. There are many other functions in
os.path. See the Python documentation [1] for more information.
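Here is a small sketch of exists and getsize. So that it does not depend on any particular path, it first creates a throwaway file in a temporary folder. The tempfile module and the file names are my own additions for illustration.

```python
import os
import tempfile

# Create a small temporary file to experiment with.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'file.txt')    # join picks the right separator
with open(path, 'w') as f:
    f.write('hello')

missing = os.path.join(folder, 'missing.txt')
print(os.path.exists(path))      # the file we just wrote is there
print(os.path.getsize(path))     # size in bytes
print(os.path.exists(missing))   # no such file
```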
os.walk The os.walk function allows you to scan through a directory and all of its subdirec-
tories. Here is a simple example that finds all the Python files on my desktop or in subdirectories
of my desktop:
for (path, dirs, files) in os.walk('c:/users/heinold/desktop/'):
    for filename in files:
        if filename[-3:]=='.py':
            print(filename)
Running programs There are a few different ways for your program to run another program.
One of them uses the system function in the os module. Here is an example:
import os
os.chdir('c:/users/heinold/desktop')
os.system('file.exe')
The system function can be used to run commands that you can run at a command prompt. An-
other way to run your programs is to use the execv function.
Quitting your program The sys module has a function called exit that can be used to quit your
program. Here is a simple example:
import sys
ans = input('Quit the program? ')
if ans.lower() == 'yes':
    sys.exit()
A zip file is a compressed file or directory of files. The following code extracts all the files from a
zip file, filename.zip, to my desktop:
import zipfile
z = zipfile.ZipFile('filename.zip')
z.extractall('c:/users/heinold/desktop/')
For getting files from the internet there is the urllib module. Here is a simple example:
The urlopen function returns an object that is a lot like a file object. In the example above, we use
the read() and decode() methods to read the entire contents of the page into a string s.
The string s in the example above is filled with the text of an HTML file, which is not pretty to read.
There are modules in Python for parsing HTML, but we will not cover them here. The code above
is useful for downloading ordinary text files of data from the internet.
For anything more sophisticated than this, consider using the third party requests library.
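The urlopen usage described above might look something like the following sketch. The function name fetch_text is my own, and the address in the comment is only a placeholder.

```python
from urllib.request import urlopen

def fetch_text(url):
    # urlopen returns a file-like object. read() gives bytes,
    # and decode() turns those bytes into a string.
    with urlopen(url) as page:
        return page.read().decode()

# Hypothetical usage, with a placeholder address:
# s = fetch_text('http://www.example.com/data.txt')
```

Wrapping the call in a function makes it easy to reuse for several addresses.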
20.7 Sound
An easy way to get some simple sounds in your program is to use the winsound module. It only
works with Windows, however. One function in winsound is Beep which can be used to play a
tone at a given frequency for a given amount of time. Here is an example that plays a sound of 500
Hz for 1 second.
The first argument to Beep is the frequency in Hertz and the second is the duration in milliseconds.
Another function in winsound is PlaySound, which can be used to play WAV files. Here is an
example:
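A sketch of both winsound calls is below. Since winsound exists only on Windows, the code checks the platform first. The file name soundfile.wav is a placeholder, so the code also checks that the file exists before trying to play it.

```python
import os
import sys

frequency = 500     # in Hertz
duration = 1000     # in milliseconds, so this is 1 second

if sys.platform == 'win32':          # winsound is only available on Windows
    import winsound
    winsound.Beep(frequency, duration)
    if os.path.exists('soundfile.wav'):
        winsound.PlaySound('soundfile.wav', winsound.SND_FILENAME)
```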
On the other hand, if you have Pygame installed, it is pretty easy to play any type of common
sound file. This is shown below, and it works on systems other than Windows:
import pygame
pygame.mixer.init(18000,-16,2,1024)
sound = pygame.mixer.Sound('soundfile.wav')
sound.play()
Creating your own modules is easy. Just write your Python code and save it in a file. You can then
import your module using the import statement.
Chapter 21
Regular expressions
The replace method of strings is used to replace all occurrences of one string with another, and
the index method is used to find the first occurrence of a substring in a string. But sometimes you
need to do a more sophisticated search or replace. For example, you may need to find all of the
occurrences of a string instead of just the first one. Or maybe you want to find all occurrences of
two letters followed by a number. Or perhaps you need to replace every 'qu' that is at the start
of a word with 'Qu'. This is what regular expressions are for. Utilities for working with regular
expressions are found in the re module.
There is some syntax to learn in order to understand regular expressions. Here is one example to
give you an idea of how they work:
import re
print(re.sub(r'([LRUD])(\d+)', '***', 'Locations L3 and D22 full.'))
This example replaces any occurrence of an L, R, U, or D followed by one or more digits with
'***'.
21.1 Introduction
We will mainly use the sub function, which is called as re.sub(pattern, replacement, string).
This searches through string for pattern and replaces anything matching that pattern with the
string replacement. All of the upcoming examples will be shown with sub, but there are other
things we can do with regular expressions besides substituting. We will get to those after discussing
the syntax of regular expressions.
Raw strings A lot of the patterns use backslashes. However, backslashes in strings are used for
escape characters, like the newline, \n. To get a backslash in a string, we need to do \\. This
can quickly clutter up a regular expression. To avoid this, our patterns will be raw strings, where
backslashes can appear as is and don’t do anything special. To mark a string as a raw string, preface
it with an r like below:
s = r'This is a raw string. Backslashes do not do anything special.'
21.2 Syntax
Basic example We start with a regular expression that mimics the replace method of strings.
Here is an example of using replace to replace all occurrences of abc with *:
'abcdef abcxyz'.replace('abc', '*')
*def *xyz
Here is the regular expression code that does the same thing:
re.sub(r'abc', '*', 'abcdef abcxyz')
Square brackets We can use square brackets to indicate that we only want to match certain letters.
Here is an example where we replace every a and d with asterisks:
re.sub(r'[ad]', '*', 'abcdef')
*bc*ef
Here is another example, where an asterisk replaces all occurrences of an a, b, or c that is followed
by a 1, 2, or 3:
re.sub(r'[abc][123]', '*', 'a1 + b2 + c5 + x2')
* + * + c5 + x2
We can give ranges of values—for example, [a-j] for the letters a through j. Here are some
further examples of ranges:
Range Description
[A-Z] any capital letter
[0-9] any digit
[A-Za-z0-9] any letter or digit
A slightly shorter way to match any digit is \d, instead of [0-9].
Matching any character Use a dot to match (almost) any character. Here is an example:
re.sub(r'A.B', '*', 'A2B AxB AxxB A$B')
* * AxxB *
Exception: The one character not matched by the dot is the newline character. If you need that to
be matched, too, put (?s) at the start of your pattern.
*C *C AC
We use the + character to indicate that we want to match one or more B’s here. There are similar
things we can use to specify different numbers of B’s here. For instance, using * in place of + will
match zero or more B’s. (This means that AC in the example above would be replaced by *C because
A counts as an A followed by zero B’s.) Here is a table of what you can do:
Code Description
+ match 1 or more occurrences
* match 0 or more occurrences
? match 0 or 1 occurrence
{m} match exactly m occurrences
{m,n} match between m and n occurrences, inclusive
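To compare a few of the codes in the table, here is a small sketch that applies them to the same string. The test string is my own choice.

```python
import re

s = 'ABC ABBBBBBC AC'
plus = re.sub(r'AB+', '*', s)       # A followed by one or more B's
star = re.sub(r'AB*', '*', s)       # A followed by zero or more B's
maybe = re.sub(r'AB?', '*', s)      # A followed by zero or one B
exact = re.sub(r'AB{2}', '*', s)    # A followed by exactly two B's
print(plus)
print(star)
print(maybe)
print(exact)
```

The four lines printed are *C *C AC, then *C *C *C, then *C *BBBBBC *C, and finally ABC *BBBBC AC.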
Here is an example that matches an A followed by three to six B’s:
re.sub(r'AB{3,6}', '*', 'ABB ABBB ABBBB ABBBBBBBBB')
'ABB * * *BBB'
Here, we do not match ABB because the A is only followed by two B’s. The next two pieces get
matched, as the A is followed by three B’s in the second term, and four B’s in the third. In the last
piece, there is an A followed by nine B’s. What gets matched is the A along with the first six B’s.
Note that the matching in the last piece above is greedy; that is, it takes as many B’s as it is allowed.
It is allowed to take between three and six B’s, and it takes all six. To get the opposite behavior, to
get it to take as few B’s as allowed, add a ?, like below:
re.sub(r'AB{3,6}?', '*', 'ABB ABBB ABBBB ABBBBBBBBB')
'ABB * *B *BBBBBB'
The ? can go after any of the numeric specifiers, like +?, *?, ??, etc.
'*def*123*'
In the above example, every time we encounter an abc or an xyz, we replace it with an asterisk.
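As a sketch of this kind of "or" matching, the | character separates the alternatives. The input string here is my own choice, picked so that the result is the output shown above.

```python
import re

result = re.sub(r'abc|xyz', '*', 'abcdefxyz123abc')
print(result)
```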
Matching only at the start or end Sometimes you don’t want to match every occurrence of
something, maybe just one at the very start or the very end of the string. To match only at the start
of the string, start the pattern off with the ^ character. To match only at the end of the string, end
the pattern with the $ character. Here are some examples:
re.sub('^abc', '*', 'abcdefgabc')
re.sub('abc$', '*', 'abcdefgabc')
*defgabc
abcdefg*
Escaping special characters We have seen that + and * have special meanings. What if we need
to match a plus sign? To do so, use the backslash to escape it, like \+. Here is an example:
re.sub(r'AB\+', '*', 'AB+C')
*C
Just a note again about raw strings: if we didn’t use them for the patterns, every backslash would
have to be doubled. For instance, r'AB\+' would have to be 'AB\\+'.
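Backslash sequences
• \d matches any digit, and \D matches any non-digit. Here is an example:
re.sub(r'\d', '*', '3 + 14 = 17')
re.sub(r'\D', '*', '3 + 14 = 17')
* + ** = **
3***14***17
• \w matches any letter or number (and the underscore), and \W matches anything else. Here is an example:
re.sub(r'\w', '*', 'This is a test.  Or is it?')
re.sub(r'\W', '*', 'This is a test.  Or is it?')
'**** ** * ****.  ** ** **?'
'This*is*a*test***Or*is*it*'
• \s matches whitespace, and \S matches non-whitespace. Here is an example:
re.sub(r'\s', '*', 'This is a test.  Or is it?')
re.sub(r'\S', '*', 'This is a test.  Or is it?')
'This*is*a*test.**Or*is*it?'
'**** ** * *****  ** ** ***'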
Preceding and following matches Sometimes you want to match things if they are preceded or
followed by something.
Code Description
(?=) matches only if followed by
(?!) matches only if not followed by
(?<=) matches only if preceded by
(?<!) matches only if not preceded by
Here is an example that matches the word the only if it is followed by cat:
re.sub(r'the(?= cat)', '*', 'the dog and the cat')
the dog and * cat
Here is an example that matches the word the only if it is preceded by a space:
re.sub(r'(?<= )the', '*', 'Athens is the capital.')
Athens is * capital.
The following example will match the word the only if it is neither preceded nor followed by
letters, so you can use it to replace occurrences of the word the, but not occurrences of the within
other words.
re.sub(r'(?<!\w)[Tt]he(?!\w)', '*', 'The cat is on the lathe there.')
* cat is on * lathe there.
Flags There are a few flags that you can use to affect the behavior of a regular expression. We
look at a few of them here.
* *
• (?s) — Recall the . character matches any character except a newline. This flag makes it
match newline characters, too.
• (?x) — Regular expressions can be long and complicated. This flag allows you to use a more
verbose, multi-line format, where whitespace is ignored. You can also put comments in. Here
is an example:
* and *
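Here is a sketch that tries out the three flags from the list above. The patterns and test strings are my own, chosen only for illustration.

```python
import re

# (?i) ignores case, so both ab and AB match
ignore_case = re.sub(r'(?i)ab', '*', 'ab AB')

# (?s) lets the dot match a newline as well
dot_all = re.sub(r'(?s)A.B', '*', 'A\nB')

# (?x) allows whitespace and comments inside the pattern
pattern = r"""(?x)
    [AB]\d    # an A or a B followed by a digit
    [CD]\d    # then a C or a D followed by a digit
"""
verbose = re.sub(pattern, '*', 'A3C9 and B1D7')

print(ignore_case)
print(dot_all)
print(verbose)
```

Note that each flag is placed at the very start of its pattern.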
21.3 Summary
Expression Description
[] any of the characters inside the brackets
. any character except the newline
+ 1 or more of the preceding
* 0 or more of the preceding
? 0 or 1 of the preceding
{m} exactly m of the preceding
{m,n} between m and n (inclusive) of the preceding
? following +, *, ?, {m}, and {m,n} — take as few as possible
\ escape special characters
| “or”
^ (at start of pattern) match only at the start of the string
$ (at end of pattern) match only at the end of the string
\d \D any digit (non-digit)
\w \W any letter or number (non-letter or -number)
\s \S any whitespace (non-whitespace)
(?=) only if followed by
(?!) only if not followed by
(?<=) only if preceded by
(?<!) only if not preceded by
(?i) flag to ignore case
(?s) flag to make the . match newlines, too
(?x) flag to enable verbose style
Expression Description
'abc' the exact string abc
'[ABC]' an A, B, or C
'[a-zA-Z][0-9]' match a letter followed by a digit
'a..' a followed by any two characters (except newlines)
'a+' one or more a’s
'a*' any number of a’s, even none
'a?' zero or one a
'a{2}' exactly two a’s
'a{2,4}' two, three, or four a’s
'a+?' one or more a’s taking as few as possible
'a\.' a followed by a period
'ab|zy' an ab or a zy
'^a' an a at the start of the string
'a$' an a at the end of the string
'\d' every digit
'\w' every letter or number
'\s' every whitespace
'\D' everything except digits
'\W' everything except letters and numbers
'\S' everything except whitespace
'a(?=b)' every a followed by a b
'a(?!b)' every a not followed by a b
'(?<=b)a' every a preceded by a b
'(?<!b)a' every a not preceded by a b
Note Note that in all of the examples in this chapter, we are dealing with non-overlapping patterns.
For instance, if we look for the pattern 'aba' in the string 'abababa', we see there are several
overlapping matches. All of our matching is done from the left and does not consider overlaps. For
instance, we have the following:
re.sub('aba', '*', 'abababa')
'*b*'
21.4 Groups
Using parentheses around part of an expression creates a group that contains the text that matches a
pattern. You can use this to do more sophisticated substitutions. Here is an example that converts
to lowercase every capital letter that is followed by a lowercase letter:
def modify(match):
    letters = match.group()
    return letters.lower()
re.sub(r'([A-Z])[a-z]', modify, 'PEACH Apple ApriCot')
'PEACH apple apricot'
The modify function ends up getting called three times, one for each time a match occurs. The
re.sub function automatically sends to the modify function a Match object, which we name
match. This object contains information about the matching text. The object’s group method
returns the matching text itself.
If instead of match.group, we use match.groups, then we can further break down the match
according to the groups defined by the parentheses. Here is an example that matches a capital letter
followed by a digit and converts each letter to lowercase and adds 10 to each number:
def modify(match):
    letter, number = match.groups()
    return letter.lower() + str(int(number)+10)
re.sub(r'([A-Z])(\d)', modify, 'A1 + B2 + C7')
'a11 + b12 + c17'
The groups method returns the matching text as tuples. For instance, in the above program the
tuples returned are shown below:
Note also that we can get at this information by passing arguments to match.group. For the first
match, match.group(1) is 'A' and match.group(2) is '1'.
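To see the tuples that groups returns, here is a variation of the program above that records them as it goes. The list named seen is my own addition.

```python
import re

seen = []    # collects the tuple returned by groups() on each call

def modify(match):
    seen.append(match.groups())
    letter, number = match.groups()
    return letter.lower() + str(int(number)+10)

result = re.sub(r'([A-Z])(\d)', modify, 'A1 + B2 + C7')
print(result)
for groups in seen:
    print(groups)
```

The three tuples printed are ('A', '1'), ('B', '2') and ('C', '7'). Note that the digits come back as strings.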
'*b*bababa'
21.5 Other functions
• findall — The findall function returns a list of all the matches found. Here is an exam-
ple:
re.findall(r'[AB]\d', 'A3 + B2 + A9')
['A3', 'B2', 'A9']
• split — The split function is analogous to the string method split. The regular expres-
sion version allows us to split on something more general than the string method does. Here
is an example that splits an algebraic expression at + or -.
re.split(r'\+|\-', '3x+4y-12x^2+7')
['3x', '4y', '12x^2', '7']
• match and search — These are useful if you just want to know if a match occurs. The
difference between these two functions is match only checks to see if the beginning of the
string matches the pattern, while search searches through the string until it finds a match.
Both return None if they fail to find a match and a Match object if they do find a match. Here
are examples:
if (re.match(r'ZZZ', 'abc ZZZ xyz')):
    print('Match found at beginning.')
else:
    print('No match at beginning.')
No match at beginning.
Match found in string.
The Match object returned by these functions has group information in it. Say we have the
following:
a=re.search(r'([ABC])(\d)', '= A3+B2+C8')
a.group()
a.group(1)
a.group(2)
'A3'
'A'
'3'
Remember that re.search will only report on the first match it finds in the string.
• finditer — This returns an iterator of Match objects that we can loop through, like below:
for s in re.finditer(r'([AB])(\d)', 'A3+B4'):
    print(s.group(1))
A
B
Note that this is a little more general than the findall function in that findall returns the
matching strings, whereas finditer returns something like a list of Match objects, which
give us access to group information.
• compile — If you are going to be reusing the same pattern, you can save a little time by first
compiling the pattern, as shown below:
pattern = re.compile(r'[AB]\d')
pattern.sub('*', 'A3 + B4')
pattern.sub('x', 'A8 + B9')
* + *
x + x
When you compile an expression, for many of the methods you can specify optional starting
and ending indices in the string. Here is an example:
pattern = re.compile(r'[AB]\d')
pattern.findall('A3+B4+C9+D8',2,6)
['B4']
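Returning to match and search from the list above, here is a short sketch that shows the difference side by side. The variable names are my own.

```python
import re

s = 'abc ZZZ xyz'
at_start = re.match(r'ZZZ', s)     # None, since s does not start with ZZZ
found = re.search(r'ZZZ', s)       # a Match object, since ZZZ appears later on

print(at_start)
print(found.group(), found.start())
```

The start method of the Match object tells where in the string the match begins, which is index 4 here.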
21.6 Examples
Roman Numerals Here we use regular expressions to convert Roman numerals into ordinary
numbers.
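import re
d = {'M':1000, 'CM':900, 'D':500, 'CD':400, 'C':100, 'XC':90,
     'L':50, 'XL':40, 'X':10, 'IX':9, 'V':5, 'IV':4, 'I':1}
pattern = re.compile(r"""(?x)
    (M{0,3})(CM)?
    (CD)?(D)?(C{0,3})
    (XC)?(XL)?(L)?(X{0,3})
    (IX)?(IV)?(V)?(I{0,3})""")
num = input('Enter Roman numeral: ')
m = pattern.match(num)
total = 0
for x in m.groups():
    if x:                        # skip groups that matched nothing
        if x in d:               # single symbols and pairs such as CM
            total += d[x]
        else:                    # runs of one symbol, such as MMM or XX
            total += d[x[0]]*len(x)
print(total)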
The regular expression itself is fairly straightforward. It looks for up to three M’s, followed by zero
or one CM’s, followed by zero or one CD’s, etc., and stores each of those in a group. The for loop
then reads through those groups and uses a dictionary to add the appropriate values to a running
sum.
Dates Here we use a regular expression to take a date in a verbose format, like February 6, 2011,
and convert it to an abbreviated format, mm/dd/yy (with no leading zeroes). Rather than depend
on the user to enter the date in exactly the right way, we can use a regular expression to allow for
variation and mistakes. For instance, this program will work whether the user spells out the whole
month name or abbreviates it (with or without a period). Capitalization does not matter, and it also
does not matter if they can even spell the month name correctly. They just have to get the first three
letters correct. It also does not matter how much space they use and whether or not they use a
comma after the day.
import re
d = {'jan':'1', 'feb':'2', 'mar':'3', 'apr':'4',
     'may':'5', 'jun':'6', 'jul':'7', 'aug':'8',
     'sep':'9', 'oct':'10', 'nov':'11', 'dec':'12'}
The first part of the regular expression, ([A-Za-z]+)\.? takes care of the month name. It matches
however many letters the user gives for the month name. The \.? matches either 0 or 1 periods
after the month name, so the user can either enter a period or not. The parentheses around the
letters save the result in group 1.
Next we find the day: \s*(\d{1,2}). The first part \s* matches zero or more whitespace char-
acters, so it doesn’t matter how much space the user puts between the month and day. The rest
matches one or two digits and saves it in group 2.
The rest of the expression, ,?\s*(\d{4}), finds the year. Then we use the results to create the
abbreviated date. The only tricky part here is the first part, which takes the month name, changes
it to all lowercase, and takes only the first three characters and uses those as keys to a dictionary.
This way, as long as the user correctly spells the first three letters of the month name, the program
will understand it.
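Putting the three pieces of the expression together, the whole conversion might look like the sketch below. The names MONTHS, DATE_PATTERN, and abbreviate_date are made up for this illustration, and stripping a leading zero from the day with int is an assumption about what "no leading zeroes" should mean.

```python
import re

MONTHS = {'jan': '1', 'feb': '2', 'mar': '3', 'apr': '4',
          'may': '5', 'jun': '6', 'jul': '7', 'aug': '8',
          'sep': '9', 'oct': '10', 'nov': '11', 'dec': '12'}

# month letters, optional period, day, optional comma, four-digit year
DATE_PATTERN = re.compile(r'([A-Za-z]+)\.?\s*(\d{1,2}),?\s*(\d{4})')

def abbreviate_date(text):
    m = DATE_PATTERN.search(text)
    if m is None:
        raise ValueError('unrecognized date: ' + repr(text))
    # only the first three letters of the month, in lowercase, are looked up
    month = MONTHS[m.group(1).lower()[:3]]
    day = str(int(m.group(2)))     # drops a leading zero, if any
    year = m.group(3)[-2:]         # last two digits of the year
    return month + '/' + day + '/' + year

print(abbreviate_date('February 6, 2011'))    # 2/6/11
print(abbreviate_date('feb. 06 2011'))        # 2/6/11
print(abbreviate_date('FEBUARY   6,2011'))    # 2/6/11
```

A month whose first three letters are not in the dictionary raises a KeyError here; a fuller program would want to catch that.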
Chapter 22
Math
This chapter is a collection of topics that are somewhat mathematical in nature, though many of
them are of general interest.
As mentioned in Section 3.5, the math module contains some common math functions. Here are
most of them:
Function              Description
sin, cos, tan         trig functions
asin, acos, atan      inverse trig functions
atan2(y,x)            gives arctan(y/x) with proper sign behavior
sinh, cosh, tanh      hyperbolic functions
asinh, acosh, atanh   inverse hyperbolic functions
log, log10            natural log, log base 10
log1p                 log(1+x), more accurate than log when x is near 0
exp                   exponential function e^x
degrees, radians      convert from radians to degrees or vice-versa
floor                 floor(x) is the greatest integer ≤ x
ceil                  ceil(x) is the least integer ≥ x
e, pi                 the constants e and π
factorial             factorial
modf                  returns a pair (fractional part, integer part)
gamma, erf            the Γ function and the error function
Note Note that the floor and int functions behave the same for positive numbers, but differ-
ently for negative numbers. For instance, floor(3.57) and int(3.57) both return the inte-
ger 3. However, floor(-3.57) returns -4, the greatest integer less than or equal to -3.57, while
int(-3.57) returns -3, which is obtained by throwing away the decimal part of the number.
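A quick check at the shell makes the difference concrete. This is only a small illustration; the loop and the variable name are not from the original text.

```python
from math import floor

for value in (3.57, -3.57):
    # floor rounds toward negative infinity, int rounds toward zero
    print(value, floor(value), int(value))
```

The two functions agree on 3.57 (both give 3) and disagree on -3.57 (floor gives -4, int gives -3).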
atan2 The atan2 function is useful for telling the angle between two points. Let x and y be
the distances between the points in the x and y directions. The tangent of the angle in the picture
below is given by y/x . Then arctan( y/x) gives the angle itself.
But if the angle were 90°, this would not work as x would be 0. We would also need special cases
to handle when x < 0 and when y < 0. The atan2 function handles all of this. Doing atan2(y,x)
will return arctan( y/x) with all of the cases handled properly.
The atan2 function returns the result in radians. If you would like degrees, do the following:
angle = math.degrees(math.atan2(y,x))
The resulting angle from atan2 is between −π and π (−180° and 180°). If you would like it between
0 and 360°, do the following:
angle = math.degrees(math.atan2(y,x))
angle = (angle+360) % 360
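Here is a small sketch that wraps these steps in a function. The name angle_between and the choice to pass the points as (x, y) tuples are assumptions made for the illustration.

```python
import math

def angle_between(p, q):
    """Angle in degrees, from 0 up to (but not including) 360,
    of the direction from point p to point q."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    angle = math.degrees(math.atan2(dy, dx))   # between -180 and 180
    return (angle + 360) % 360                 # shifted into 0 to 360

print(angle_between((0, 0), (0, 5)))     # straight up, where x is 0
print(angle_between((0, 0), (-1, -1)))   # down and to the left
```

The first call is the 90° case that plain arctan(y/x) cannot handle, and the second is a case where both differences are negative; the results should come out at about 90 and 225.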
100.1**10
1.0100451202102516e+20
The resulting value is displayed in scientific notation. It is 1.0100451202102516 × 10^20. The e+20
stands for ×10^20. Here is another example:
.15**10
5.7665039062499975e-09
In Section 3.1 we saw that some numbers, like .1, are not represented exactly on a computer. Math-
ematically, after the code below executes, x should be 1, but because of accumulated errors, it is
actually 0.9999999999999999.
x = 0
for i in range(10):
    x+=.1
This means that the following if statement will turn out False:
if x==1:
A more reliable way to compare floating point numbers x and y is to check to see if the difference
between the two numbers is sufficiently small, like below:
if abs(x-y)<10e-12:
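The sketch below puts the pieces together. It also uses math.isclose, a standard library function (available since Python 3.5) that performs this kind of tolerance check; the particular tolerance of 1e-12 is just an example.

```python
import math

x = 0
for i in range(10):
    x += .1

print(x == 1)                  # False, because of accumulated error
print(abs(x - 1) < 1e-12)      # True, the difference is tiny
print(math.isclose(x, 1))      # True, using a relative tolerance
```

How small "sufficiently small" should be depends on the sizes of the numbers involved, which is why isclose uses a relative tolerance by default.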
22.4 Fractions
There is a module called fractions for working with fractions. Here is a simple example of it in
action:
from fractions import Fraction
r = Fraction(3, 4)
s = Fraction(1, 4)
print(r+s)
1
You can do basic arithmetic with Fraction objects, as well as compare them, take their absolute
values, etc. Here are some further examples:
r = Fraction(3, 4)
s = Fraction(2, 8)
print(s)
print(abs(2*r-3))
if r>s:
    print('r is larger')
1/4
3/2
r is larger
Note that Fraction automatically converts things to lowest terms. The example below shows how
to get the numerator and denominator:
r = Fraction(3,4)
r.numerator
r.denominator
3
4
Converting to and from floats To convert a fraction to a floating point number, use float, like
below:
float(Fraction(1, 8))
0.125
On the other hand, say we want to convert 0.3 to a fraction. Unfortunately, we should not do
Fraction(.3) because, as mentioned, some numbers, including .3, are not represented exactly
on the computer. In fact, Fraction(.3) returns the following Fraction object:
Fraction(5404319552844595, 18014398509481984)
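One way around this is to give Fraction a string, which it reads as an exact decimal. The short comparison below is only an illustration; the variable names are made up.

```python
from fractions import Fraction

exact = Fraction('0.3')     # parsed from the text, so exactly 3/10
inexact = Fraction(0.3)     # built from the float, so slightly off

print(exact)                # 3/10
print(inexact == exact)     # False
print(float(exact) == 0.3)  # True: both round to the same float
```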
Limiting the denominator One useful method is limit_denominator. For a given Fraction
object, limit_denominator(x) finds the closest fraction to that value whose denominator does
not exceed x. Here are some examples:
Fraction('.333').limit_denominator(100)
Fraction('.333').limit_denominator(1000)
Fraction('3.14159').limit_denominator(1000)
Fraction(1, 3)
Fraction(333, 1000)
Fraction(355, 113)
The last example returns a pretty close fractional approximation to π. It is off by less than 0.0000003.
Greatest common divisor The math module contains a useful function called gcd that
returns the greatest common divisor of two numbers. (Older versions of Python also provided gcd
in the fractions module, but it was removed from there in Python 3.9.) Here is an example:
from math import gcd
print('The largest factor 35 and 21 have in common is', gcd(35, 21))
Python has a module called decimal for doing exact calculations with decimal numbers. As we’ve
noted a few times now, some numbers, such as .3, cannot be represented exactly as a float. Here is
Decimal('0.3')
The string here is important. If we leave it out, we get a decimal that corresponds with the inexact
floating point representation of .3:
Decimal(.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
Math You can use the usual math operators to work with Decimal objects. For example:
Decimal('.34') + Decimal('.17')
Decimal(1) / Decimal(17)
Decimal('0.51')
Decimal('0.05882352941176470588235294118')
The mathematical functions exp, ln, log10, and sqrt are methods of decimal objects. For in-
stance, the following gives the square root of 2:
Decimal(2).sqrt()
Decimal('1.414213562373095048801688724')
Decimal objects can also be used with the built in max, min, and sum functions, as well as converted
to floats with float and strings with str.
Precision By default Decimal objects have 28-digit precision. To change it to, say, five-digit
precision, use the getcontext function.
from decimal import getcontext
getcontext().prec = 5
Here is an example that prints out 100 digits of √2:
getcontext().prec = 100
Decimal(2).sqrt()
Decimal('1.414213562373095048801688724209698078569671875376948073176679737990732478462107038850387534327641573')
There is theoretically no limit to the precision you can use, but the higher the precision, the more
memory is required and the slower things will run. In general, even with small precisions, Decimal
objects are slower than floating point numbers. Ordinary floating point arithmetic is enough for
most problems, but it is nice to know that you can get higher precision if you ever need it.
There is a lot more to the decimal module. See the Python documentation [1].
Complex numbers have attributes real and imag, which hold the real and imaginary parts
of the number. The conjugate method returns the complex conjugate (the conjugate of a + bi is
a − bi).
The cmath module contains many of the same functions as the math module, except that they work
with complex arguments. The functions include regular, inverse, and hyperbolic trigonometric
functions, logarithms and the exponential function. It also contains two functions, polar and
rect, for converting between rectangular and polar coordinates:
cmath.polar(3j)
cmath.rect(3.0, 1.5707963267948966)
(3.0, 1.5707963267948966)
(1.8369701987210297e-16+3j)
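As a short illustration of these pieces working together, here is a sketch using the number 3 + 4i. The variable names are chosen only for this example.

```python
import cmath

z = 3 + 4j
print(z.real, z.imag)         # 3.0 4.0  (attributes, so no parentheses)
print(z.conjugate())          # (3-4j)

r, theta = cmath.polar(z)     # distance from the origin and angle in radians
print(r)                      # 5.0
print(cmath.rect(r, theta))   # back to rectangular form, approximately (3+4j)
```

Converting to polar form and back can introduce tiny rounding errors, just like the 1.8369701987210297e-16 that shows up in place of 0 above.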
Complex numbers are fascinating, though not all that useful in day-to-day life. One nice applica-
tion, however, is fractals. Here is a program that draws the famous Mandelbrot set. The program
requires the PIL and Python 2.6 or 2.7.
max_iter=75
xtrans=-.5
ytrans=0
xzoom=150
yzoom=-150
root = Tk()
canvas = Canvas(width=300, height=300)
canvas.grid()
image=Image.new(mode='RGB',size=(300,300))
draw = ImageDraw.Draw(image)
for x in range(300):
    c_x = (x-150)/float(xzoom)+xtrans
    for y in range(300):
        c = complex(c_x, (y-150)/float(yzoom)+ytrans)
        count=0
        z=0j
        while abs(z)<2 and count<max_iter:
            z = z*z+c
            count += 1
        draw.point((x,y),
                   fill=color_convert(count+25,count+25,count+25))
    canvas.delete(ALL)
    photo=ImageTk.PhotoImage(image)
    canvas.create_image(0,0,image=photo,anchor=NW)
    canvas.update()
mainloop()
The code here runs very slowly. There are ways to speed it up somewhat, but Python is unfortu-
nately slow for these kinds of things.
Sparse lists A 10,000,000×10,000,000 list of integers requires several hundred terabytes of storage,
far more than most hard drives can store. However, in many practical applications, most of the
entries of a list are 0. This allows us to save a lot of memory by just storing the nonzero values along
with their locations. We can do this using a dictionary whose keys are the locations of the nonzero
elements and whose values are the values stored in the array at those locations. For example,
suppose we have a two-dimensional list L whose entries are all zero except that L[10][12] is 47
and L[100][245] is 18. Here is the dictionary that we would use:
d = {(10,12): 47, (100,245): 18}
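To read an entry back out, we need locations that are not in the dictionary to come out as 0. One simple way to do that, sketched here with a hypothetical helper called get_entry, is the dictionary get method with a default value.

```python
d = {(10, 12): 47, (100, 245): 18}

def get_entry(sparse, i, j):
    # any location that was never stored is treated as zero
    return sparse.get((i, j), 0)

print(get_entry(d, 10, 12))    # 47
print(get_entry(d, 3, 4))      # 0
print(len(d))                  # 2, only the nonzero entries take up space
```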
The array module Python has a module called array that defines an array object that behaves
a lot like a list, except that its elements must all be the same type. The benefit of array over lists is
more efficient memory usage and faster performance. See the Python documentation [1] for more
about arrays.
The NumPy and SciPy libraries If you have any serious mathematical or scientific calculations
to do on arrays, you may want to consider the NumPy library. It is easy to download and install.
From the NumPy user’s guide:
There is also SciPy, which builds off of NumPy. This time from the SciPy user’s guide:
How Python generates random numbers The random number generator that Python uses is
called the Mersenne Twister. It is reliable and well-tested. It is a deterministic generator, meaning
that it uses a mathematical process to generate random numbers. The numbers are called pseudo-
random numbers because, coming from a mathematical process, they are not truly random, though,
for all intents and purposes, they appear random. Anyone who knows the process can recreate
the numbers. This is good for scientific and mathematical applications, where you want to be able
to recreate the same random numbers for testing purposes, but it is not suitable for cryptography,
where you need to generate truly random numbers.
Seeds For mathematical and scientific applications, you may want to reproduce the same random
numbers each time you run your program, in order to test your work with the same data. You can
do this by specifying the seed. Examples are shown below:
random.seed(1)
print("Seed 1:", [random.randint(1,10) for i in range(5)])
random.seed(2)
print("Seed 2:", [random.randint(1,10) for i in range(5)])
random.seed(1)
print("Seed 1:",[random.randint(1,10) for i in range(5)])
Seed 1: [2, 9, 8, 3, 5]
Seed 2: [10, 10, 1, 1, 9]
Seed 1: [2, 9, 8, 3, 5]
The seed can be any integer. If we just use random.seed(), then the seed will be more or less
randomly selected based off of the system clock.
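If several parts of a program each need their own reproducible stream of numbers, one option is to create separate generator objects with random.Random instead of reseeding the shared one. The helper name draw below is made up for this sketch.

```python
import random

def draw(seed, n=5):
    rng = random.Random(seed)     # an independent generator with its own seed
    return [rng.randint(1, 10) for _ in range(n)]

print(draw(1) == draw(1))   # True: same seed, same numbers
print(draw(1), draw(2))     # the two seeds give their own sequences
```

The actual numbers produced for a given seed can differ between Python versions, so it is safer to rely on repeatability within one version than on specific values.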
The random function Most of the functions in the random module are based off of the random
function, which uses the Mersenne Twister to generate random numbers between 0 and 1. Mathe-
matical transformations are then used on the result of random to get some of the more interesting
random number functions.
Other functions in the random module The random module contains functions that return ran-
dom numbers from various distributions, like the Gaussian or exponential distributions. For in-
stance, to generate a Gaussian (normal) random variable, use the gauss function. Here are some
examples:
random.gauss(64,3.5)
[round(random.gauss(64,3.5),1) for i in range(10)]
61.37965975173485
[58.4, 61.0, 67.0, 67.9, 63.2, 65.0, 64.5, 63.4, 65.5, 67.3]
The first argument of gauss is the mean and the second is the standard deviation. If you’re not fa-
miliar with normal random variables, they are the standard bell curve. Things like heights and SAT
scores and many other real-life things approximately fit this distribution. In the example above,
the random numbers generated are centered around 64. Numbers closer to 64 are more likely to be
generated than numbers farther away.
There are a bunch of other distributions that you can use. The most common is the uniform distri-
bution, in which all values in a range are equally likely. For instance:
random.uniform(3,8)
7.535110252245726
See the Python documentation [1] for information on the other distributions.
A more random randint function One way to generate cryptographically safe random num-
bers is to use some fairly random physical process. Examples include radioactive decay, atmo-
spheric phenomena, and the behavior of certain electric circuits. The os module has a function
urandom that generates random numbers from a physical process whose exact nature depends on
your system. The urandom function takes one argument telling it how many bytes of random data
to produce. Calling urandom(1) produces one byte of data, which we can translate to an integer
between 0 and 255. Calling urandom(2) produces two bytes of data, translating to integers be-
tween 0 and 65535. Here is a function that behaves like randint, but uses urandom to give us
nondeterministic random numbers:
from os import urandom
from math import log
def urandint(a,b):
    x = urandom(int(log(b-a+1)/log(256))+1)
    total = 0
    for (i,y) in enumerate(x):
        total += y*(256**i)
    return total%(b-a+1)+a
The way this works is we first have to determine how many bytes of random data to generate.
Since one byte gives 256 possible values and two bytes give 256^2 possible values, etc., we compute
the log base 256 of the size of the range b-a+1 to determine how many bytes to generate. We then
loop through the bytes generated and convert them to an integer. Finally, modding that integer by
b-a+1 reduces that integer to a number between 0 and b-a, and adding a to that produces an
integer in the desired range.
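The standard library also has ready-made tools for this job: random.SystemRandom, which offers the usual random functions on top of the operating system's random source, and the secrets module (Python 3.6 and later). The sketch below shows both; the helper name secure_randint is an assumption of this example.

```python
import random
import secrets

def secure_randint(a, b):
    # secrets.randbelow(n) returns an integer from 0 up to n-1
    return a + secrets.randbelow(b - a + 1)

sysrand = random.SystemRandom()   # same methods as random, different source

print(secure_randint(1, 6))
print(sysrand.randint(1, 6))
```

Both avoid the slight unevenness that taking a remainder can introduce when the number of possible byte values is not a multiple of the size of the range.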
Hexadecimal, octal, and binary Python has built-in functions hex, oct, and bin for converting
integers to hexadecimal, octal, and binary. The int function converts those bases to base 10. Here
are some examples:
hex(250)
oct(250)
bin(250)
int(0xfa)
'0xfa'
'0o372'
'0b11111010'
250
Hexadecimal values are prefaced with 0x, octal values are prefaced with 0o and binary values are
prefaced with 0b.
The int function The int function has an optional second argument that allows you to specify
the base you are converting from. Here are a few examples:
int('101101', 2) # convert from base 2
int('121212', 3) # convert from base 3
int('12A04', 11) # convert from base 11
int('12K04', 23) # convert from base 23
45
455
18517
314759
The pow function Python has a built-in function called pow, which raises numbers to powers. It
behaves like the ** operator, except that it takes an optional third argument that specifies a modu-
lus. Thus pow(x,y,n) returns (x**y)%n. The reason you might want to use this is that the pow
way is much quicker when very large numbers are involved, which happens a lot in cryptographic
applications.
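Here is a small check that the two approaches agree. The particular numbers are arbitrary and were chosen only to keep the slow version from taking too long.

```python
x, y, n = 7, 10**5, 1000003

fast = pow(x, y, n)      # reduces mod n as it goes, so numbers stay small
slow = (x**y) % n        # builds a number with tens of thousands of digits first

print(fast == slow)      # True
```

With exponents that are hundreds of digits long, as in cryptography, only the three-argument form is practical.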
I often use the Python shell as a calculator. This section contains a few tips for working at the shell.
Importing math functions One good way to start a session at the shell is to import some math
functions:
from math import *
Special variable There is a special variable _ which holds the value of the previous calculation.
Here is an example:
>>> 23**2
529
>>> _+1
530
Logarithms I use the natural logarithm a lot, and it is more natural for me to type ln instead of
log. If you want to do that, just do the following:
ln = log
Summing a series Here is a way to get an approximate sum of a series, in this case $\sum_{n=2}^{\infty} \frac{1}{n^2-1}$:
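A minimal sketch of such an approximation follows. It assumes the series is the sum of 1/(n²−1) starting at n = 2 (the n = 1 term would divide by zero) and simply cuts the sum off after a chosen number of terms; the function name partial_sum is made up for the example.

```python
def partial_sum(terms):
    # add up 1/(n^2 - 1) for n = 2, 3, ..., terms + 1
    return sum(1 / (n**2 - 1) for n in range(2, terms + 2))

print(partial_sum(10))
print(partial_sum(1000))
print(partial_sum(100000))
```

The printed values creep up toward 0.75, which is the exact sum of this series, since each term equals half of 1/(n−1) − 1/(n+1) and most of those pieces cancel.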
Another example: Say you need the sine of each of the angles 0, 15, 30, 45, 60, 75, and 90. Here is
a quick way to do that:
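One way to do it, sketched here as an assumption about what such a one-liner might look like, is a list comprehension that converts each angle to radians first. Rounding to four places is only to keep the output readable.

```python
from math import sin, radians

sines = [round(sin(radians(a)), 4) for a in range(0, 91, 15)]
print(sines)
```

The list starts with 0.0 for 0 degrees, has 0.5 for 30 degrees, and ends with 1.0 for 90 degrees.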
Third-party modules There are a number of third-party modules that you might find useful
when working in the Python shell. For instance, there are NumPy and SciPy, which we mentioned
in Section 22.7. There is also Matplotlib, a versatile library for plotting things, and there is SymPy,
which does symbolic computations.
Chapter 23
Working with functions
This chapter covers a number of topics to do with functions, including some topics in functional
programming.
Python functions are said to be first-class functions, which means they can be assigned to variables,
copied, used as arguments to other functions, etc., just like any other object.
f(3) = 9
g(3) = 9
25
125
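As a small illustration of what first-class means in practice, here is a sketch in which a function is given a second name and is also passed to another function. The names square, g, and apply_twice are made up for this example and are not from the original listing.

```python
def square(x):
    return x * x

g = square                     # assign the function object to another name
print(square(3), g(3))         # 9 9, both names call the same function

def apply_twice(func, x):
    # func is an ordinary argument that happens to be a function
    return func(func(x))

print(apply_twice(square, 3))  # 81, which is square(square(3))
```

Note that there are no parentheses after square in g = square. With parentheses we would be calling the function; without them we are referring to the function itself.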
Here is another example. Say you have a program with ten different functions and the program
has to decide at runtime which function to use. One solution is to use ten if statements. A shorter
solution is to use a list of functions. The example below assumes that we have already created
functions f1, f2, . . . , f10, that each take two arguments.
funcs = [f1, f2, f3, f4, f5, f6, f7, f8, f9, f10]
num = int(input('Enter a number: '))
funcs[num](3,5)
Functions as arguments to functions Say we have a list of 2-tuples. If we sort the list, the sorting
is done based off of the first entry as below:
L = [(5,4), (3,2), (1,7), (8,1)]
L.sort()
Suppose we want the sorting to be done based off the second entry. The sort method takes an
optional argument called key, which is a function that specifies how the sorting should be done.
Here is how to sort based off the second entry:
def comp(x):
    return x[1]
L = [(5,4), (3,2), (1,7), (8,1)]
L.sort(key=comp)
Here is another example, where we sort a list of strings by length, rather than alphabetically.
L = ['this', 'is', 'a', 'test', 'of', 'sorting']
L.sort(key=len)
One other place we have seen functions as arguments to other functions is the callback functions of
Tkinter buttons.
In one of the examples above, we passed a comparison function to the sort method. Here is the
code again:
def comp(x):
    return x[1]
L.sort(key=comp)
If we have a really short function that we’re only going to use once, we can use what is called an
anonymous function, like below:
L.sort(key=lambda x: x[1])
The lambda keyword indicates that what follows will be an anonymous function. We then have
the arguments to the function, followed by a colon and then the function code. The function code
is limited to a single expression, whose value is what the function returns.
We used anonymous functions back when working with GUIs to pass information about which
button was clicked to the callback function. Here is the code again:
for i in range(3):
    for j in range(3):
        b[i][j] = Button(command = lambda x=i,y=j: function(x,y))
23.3 Recursion
Recursion is the process where a function calls itself. One of the standard examples of recursion
is the factorial function. The factorial, n!, is the product of all the numbers from 1 up to n. For
instance, 5! = 5·4·3·2·1 = 120. Also, by convention, 0! = 1. Recursion involves defining a function
in terms of itself. Notice that, for example, 5! = 5 · 4!, and in general, n! = n · (n − 1)!. So the factorial
function can be defined in terms of itself. Here is a recursive version of the factorial function:
def fact(n):
    if n==0:
        return 1
    else:
        return n*fact(n-1)
We must specify the n = 0 case or else the function would keep calling itself forever (or at least until
Python generates an error about too many levels of recursion).
Note that the math module has a function called factorial, so this version here is just for demon-
stration. Note also that there is a non-recursive way to do the factorial, using a for loop. It is about
as straightforward as the recursive way, but faster. However, for some problems the recursive so-
lution is the more straightforward solution. Here, for example, is a program that factors a number
into prime factors.
The factor function takes two arguments: a number to factor, and a list of previously found
factors. It checks for a factor, and if it finds one, it appends it to the list. The recursive part is that it
divides the number by the factor that was found and then appends to the list all the factors of that
value. On the other hand, if the function doesn’t find any factors, it appends the number to the list,
as it must be a prime, and returns the new list.
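The program itself is not shown above, so here is a sketch that follows the description. The name factor comes from the text, but the details (the default value for the list, building a new list instead of changing the old one, and only testing divisors up to the square root) are assumptions of this sketch.

```python
from math import isqrt

def factor(num, factors=None):
    if factors is None:
        factors = []
    # look for the smallest divisor; none above the square root is needed
    for d in range(2, isqrt(num) + 1):
        if num % d == 0:
            # found one: record it, then factor what is left over
            return factor(num // d, factors + [d])
    # no divisor found, so num itself is prime
    return factors + [num]

print(factor(60))   # [2, 2, 3, 5]
print(factor(97))   # [97]
```

Because the function always finds the smallest divisor first, the factors come out in increasing order.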
map and filter Python has built-in functions called map and filter that are used to apply
functions to the contents of a list. They date back to before list comprehensions were a part of
Python, but now list comprehensions can accomplish everything these functions can. Still, you
may occasionally see code using these functions, so it is good to know about them.
The map function takes two arguments—a function and an iterable—and it applies the function
to each element of the iterable, generating a new iterable. Here is an example that takes a list of
strings and returns a list of the lengths of the strings. The first line accomplishes this with map, while
the second line uses list comprehensions:
L = list(map(len, ['this', 'is', 'a', 'test']))
L = [len(word) for word in ['this', 'is', 'a', 'test']]
The function filter takes a function and an iterable and returns an iterable of all the elements of
the list for which the function is true. Here is an example that returns all the words in a list that
have length greater than 2. The first line uses filter to do this, and the second line does it with a
list comprehension:
L = list(filter(lambda x: len(x)>2, ['this', 'is', 'a', 'test']))
L = [word for word in ['this', 'is', 'a', 'test'] if len(word)>2]
Here is one approach to finding the number of items in a list L that are greater than 60:
count=0
for i in L:
    if i>60:
        count = count + 1
Here is a second way using a list comprehension similar to the filter function:
len([i for i in L if i>60])
reduce There is another function, reduce, that applies a function to the contents of a list. It
used to be a built-in function, but in Python 3 it has been moved to the functools module.
This function cannot be easily replaced with list comprehensions. To understand it, first consider a
simple example that adds up the numbers from 1 to 100.
total = 0
for i in range(1,101):
    total = total + i
In general, reduce takes a function and an iterable, and applies the function to the elements from
left to right, accumulating the result. As another simple example, the factorial function could be
implemented using reduce:
from functools import reduce
def fact(n):
    return reduce(lambda x,y:x*y, range(1,n+1), 1)
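To see the left-to-right accumulation in action, the sketch below first redoes the sum from 1 to 100 with reduce and then records each pair of values that reduce hands to the function. The helper add_and_record is made up for this illustration.

```python
from functools import reduce

total = reduce(lambda x, y: x + y, range(1, 101))
print(total)      # 5050

steps = []
def add_and_record(x, y):
    steps.append((x, y))      # x is the running result, y is the next item
    return x + y

reduce(add_and_record, [1, 2, 3, 4])
print(steps)      # [(1, 2), (3, 3), (6, 4)]
```

The first call combines the first two items, and after that each call combines the result so far with the next item, which is the same thing the for loop does with its running total.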
In the previous section, when we needed a function to represent a Python operator like addition or
multiplication, we used an anonymous function, like below:
total = reduce(lambda x,y: x+y, range(1,101))
Python has a module called operator that contains functions that accomplish the same thing as
Python operators. These will run faster than anonymous functions. We can rewrite the above
example like this:
from functools import reduce
from operator import add
total = reduce(add, range(1,101))
The operator module has functions corresponding to arithmetic operators, logical operators, and
even things like slicing and the in operator.
You may want to write a function for which you don’t know how many arguments will be passed
to it. An example is the print function where you can enter however many things you want to
print, each separated by commas.
Python allows us to declare a special argument that collects several other arguments into a tuple.
This syntax is demonstrated below:
def product(*nums):
    prod = 1
    for i in nums:
        prod*=i
    return prod
print(product(3,4), product(2,3,4), sep='\n')
12
24
There is a similar notation, **, for collecting an arbitrary number of keyword arguments into a
dictionary. Here is a simple example:
def f(**keyargs):
    for k in keyargs:
        print(k, '**2 : ', keyargs[k]**2, sep='')
f(x=3, y=4, z=5)
x**2 : 9
y**2 : 16
z**2 : 25
You can also use these notations together with ordinary arguments. The order matters—arguments
collected by * have to come after all positional arguments and arguments collected by ** always
come last. Two example function declarations are shown below:
def func(a, b, c=5, *d, **e):
def func(a, b, *c, d=5, **e):
Calling functions The * and ** notations can be used when calling a function, too. Here is an
example:
def f(a,b):
    print(a+b)
x=(3,5)
f(*x)
This will print 8. In this case we could have more simply called f(3,5), but there are situations
when that is not possible. For example, maybe you have several different sets of arguments that
your program might use for a function. You could have several if statements, but if there are a lot
of different sets of arguments, then the * notation is much easier. Here is a simple example:
def f(a,b):
    print(a+b)
args = [(1,2), (3,4), (5,6), (7,8), (9,10)]
i = int(input('Enter a number from 0 to 4: '))
f(*args[i])
One use for the ** notation is simplifying Tkinter declarations. Suppose we have several widgets
that all have the same properties, say the same font, foreground color, and background color. Rather
than repeating those properties in each declaration, we can save them in a dictionary and then use
the ** notation in the declarations, like below:
args = {'fg':'blue', 'bg':'white', 'font':('Verdana', 16, 'bold')}
label1 = Label(text='Label 1', **args)
label2 = Label(text='Label 2', **args)
apply Python 2 has a function called apply which is, more or less, the equivalent of * and **
for calling functions. You may see it in older code.
Function variables that retain their values between calls Sometimes it is useful to have variables
that are local to a function that retain their values between function calls. Since functions are objects,
we can accomplish this by adding a variable to the function as if it were a more typical sort of object.
In the example below the variable f.count keeps track of how many times the function is called.
def f():
    f.count = f.count+1
    print(f.count)
f.count=0
Chapter 24
The itertools and collections modules
The itertools and collections modules contain functions that can greatly simplify some com-
mon programming tasks. We will start with some functions in itertools.
Permutations The permutations of a sequence are rearrangements of the items of that sequence.
For example, some permutations of [1,2,3] are [3,2,1] and [1,3,2]. Here is an example that
shows all the possibilities:
list(permutations([1,2,3]))
We can find the permutations of any iterable. Here are the permutations of the string '123':
[''.join(p) for p in permutations('123')]
The permutations function takes an optional argument that allows us to specify the size of the
permutations. For instance, if we just want all the two-element substrings possible from '123', we
can do the following:
[''.join(p) for p in permutations('123', 2)]
Note that permutations and most of the other functions in the itertools module return an
iterator. You can loop over the items in an iterator and you can use list to convert it to a list.
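For reference, here is a complete version of the examples above, including the import they need, with the results shown as comments.

```python
from itertools import permutations

print(list(permutations([1, 2, 3])))
# [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]

print([''.join(p) for p in permutations('123', 2)])
# ['12', '13', '21', '23', '31', '32']

# An iterator can also be looped over directly, without building a list
for p in permutations('ab'):
    print(p)
```

Each permutation comes back as a tuple, which is why the string examples use join to glue the pieces back together.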
Combinations If we want all the possible k-element subsets from a sequence, where all that mat-
ters is the elements, not the order in which they appear, then what we want are combinations.
For instance, the 2-element subsets that can be made from {1, 2, 3} are {1, 2}, {1, 3} and {2, 3}. We
consider {1, 2} and {2, 1} as being the same because they contain the same elements. Here is an
example showing the combinations of two-element substrings possible from '123':
[''.join(c) for c in combinations('123', 2)]
Combinations with replacement For combinations with repeated elements, use the function
combinations_with_replacement.
[''.join(c) for c in combinations_with_replacement('123', 2)]
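Running the two functions side by side shows the difference: the version with replacement also allows an item to be paired with itself. The import line is the only thing added to the examples above.

```python
from itertools import combinations, combinations_with_replacement

print([''.join(c) for c in combinations('123', 2)])
# ['12', '13', '23']

print([''.join(c) for c in combinations_with_replacement('123', 2)])
# ['11', '12', '13', '22', '23', '33']
```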
The function product produces an iterator from the Cartesian product of iterables. The Cartesian
product of two sets X and Y consists of all pairs (x, y) where x is in X and y is in Y . Here is a short
example:
[''.join(p) for p in product('abc', '123')]
Example To demonstrate the use of product, here are three progressively shorter and clearer
ways to find all one- or two-digit Pythagorean triples (values of (x, y, z) satisfying $x^2 + y^2 = z^2$).
The first way uses nested for loops:
for x in range(1,101):
    for y in range(1,101):
        for z in range(1,101):
            if x**2+y**2==z**2:
                print(x,y,z)
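The shorter versions are not shown here. As a sketch of how product can collapse the three nested loops into one, and not necessarily the way the author wrote it, consider the following. The function name triples and its limit parameter are made up for the illustration.

```python
from itertools import product

def triples(limit):
    values = range(1, limit + 1)
    # product(values, repeat=3) yields every (x, y, z) with each part in values
    return [(x, y, z) for x, y, z in product(values, repeat=3)
            if x**2 + y**2 == z**2]

print(triples(20))
```

The repeat argument saves us from writing the same range three times. Note that, like the nested loops, this reports both (3, 4, 5) and (4, 3, 5).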
The groupby function is handy for grouping things. It breaks a list up into groups by tracing
through the list and every time there is a change in values, a new group is created. The groupby
function returns ordered pairs that consist of a list item and groupby iterator object that contains
the group of items.
L = [0, 0, 1, 1, 1, 2, 0, 4, 4, 4, 4, 4]
for key,group in groupby(L):
    print(key, ':', list(group))
0 : [0, 0]
1 : [1, 1, 1]
2 : [2]
0 : [0]
4 : [4, 4, 4, 4, 4]
Notice that we get two groups of zeros. This is because groupby returns a new group each time
there is a change in the list. In the above example, if we instead just wanted to know how many of
each number occur, we can first sort the list and then call groupby.
L = [0, 0, 1, 1, 1, 2, 0, 4, 4, 4, 4, 4]
L.sort()
for key,group in groupby(L):
    print(key, ':', len(list(group)))
0 : 3
1 : 3
2 : 1
4 : 5
Most of the time, you will want to sort your data before calling groupby.
Optional argument The groupby function takes an optional argument that is a function telling
it how to group things. When using this, you usually have to first sort the list with that function as
the sorting key. Here is an example that groups a list of words by length:
L = ['this', 'is', 'a', 'test', 'of', 'groupby']
L.sort(key = len)
for key,group in groupby(L, len):
    print(key, ':', list(group))
1 : ['a']
2 : ['is', 'of']
4 : ['this', 'test']
7 : ['groupby']
First, suppose L is a list of zeros and ones, and we want to find out how long the longest run of
ones is. We can do that in one line using groupby:
max([len(list(group)) for key,group in groupby(L) if key==1])
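One thing to watch for is that max raises an error if the list contains no ones at all, since there is then nothing to take the maximum of. The sketch below wraps the same idea in a hypothetical function, longest_run, and uses the default argument of max to return 0 in that case.

```python
from itertools import groupby

def longest_run(L, value=1):
    # each group is a stretch of equal items; keep only the runs of value
    return max((len(list(group)) for key, group in groupby(L) if key == value),
               default=0)

print(longest_run([1, 1, 0, 1, 1, 1, 0, 0, 1]))   # 3
print(longest_run([0, 0, 0]))                     # 0
```

Notice that we do not sort the list first here, because this time the order of the items is exactly what we care about.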
Second, suppose we have a function called easter that returns the date of Easter in a given year.
The following code will produce a histogram of which dates occur most often from 1900 to 2099.
L = [easter(Y) for Y in range(1900,2100)]
L.sort()
for key,group in groupby(L):
    print(key, ':', '*'*(len(list(group))))
chain The chain function chains iterators together into one big iterator. For example, if you
have three lists, L, M, and N and want to print out all the elements of each, one after another, you
can do the following:
for i in chain(L,M,N):
    print(i)
As another example, in Section 8.6 we used list comprehensions to flatten a list of lists, that is to
return a list of all the elements in the lists. Here is another way to do that using chain:
L = [[1,2,3], [2,5,5], [7,8,3]]
list(chain(*tuple(L)))
[1, 2, 3, 2, 5, 5, 7, 8, 3]
count The function count() behaves like range(∞). It takes an optional argument so that
count(x) behaves like range(x,∞).
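Since count never stops on its own, it has to be cut off somehow. Two common ways, shown in the sketch below, are to take a fixed number of items with islice (another itertools function) or to break out of the loop when some condition is met.

```python
from itertools import count, islice

# take the first five values, starting at 10
print(list(islice(count(10), 5)))   # [10, 11, 12, 13, 14]

# find the first whole number whose square exceeds 200
for n in count(1):
    if n * n > 200:
        break
print(n)                            # 15
```

Calling list(count()) directly would never finish, because there is always one more number to add.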
cycle The cycle function cycles over the elements of the iterator continuously. When it gets to
the end, it starts over at the beginning, and keeps doing this forever. The following simple example
prints the numbers 0 through 4 continuously until the user enters an 'n':
for x in cycle(range(5)):
    z = input('Keep going? y or n: ')
    if z=='n':
        break
    print(x)
More about iterators There are a number of other functions in the itertools module. See the
Python documentation [1] for more. It has a nice table summarizing what the various functions do.
The collections module has a useful class called Counter. You feed it an iterable and the
Counter object that is created is something very much like a dictionary whose keys are items from
the sequence and whose values are the number of occurrences of the keys. In fact, Counter is a
subclass of dict, Python’s dictionary class. Here is an example:
Counter('aababcabcdabcde')
Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1})
Since Counter is a subclass of dict, you can access items just like in a dictionary, and most of the
usual dictionary methods work. For example:
c = Counter('aababcabcdabcde')
c['a']
list(c.keys())
list(c.values())
5
['a', 'c', 'b', 'e', 'd']
[5, 3, 4, 1, 2]
Getting the most common items The most_common method takes an integer n and returns a list
of the n most common items, arranged as (key, value) tuples. For example:
c = Counter('aababcabcdabcde')
c.most_common(2)
[('a', 5), ('b', 4)]
If we omit the argument, it returns tuples for every item, arranged in decreasing order of frequency.
To get the least common elements, we can use a slice from the end of the list returned by
most_common. Here are some examples:
c = Counter('aababcabcdabcde')
c.most_common()
c.most_common()[-2:]
c.most_common()[-2::-1]
[('a', 5), ('b', 4), ('c', 3), ('d', 2), ('e', 1)]
[('d', 2), ('e', 1)]
[('d', 2), ('c', 3), ('b', 4), ('a', 5)]
The last example uses a negative step in the slice to reverse the order. Because the slice starts at
index -2, it runs from the second least common item up to the most common one.
An example Here is a really short program that will scan through a text file and create a Counter
object of word frequencies.
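A minimal sketch of such a program might look like the following. It is an illustration under stated assumptions: the file name and its contents are placeholders written by the sketch itself so that it runs on its own, and splitting words with a regular expression is one choice among several.

```python
from collections import Counter
import os
import re
import tempfile

# Write a tiny sample file so the sketch is self-contained (placeholder name and text).
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as f:
    f.write('The cat and the hat. The end.')

with open(path) as f:
    text = f.read().lower()

# \w+ matches runs of letters, digits, and underscores, so punctuation is dropped.
words = re.findall(r'\w+', text)
c = Counter(words)
print(c['the'], c['cat'])
```

For a real file, replace path with the name of the file to scan and drop the lines that write the sample text.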
To pick out only those words that occur more than five times, we can do the following:
[word for word in c if c[word]>5]
Math with counters You can use some operators on Counter objects. Here are some examples:
c = Counter('aabbb')
d = Counter('abccc')
c+d
c-d
c&d
c|d
Doing c+d combines the counts from c and d, whereas c-d subtracts the counts from d from the
corresponding counts of c. Note that the Counter returned by c-d does not include 0 or negative
counts. The & stands for intersection and returns the minimum of the two values for each item, and
| stands for union and returns the maximum of the two values.
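The four operators can be checked with a short sketch. Each result is converted to a plain dict and compared with the expected counts, which avoids relying on the order in which a Counter prints its items.

```python
from collections import Counter

c = Counter('aabbb')   # a: 2, b: 3
d = Counter('abccc')   # a: 1, b: 1, c: 3

total = dict(c + d)        # counts added together
difference = dict(c - d)   # c would be 0 - 3 = -3, so it is left out
smallest = dict(c & d)     # minimum of the two counts for each item
largest = dict(c | d)      # maximum of the two counts for each item

print(total == {'a': 3, 'b': 4, 'c': 3})
print(difference == {'a': 1, 'b': 2})
print(smallest == {'a': 1, 'b': 1})
print(largest == {'a': 2, 'b': 3, 'c': 3})
```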
24.6 defaultdict
The collections module has another dictionary-like class called defaultdict. It is almost
exactly like an ordinary dictionary except that when you create a new key, a default value is given
to the key. Here is an example that mimics what the Counter class does.
s = 'aababcabcdabcd'
dd = defaultdict(int)
for c in s:
    dd[c] += 1
If we had tried this with dd just a regular dictionary, we would have gotten an error the first time
the program reached dd[c]+=1 because dd[c] did not yet exist. But since we declared dd to be
defaultdict(int), each value is automatically assigned a value of 0 upon creation, and so we
avoid the error. Note that we could use a regular dictionary if we add an if statement into the
loop, and regular dictionaries also have a setdefault method that lets you supply a default value, but
defaultdict runs faster.
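For comparison, here is a sketch of the two regular-dictionary approaches, the if statement and the setdefault method of dict. Both build the same counts as the defaultdict version; the names d1 and d2 are just for this illustration.

```python
s = 'aababcabcdabcd'

# Option 1: an if statement that creates the key the first time it is seen.
d1 = {}
for c in s:
    if c not in d1:
        d1[c] = 0
    d1[c] += 1

# Option 2: setdefault creates the key with value 0 only if it is missing.
d2 = {}
for c in s:
    d2.setdefault(c, 0)
    d2[c] += 1

print(d1)
print(d1 == d2)
```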
We can use types other than integers. Here is an example with strings:
s = 'aababcabcdabcd'
dd = defaultdict(str)
for c in s:
    dd[c] += '*'
Use list for lists, set for sets, dict for dictionaries, and float for floats. You can use various
other classes, too. The default value for integers is 0, for lists is [], for sets is set(), for dictionaries
is {} and for floats is 0.0. If you would like a different default value, you can use an anonymous
function like below:
dd = defaultdict(lambda:100)
Used with the code from the first example, this will produce:
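Each count now starts from 100 rather than 0. A minimal sketch that reproduces this, assuming the string s from the first example and converting the result to a plain dict so it prints cleanly:

```python
from collections import defaultdict

s = 'aababcabcdabcd'
dd = defaultdict(lambda: 100)   # every new key starts at 100
for c in s:
    dd[c] += 1

print(dict(dd))   # {'a': 105, 'b': 104, 'c': 103, 'd': 102}
```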
Exceptions
25.1 Basics
If you are writing a program that someone else is going to use, you don’t want it to crash if an error
occurs. Say your program is doing a bunch of calculations, and at some point you have the line
c=a/b. If b ever happens to be 0, you will get a division by zero error and the program will crash.
Here is an example:
a = 3
b = 0
c = a/b
print('Hi there')
Once the error occurs, none of the code after c=a/b will get executed. In fact, if the user is not
running the program in IDLE or some other editor, they won’t even see the error. The program will
just stop running and probably close.
When an error occurs, an exception is generated. You can catch this exception and allow your
program to recover from the error without crashing. Here is an example:
a = 3
b = 0
try:
    c = a/b
except ZeroDivisionError:
    print('Calculation error')
print('Hi there')
Calculation error
Hi there
Different possibilities We can have multiple statements in the try block and also multiple
except blocks, like below:
try:
    a = eval(input('Enter a number: '))
    print(3/a)
except NameError:
    print('Please enter a number.')
except ZeroDivisionError:
    print("Can't enter 0.")
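The same pattern can be tried without typing anything by wrapping it in a function. This sketch is a variation, not the code above: the function name three_over is made up, and it converts the text with float instead of eval, so text that is not a number raises ValueError rather than NameError.

```python
def three_over(text):
    """Return 3 divided by the number in text, or a message if that fails."""
    try:
        a = float(text)
        return 3 / a
    except ValueError:
        return 'Please enter a number.'
    except ZeroDivisionError:
        return "Can't enter 0."

print(three_over('4'))
print(three_over('abc'))
print(three_over('0'))
```

Only one except block runs for a given error: Python uses the first one whose exception type matches.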
Not specifying the exception You can leave off the name of the exception, like below:
try:
    a = eval(input('Enter a number: '))
    print(3/a)
except:
    print('A problem occurred.')
It is generally not recommended that you do this, however, as this will catch every exception,
including ones that maybe you aren’t anticipating when you write the code. This will make it hard
to debug your program.
Using the exception When you catch an exception, information about the exception is stored in
an Exception object. Below is an example that passes the exception's message to the user:
try:
    c = a/0
except Exception as e:
    print(e)
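Printing the exception object shows its message. If the name of the exception is wanted as well, it can be read from the object's type, as in this small sketch; the variable names name and message are just for illustration.

```python
a = 3
try:
    c = a / 0
except Exception as e:
    name = type(e).__name__   # the exception's class name
    message = str(e)          # the same text that print(e) shows
    print(name, '-', message)
```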
25.2 Try/except/else
You can use an else clause along with try/except. Here is an example:
try:
    file = open('filename.txt', 'r')
except IOError:
    print('Could not open file')
else:
    s = file.read()
    print(s)
In this example, if filename.txt does not exist, an input/output exception called IOError is
generated. On the other hand, if the file does exist, there may be certain things that we want to do
with it. We can put those in the else block.
25.3 Try/finally and with/as
There is one more block you can use called finally. Things in the finally block are things that
must be executed whether or not an exception occurs. So even if an exception occurs and your
program crashes, the statements in the finally block will be executed. One such thing is closing
a file. If you just put the file closing after the try block and not in the finally block, the program
would crash before getting to close the file.
f = open('filename.txt', 'w')
s = 'hi'
try:
    # some code that could potentially fail goes here
    f.write(s)
finally:
    f.close()
The finally block can be used along with except and else blocks. This sort of thing with files
is common enough that it has its own syntax:
s = 'hi'
with open('filename.txt', 'w') as f:
    print(s, file=f)
This is an example of something called a context manager. Context managers and try/finally are
mostly used in more complicated applications, like network programming.
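To illustrate what the with statement guarantees, the sketch below raises an error on purpose inside the block and then checks that the file was closed anyway. The temporary file location and the simulated error are assumptions made so the example can run by itself.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.txt')  # throwaway location

try:
    with open(path, 'w') as f:
        print('hi', file=f)
        raise ValueError('something went wrong')  # simulate a failure mid-write
except ValueError:
    pass

print(f.closed)   # the with block closed the file despite the error
```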
There is a lot more that can be done with exceptions. See the Python documentation [1] for all the
different types of exceptions. In fact, not all exceptions come from errors. There is even a statement
called raise that you can use to raise your own exceptions. This is useful if you are writing your
own classes to be used in other programs and you want to send messages, like error messages, to
people who are using your class.
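As a brief illustration of raise, here is a sketch of a class that reports a problem to its caller by raising an exception. The Account class and its message are hypothetical and exist only for this example.

```python
class Account:
    """A hypothetical class, used only to illustrate raise."""
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            # Signal the problem to whoever is using the class.
            raise ValueError('Insufficient funds')
        self.balance -= amount

acct = Account(50)
try:
    acct.withdraw(80)
except ValueError as e:
    print('Withdrawal failed:', e)
print(acct.balance)
```

The code that uses the class decides how to respond, here by catching the ValueError and printing a message; the balance is left unchanged because the exception is raised before the subtraction.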
Bibliography
[The Python documentation is terrific. It is nicely formatted, extensive, and easy to find things.]
[4] Lutz, Mark. Learning Python, 5th ed. O’Reilly Media, 2013.
[I first learned Python from the third edition. It is long, but has a lot of good information.]
[5] Lutz, Mark. Programming Python, 4th ed. O’Reilly Media, 2011.
[This is a more advanced book. There is some good information in here, especially the Tkinter chapters.]
[6] Beazley, David. The Python Essential Reference, 4th ed. Addison-Wesley Professional, 2009.