Network Programming
Lecture 2
HTTP and Working with the Web
• The urllib package is broken into several submodules for dealing with the
different tasks that we may need to perform when working with HTTP.
• For making requests and receiving responses, we employ the urllib.request
module.
Requests with urllib
• We get the data of the requested resource through a file-like interface using
the readline() and read() methods.
• This is how we use the read() method:
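A minimal sketch of reading a response through the file-like interface. A real call would pass a web address such as http://www.debian.org (an assumed example, not taken from the slides). The data: URL below is only a stand-in, which urlopen() also accepts, so that the sketch runs without network access.

```python
from urllib.request import urlopen

# Stand-in URL (assumption): a data: URL carrying two short lines of text.
# For a real site, use something like urlopen("http://www.debian.org").
url = "data:text/plain,first%20line%0Asecond%20line%0A"

response = urlopen(url)
first_line = response.readline()  # read up to and including the first newline
chunk = response.read(6)          # read at most the next 6 bytes
rest = response.read()            # read whatever is left
response.close()

print(first_line, chunk, rest)
```

Note that the data comes back as bytes, not str, so it has to be decoded before it can be treated as text.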
Response objects
• The file-like interface is limited. Once the data has been read, it's not
possible to go back and re-read it with either of the aforementioned
methods. To demonstrate this, try doing the following:
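One way to see this, sketched with a data: URL as a network-free stand-in for a real address (an assumption made so the example runs anywhere):

```python
from urllib.request import urlopen

# Stand-in URL (assumption); a real HTTP response behaves the same way here.
response = urlopen("data:text/plain,hello")

first_read = response.read()   # consumes the whole body
second_read = response.read()  # nothing is left to read
response.close()

print(first_read)
print(second_read)
```

The second call returns an empty bytes object, so if the body is needed more than once it should be kept in a variable after the first read.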
Status codes
Status codes are three-digit integers that tell us how the request went.
Handling problems
• Any status code in the 200 range indicates success, whereas a code in the
400 range (client error) or the 500 range (server error) indicates failure.
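The ranges above can be sketched as a small classifier, together with one possible way of catching failures. The names describe_status and fetch_status are hypothetical helpers, not part of urllib. The sketch relies on urlopen() raising urllib.error.HTTPError for 4xx and 5xx responses.

```python
from http import HTTPStatus
from urllib.error import HTTPError
from urllib.request import urlopen


def describe_status(code):
    """Hypothetical helper: classify a standard status code by its range."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for non-standard codes
    if 200 <= code < 300:
        outcome = "success"
    elif 400 <= code < 600:
        outcome = "failure"
    else:
        outcome = "other"
    return f"{code} {phrase}: {outcome}"


def fetch_status(url):
    """Hypothetical helper: return the status code, even for failed requests."""
    try:
        with urlopen(url) as response:
            return response.status
    except HTTPError as error:
        # 4xx and 5xx responses arrive here as exceptions
        return error.code


print(describe_status(200))
print(describe_status(404))
```

fetch_status() needs network access, so it is defined but not called here.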
Handling problems
• Requests and responses are made up of two main parts: headers and a
body.
• We briefly saw some HTTP headers when we used our TCP RFC
downloader in Chapter 1, Network Programming and Python.
• Headers are the lines of protocol-specific information that appear at
the beginning of the raw message that is sent over the TCP connection.
• The headers are separated from the body by a blank line.
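To make the layout concrete, here is a sketch that splits a hand-written raw response into its headers and body at the blank line. The message bytes are invented for illustration.

```python
# A hand-written raw HTTP response (illustrative data, not captured traffic).
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"          # the blank line that ends the headers
    b"hello"
)

# Everything before the first blank line is the head, the rest is the body.
head, _, body = raw.partition(b"\r\n\r\n")

# The first line of the head is the status line; the others are headers.
status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
headers = dict(line.split(": ", 1) for line in header_lines)

print(status_line)
print(headers)
print(body)
```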
HTTP headers
• HTTP allows the client to supply the hostname in the
HTTP request by including a Host header.
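A rough sketch of where the Host header sits in a request. The request text is assembled by hand purely to show the wire format; urllib builds the real request itself and adds further headers. The URL is an assumed example.

```python
from urllib.request import Request

# Assumed example URL; no network traffic is generated by creating a Request.
req = Request("http://www.debian.org/")

# urllib splits the URL into the host and the path (selector) to request.
raw_request = "GET {} HTTP/1.1\r\nHost: {}\r\n\r\n".format(req.selector, req.host)

print(raw_request)
```

Because the hostname travels in the request itself, one server at one IP address can serve several websites.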
HTTP headers
• The server can use headers to inform the client about things such as the
length of the body, the type of content the response body contains, and
the cookie data that the client should store.
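A small sketch of reading such headers from a response object through its headers attribute. A data: URL is used as a network-free stand-in (an assumption); it carries only a content type and a length, so no cookie header is present.

```python
from urllib.request import urlopen

# Stand-in URL (assumption); a real HTTP response exposes headers the same way.
with urlopen("data:text/plain,hello") as response:
    content_type = response.headers["Content-Type"]       # lookup is case-insensitive
    content_length = int(response.headers["Content-Length"])
    cookie = response.headers.get("Set-Cookie")           # None when the header is absent

print(content_type, content_length, cookie)
```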
Customizing requests
• Having asked for Swedish content with an Accept-Language header, we can check if the response is in Swedish by printing out the first few lines:
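A sketch of that check, assuming the language was requested through an Accept-Language header. first_lines is a hypothetical helper name, and http://www.debian.org is an assumed example site that offers translated pages.

```python
from urllib.request import Request, urlopen


def first_lines(url, language, count=5):
    """Hypothetical helper: request a language and return the first lines."""
    req = Request(url)
    req.add_header("Accept-Language", language)
    with urlopen(req) as response:
        return [response.readline() for _ in range(count)]


# Network-free stand-in (assumption): a data: URL ignores the header.
# Against a real site: first_lines("http://www.debian.org", "sv")
sample = first_lines("data:text/plain,a%0Ab%0Ac%0A", "sv", count=2)
print(sample)
```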
Customizing requests
The urlopen() function adds some of its own headers when we run it on a
request:
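The default headers can be inspected without sending anything, as in this sketch. It looks at the opener object that urllib uses internally; the URL is an assumed example.

```python
from urllib.request import Request, build_opener

req = Request("http://www.debian.org")  # assumed example URL
headers_before = req.header_items()     # nothing has been added yet

# An opener carries the headers it will add to every request it sends.
opener = build_opener()
default_headers = opener.addheaders

print(headers_before)
print(default_headers)
# After an actual urlopen(req) call, req.header_items() should also list
# the User-agent and Host headers that were added on the way out.
```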
Customizing requests
• A shortcut for adding headers is to add them at the same time that we
create the request object, as shown here:
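A minimal sketch of the shortcut, using the headers argument of Request. The URL and the header value are assumed examples.

```python
from urllib.request import Request

# Assumed example header and URL.
headers = {"Accept-Language": "sv"}
req = Request("http://www.debian.org", headers=headers)

# Request normalizes header names with str.capitalize().
print(req.header_items())
```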
Content compression
• We can then decompress the body data by using the gzip module:
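A sketch of the decompression step. decode_body is a hypothetical helper, and the compressed bytes are produced locally as a stand-in for what response.read() would return from a server that answered with Content-Encoding: gzip.

```python
import gzip


def decode_body(raw_body, content_encoding):
    """Hypothetical helper: undo gzip compression if the server applied it."""
    if content_encoding == "gzip":
        return gzip.decompress(raw_body)
    return raw_body


# Stand-in for a compressed response body (assumption).
sample = gzip.compress(b"<html>hello</html>")
page = decode_body(sample, "gzip")
print(page)
```

With a real response, the second argument would typically come from response.getheader("Content-Encoding").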
Content compression
• Let's see what happens if we ask for no compression by using the identity encoding:
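One way to try this is sketched below. fetch_with_encoding is a hypothetical helper that sends an Accept-Encoding header and reports the Content-Encoding the server chose. The data: URL is a network-free stand-in (an assumption) and never reports an encoding.

```python
from urllib.request import Request, urlopen


def fetch_with_encoding(url, encoding="identity"):
    """Hypothetical helper: return (Content-Encoding, body) for a request."""
    req = Request(url, headers={"Accept-Encoding": encoding})
    with urlopen(req) as response:
        return response.headers.get("Content-Encoding"), response.read()


# Against a real site: fetch_with_encoding("http://www.debian.org")
content_encoding, body = fetch_with_encoding("data:text/plain,hello")
print(content_encoding, body)
```

When a server honours the identity encoding, the body arrives uncompressed and no Content-Encoding header is expected.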
Multiple values