PP Handout 4
EXCEPTION HANDLING
Errors in a program fall into two categories:
1. Syntax errors
2. Runtime errors (exceptions)
That is, an error is either a syntax error or an exception.
Syntax Errors: Errors caused by invalid syntax are called syntax errors. They arise when a program
does not follow the rules of the programming language, and they are also known as parsing errors.
On encountering a syntax error, the interpreter does not execute the program until the error is
corrected.
Example 1:
x=10
if x==10        # the colon (:) is missing here, which causes a SyntaxError
    print("Hello")
Example 2:
print "Hello"   # the parentheses are missing here, which causes a SyntaxError
Runtime Errors:
A runtime error is also known as an exception. An exception is an error that occurs while the program is running.
Example:
print(10/0) # This will cause ZeroDivisionError: division by zero
print(10/"a") # This will cause TypeError: unsupported operand type(s) for /: 'int' and 'str'
Output:
Traceback (most recent call last):
  File "c:\Users\hp\ex24.py", line 23, in <module>
    print(10/0)
ZeroDivisionError: division by zero
What is an Exception?
An exception is an unexpected event that disrupts the normal flow of execution. Even if a statement or
expression is syntactically correct, an error may still arise when it is executed, for example when trying
to open a file that does not exist or dividing by zero.
1. Built-in exceptions: They are also called pre-defined exceptions. Python's standard library defines an
extensive set of built-in exceptions that cover commonly occurring errors.
When a built-in exception is raised and not handled, the interpreter's default handler prints a traceback
showing the exception name and the reason for the error.
Exception handling allows the program to continue running even if an error occurs. It provides an
alternative path so that execution can continue normally, which avoids abnormal termination of the
program.
Exception handling in Python is achieved using the try, except, else and finally blocks.
try:
    # code that may cause an exception
The try block contains the code that may raise an error. If an error occurs in the try block, control
passes to the except blocks. If no error occurs, control passes to the else block, or to the finally block
if no else block is present.
except <error type>:
    # Executed if the try block raises an error. This block can handle a specific error type
    # (ZeroDivisionError, AssertionError, etc.), or it can catch all errors when the type is left blank.
else:
    # This block executes if the try block completes without errors
If the try block runs without errors, the else block is executed. If there is an error, whether or not
it is caught, the else block is skipped.
finally:
    # This block is always executed
After the try, except and else blocks, the finally block always runs, even if the error in the try block
was not caught or an error occurred in an except or else block.
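The four blocks described above can be seen working together in a short sketch. The function name safe_divide and the steps list are illustrative assumptions, not part of the handout; the list simply records which blocks ran.

```python
def safe_divide(a, b):
    """Hypothetical helper: divide a by b and record which blocks executed."""
    steps = []
    result = None
    try:
        steps.append("try")
        result = a / b              # may raise ZeroDivisionError
    except ZeroDivisionError:
        steps.append("except")      # runs only when the division fails
    else:
        steps.append("else")        # runs only when the try block succeeds
    finally:
        steps.append("finally")     # runs in both cases
    return result, steps


print(safe_divide(10, 2))  # (5.0, ['try', 'else', 'finally'])
print(safe_divide(10, 0))  # (None, ['try', 'except', 'finally'])
```

Note that else and except never both run for the same call, while finally runs every time.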
try:
Note: When multiple except blocks are used, their order is important. The Python interpreter checks the
except blocks from top to bottom and runs the first one that matches the raised exception.
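To see why the order matters, the sketch below handles the same error with two different orderings. It relies on the fact that ZeroDivisionError is a subclass of ArithmeticError; the function names are illustrative assumptions.

```python
def specific_first(func):
    """Child exception listed before its parent: the specific handler runs."""
    try:
        func()
    except ZeroDivisionError:
        return "ZeroDivisionError handler"
    except ArithmeticError:
        return "ArithmeticError handler"
    return "no error"


def general_first(func):
    """Parent exception listed first: it matches, so the child handler is never reached."""
    try:
        func()
    except ArithmeticError:
        return "ArithmeticError handler"
    except ZeroDivisionError:
        return "ZeroDivisionError handler"
    return "no error"


print(specific_first(lambda: 1 / 0))  # ZeroDivisionError handler
print(general_first(lambda: 1 / 0))   # ArithmeticError handler
```

For this reason, the more specific exception types are usually listed before the more general ones.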
A single except block can also handle several different exception types:
except (Exception1, Exception2, Exception3, ...):
(or)
except (Exception1, Exception2, Exception3, ...) as msg:
Example:
try:
    x=int(input("Enter First Number: "))
    y=int(input("Enter Second Number: "))
    print(x/y)
except (ZeroDivisionError,ValueError) as msg:
    print("Please provide valid numbers only. The problem is:",msg)
There are other cases where values entered by the user or produced by some piece of code are invalid
for our program. In those cases, we can raise an exception manually.
We can use raise to throw an exception when a condition occurs. The statement can also be used with
a custom exception.
The following example shows how to throw an error, using raise, when a certain condition occurs:
except ValueError:
    print("ValueError Exception thrown")
In the example above, we are taking the input from the user inside try clause. Then we are checking if
the number entered by the user is non-positive. If it is, then we are manually raising the ValueError
exception by writing raise ValueError(). We are also handling this exception using an except clause. We
Compiled by – G Sreenivasulu, Associate Professor of CSE, Page 2
chose to raise ValueError because ValueError exceptions are normally raised when a value is
incorrect. We could raise any other type of exception when the entered value is not positive, but it
is advisable to choose the exception type that best matches the reason for raising it.
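The description above can be sketched as follows. This is a reconstruction under assumptions, not the handout's original listing: the function name read_positive and its return strings are hypothetical, and the value is passed in as text instead of being read with input() so that the example can be run non-interactively.

```python
def read_positive(text):
    """Hypothetical helper: accept text only if it is a positive integer."""
    try:
        number = int(text)          # int() itself raises ValueError for non-numeric text
        if number <= 0:
            # Manually raise the exception when the value is non-positive
            raise ValueError("number must be positive")
        return f"Accepted: {number}"
    except ValueError:
        return "ValueError Exception thrown"


print(read_positive("5"))    # Accepted: 5
print(read_positive("-3"))   # ValueError Exception thrown
print(read_positive("abc"))  # ValueError Exception thrown
```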
Note that all classes are subclasses of the object class, which is a built-in class in Python. Similarly,
all built-in, non-system-exiting exception classes are direct or indirect subclasses of the built-in
Exception class (which itself derives from BaseException), and user-defined exceptions should derive
from it too. Thus, exceptions like IndexError, TypeError, ValueError, etc. are subclasses of the
Exception class.
Example 1:
class MyCustomError(Exception):
    pass
Here MyCustomError inherits from the Exception class. It can be raised with the raise keyword by
writing raise MyCustomError.
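A minimal sketch of raising and catching this custom exception is shown below. The function risky and the message text are illustrative assumptions.

```python
class MyCustomError(Exception):
    """User-defined exception that inherits from Exception."""
    pass


def risky():
    # Raise the custom exception with an explanatory message
    raise MyCustomError("something went wrong")


try:
    risky()
except MyCustomError as error:
    message = f"{type(error).__name__}: {error}"
    print(message)  # MyCustomError: something went wrong
```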
Example 2:
# defining exceptions
class Error(Exception):
    """Base class for all exceptions"""
    pass
class PasswordSmallError(Error):
    """Raised when the input password is small"""
    pass
class PasswordLargeError(Error):
    """Raised when the input password is large"""
    pass
try:
    password = input("Enter a password")
    if len(password) < 6:
        raise PasswordSmallError("Password is short!")
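The listing above stops partway through the try block. One way it might be completed is sketched here. The upper limit MAX_LENGTH = 12, the message "Password is long!" and the helper names check_password and describe are assumptions, since the handout does not show them; only the minimum length of 6 comes from the listing.

```python
class Error(Exception):
    """Base class for the password exceptions."""


class PasswordSmallError(Error):
    """Raised when the input password is too short."""


class PasswordLargeError(Error):
    """Raised when the input password is too long."""


MIN_LENGTH = 6    # taken from the handout's check
MAX_LENGTH = 12   # assumption: the handout does not state an upper limit


def check_password(password):
    """Raise a custom exception if the password length is out of range."""
    if len(password) < MIN_LENGTH:
        raise PasswordSmallError("Password is short!")
    if len(password) > MAX_LENGTH:
        raise PasswordLargeError("Password is long!")


def describe(password):
    """Handle each custom exception separately and report the outcome."""
    try:
        check_password(password)
    except PasswordSmallError as error:
        return f"PasswordSmallError: {error}"
    except PasswordLargeError as error:
        return f"PasswordLargeError: {error}"
    else:
        return "Password accepted"


print(describe("abc"))        # PasswordSmallError: Password is short!
print(describe("secret12"))   # Password accepted
```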
The name and message of a caught exception can be printed as follows:
try:
    res = 190 / 0
except Exception as error:
    # handle the exception
    print("An exception occurred:", type(error).__name__) # An exception occurred: ZeroDivisionError
try:
    print("Here's variable x:", x)
except Exception as error:
    print("An error occurred:", type(error).__name__, "–", error) # An error occurred: NameError – name 'x' is not defined
try:
    res = 190 / 0
except Exception as error:
    # handle the exception
    print("An exception occurred:", type(error).__name__, "–", error) # An exception occurred: ZeroDivisionError – division by zero
A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern. It can be
used to check whether a string contains the specified pattern and, more generally, to match, search,
replace and manipulate textual data. Python has a built-in package called re for working with
regular expressions.
The re module provides the following commonly used functions:
match()
fullmatch()
search()
findall()
finditer()
sub()
subn()
split()
compile()
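Of the functions listed above, compile() builds a reusable pattern object whose methods mirror the module-level functions. The short sketch below uses an arbitrary pattern and target string chosen for illustration.

```python
import re

# Compile the pattern once, then reuse the pattern object several times
pattern = re.compile("ab")

starts_with_ab = pattern.match("abcab") is not None
all_matches = pattern.findall("abcab")
replaced = pattern.sub("#", "abcab")

print(starts_with_ab)  # True
print(all_matches)     # ['ab', 'ab']
print(replaced)        # #c#
```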
match():
We can use the match() function to check for the given pattern at the beginning of the target string.
If a match is found we get a Match object; otherwise we get None.
import re
s=input("Enter pattern to check: ")
m=re.match(s, "abcabdefg")
if m is not None:
    print("Match is available at the beginning of the String")
    print("Start Index:", m.start(), "and End Index:", m.end())
else:
    print("Match is not available at the beginning of the String")
Output:
Enter pattern to check: abc
Match is available at the beginning of the String
Start Index: 0 and End Index: 3
Output:
Enter pattern to check: bde
Match is not available at the beginning of the String
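fullmatch():
We can use the fullmatch() function to check whether the entire target string matches the given
pattern. If it does we get a Match object; otherwise we get None.
import re
s=input("Enter pattern to check: ")
m=re.fullmatch(s, "ababab")
if m is not None:
    print("Full String Matched")
else:
    print("Full String not Matched")
Output
Enter pattern to check: ab
Full String not Matched
Output
Enter pattern to check: ababab
Full String Matched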
search():
We can use the search() function to search for the given pattern anywhere in the target string. If a
match is found, it returns a Match object representing the first occurrence of the match. If no match
is found, it returns None.
import re
s=input("Enter pattern to check: ")
m=re.search(s, "abaaaba")
if m is not None:
    print("Match is available")
    print("First Occurrence of match with start index:",m.start(),"and end index:", m.end())
else:
    print("Match is not available")
Output
Enter pattern to check: aaa
Match is available
First Occurrence of match with start index: 2 and end index: 5
Output
Enter pattern to check: bbb
Match is not available
findall():
Finds all occurrences of the match. This function returns a list containing all the matches.
import re
Output
['7', '9', '5']
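The listing for findall() is incomplete in this handout. A minimal sketch that produces the output shown is given below; the pattern "[0-9]" and the target string "a7b9c5kz" are assumptions chosen to reproduce that output.

```python
import re

# Collect every single digit found in the target string
digits = re.findall("[0-9]", "a7b9c5kz")
print(digits)  # ['7', '9', '5']
```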
finditer():
Returns an iterator yielding a Match object for each match. On each Match object we can call the
start(), end() and group() methods. Many examples using this function were covered at the start of
the chapter.
import re
itr=re.finditer("[a-z]","a7b9c5k8z")
for m in itr:
    print(m.start(),"...",m.end(),"...", m.group())
Output
0 ... 1 ... a
2 ... 3 ... b
4 ... 5 ... c
6 ... 7 ... k
8 ... 9 ... z
sub():
sub means substitution or replacement.
re.sub(regex, replacement, targetstring)
Every match of the pattern in the target string is replaced with the provided replacement.
import re
s=re.sub("[a-z]","#","a7b9c5k8z")
print(s)
Output
#7#9#5#8#
subn():
It works exactly like sub(), except that it also returns the number of replacements. This function
returns a tuple whose first element is the result string and whose second element is the number of
replacements.
import re
t=re.subn("[a-z]","#","a7b9c5k8z")
print(t)
print("The Result String:", t[0])
print("The number of replacements:", t[1])
Output
('#7#9#5#8#', 5)
The Result String: #7#9#5#8#
The number of replacements: 5
split():
If we want to split the target string according to a particular pattern, we use the split() function.
This function returns a list of all tokens.
import re
l=re.split(",", "sunny,bunny,chinny,vinny,jinny")
print(l)
for t in l:
    print(t)
Output
['sunny', 'bunny', 'chinny', 'vinny', 'jinny']
sunny
bunny
chinny
vinny
jinny
^ symbol:
We can use the ^ symbol to check whether the target string starts with the given pattern. In the
example below, if the target string starts with Learn, search() returns a Match object; otherwise it
returns None.
Syntax
res=re.search("^Learn", s)
import re
s="Learning Python is Very Easy"
res=re.search("^Learn", s)
if res is not None:
    print("Target String starts with Learn")
else:
    print("Target String does not start with Learn")
Output:
Target String starts with Learn
$ symbol:
Similarly, the $ symbol checks whether the target string ends with the given pattern. If we want to
ignore case, we pass re.IGNORECASE as the third argument to search().
res = re.search("easy$", s, re.IGNORECASE)
import re
s="Learning Python is Very Easy"
res=re.search("easy$", s, re.IGNORECASE)
if res is not None:
    print("Target String ends with Easy by ignoring case")
else:
    print("Target String does not end with Easy by ignoring case")
SAMPLE PROGRAMS
PROGRAM 1:
Let's consider the following rules for a valid identifier:
1. The allowed characters are a-z, A-Z, 0-9 and #.
2. The first character should be a lower-case letter from a to k.
3. The second character should be a digit divisible by 3.
4. The length of the identifier should be at least 2.
Write a Python program to check whether a given string follows the above rules.
import re
s=input("Enter string:")
m=re.fullmatch("[a-k][0369][a-zA-Z0-9#]*",s)
if m is not None:
    print(s, "is a valid identifier")
else:
    print(s, "is not a valid identifier")
PROGRAM 2:
Write a Regular Expression to represent all 10 digit mobile numbers
Rules:
1. Every number should contain exactly 10 digits
2. The first digit should be 7 or 8 or 9
Regular Expressions:
[7-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
or
[7-9][0-9]{9}
or
[7-9]\d{9}
import re
n=input("Enter number:")
# A raw string keeps \d from being treated as a string escape sequence
m=re.fullmatch(r"[7-9]\d{9}",n)
if m is not None:
    print("Valid Mobile Number")
else:
    print("Please enter valid Mobile Number")
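PROGRAM 3:
Write a Python program to extract all mobile numbers present in input.txt and write them to output.txt.
import re
# The with statement closes both files automatically
with open("input.txt", "r") as f1, open("output.txt", "w") as f2:
    for line in f1:
        items = re.findall(r"[7-9]\d{9}", line)
        for n in items:
            f2.write(n+"\n")
print("Extracted all Mobile Numbers into output.txt")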
PROGRAM 4:
Write a Python program to check whether a given mail id is a valid Gmail id.
import re
s=input("Enter Mail id:")
m=re.fullmatch(r"\w[a-zA-Z0-9_.]*@gmail[.]com", s)
if m is not None:
    print("Valid Mail Id")
else:
    print("Invalid Mail id")
PROGRAM 5:
Write a Python program to check whether a given car registration number is a valid Telangana State
registration number.
import re
s=input("Enter Vehicle Registration Number:")
m=re.fullmatch(r"TS[012][0-9][A-Z]{2}\d{4}",s)
if m is not None:
    print("Valid Vehicle Registration Number")
else:
    print("Invalid Vehicle Registration Number")
PROGRAM 6:
Write a Python program to check whether a given PAN number is valid.
A valid PAN card number must satisfy the following conditions:
1. It should be ten characters long.
2. The first five characters should be upper-case letters.
3. The next four characters should be digits from 0 to 9.
4. The last (tenth) character should be an upper-case letter.
5. It should not contain any white space.
Where:
[A-Z]{5} represents the first five upper-case letters, which can be A to Z.
[0-9]{4} represents the four digits, which can be 0-9.
[A-Z]{1} represents the one upper-case letter, which can be A to Z.
^ -> Matches the beginning of the string.
$ -> Matches the end of the string.
? -> Matches zero or one occurrence of the preceding element.
(fullmatch() anchors the pattern at both ends, so ^ and $ are not needed in the code below.)
import re
if re.fullmatch("[A-Z]{5}[0-9]{4}[A-Z]{1}", "DWKPK3344E"):
    print("Valid PAN Number")
else:
    print("Not a Valid PAN Number")
Character Classes
With Character Classes or Character Sets you can tell the regex engine to match only one out of several
characters. Simply place the characters you want to match between square brackets. If you want to
match an a or an e, use [ae]. You could use this in gr[ae]y to match either gray or grey. A character class
matches only a single character. gr[ae]y does not match graay, gra ey or any such thing.
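The gr[ae]y behaviour described above can be checked with a short sketch. The word list and the results dictionary are illustrative assumptions.

```python
import re

pattern = re.compile("gr[ae]y")

# fullmatch() succeeds only when the whole word fits the pattern
words = ["gray", "grey", "graay", "gra ey", "groy"]
results = {word: bool(pattern.fullmatch(word)) for word in words}

for word, matched in results.items():
    print(repr(word), "->", matched)
```

Only 'gray' and 'grey' report True, because the character class stands for exactly one character that must be either a or e.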
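Example:
import re
matcher=re.finditer("a","abaabaaab")
for match in matcher:
    print(match.start(),"......",match.group())
Output:
0 ...... a
2 ...... a
3 ...... a
5 ...... a
6 ...... a
7 ...... a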
Example:
import re
matcher=re.finditer("a+","abaabaaab")
for match in matcher:
    print(match.start(),"......",match.group())
Output
0 ...... a
2 ...... aa
5 ...... aaa
Example:
import re
matcher=re.finditer("a*","abaabaaab")
for match in matcher:
    print(match.start(),"......",match.group())
Output
0 ...... a
1 ......
2 ...... aa
4 ......
5 ...... aaa
8 ......
9 ......
Example:
import re
matcher=re.finditer("a?","abaabaaab")
for match in matcher:
    print(match.start(),"......",match.group())
Output
0 ...... a
1 ......
2 ...... a
3 ...... a
4 ......
5 ...... a
6 ...... a
7 ...... a
8 ......
9 ......
Example:
import re
matcher=re.finditer("a{3}","abaabaaab")
for match in matcher:
    print(match.start(),"......",match.group())
Output
5 ...... aaa
Example:
import re
matcher=re.finditer("a{2,4}","abaabaaab")
for match in matcher:
    print(match.start(),"......",match.group())
Output
2 ...... aa
5 ...... aaa
Metacharacters are characters with a special meaning that affect how the regular expressions around
them are interpreted. We can further classify metacharacters into identifiers and modifiers.
Identifiers are used to recognise a certain type of character. For example, to find all the digit
characters in a string we can use the identifier '\d'.
import re
string="Hello I live on street 9 which is near street 23"
print(re.findall(r"\d",string))
Output:
['9', '2', '3']
In the above output the single-digit number is returned as is, but the two-digit number is split into two
separate digits. If we use two identifiers ('\d\d'), we find only two-digit numbers. This problem can be
solved with modifiers.
Modifiers are a set of metacharacters that add more functionality to identifiers. For example, the
modifier '+' lets us get numbers of any length from the string: it matches 1 or more repetitions of the
preceding identifier.
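The difference between one identifier, two identifiers and an identifier with the + modifier can be sketched on the same string. The variable names are illustrative assumptions.

```python
import re

string = "Hello I live on street 9 which is near street 23"

single_digits = re.findall(r"\d", string)     # one digit at a time
two_digits = re.findall(r"\d\d", string)      # exactly two digits together
any_length = re.findall(r"\d+", string)       # one or more digits together

print(single_digits)  # ['9', '2', '3']
print(two_digits)     # ['23']
print(any_length)     # ['9', '23']
```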
Modifier               Description
. (Dot)                Matches any character except a newline.
^ (Caret)              Matches the pattern only at the start of the string.
$ (Dollar)             Matches the pattern at the end of the string.
* (Asterisk)           Matches 0 or more repetitions of the regex.
+ (Plus)               Matches 1 or more repetitions of the regex.
? (Question mark)      Matches 0 or 1 repetition of the regex.
[] (Square brackets)   Indicates a set of characters. Matches any single character in the brackets. For example, [abc] matches a, b or c.
| (Pipe)               Specifies alternative patterns. For example, P1|P2, where P1 and P2 are two different regexes.
\ (Backslash)          Escapes special characters or signals a special sequence. For example, if you are searching for one of the special characters, you can use a \ to escape it.
[^...]                 Matches any single character not in the brackets.
(...)                  Matches whatever regular expression is inside the parentheses. For example, (abc) matches the substring 'abc'.
Example:
import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by two (any) characters, and an "o":
x = re.findall("he..o", txt)
print(x)
Output:
['hello']
Example:
import re
txt = "The rain in Spain"
#Find all lower case characters alphabetically between "a" and "m":
x = re.findall("[a-m]", txt)
print(x)
Output:
['h', 'e', 'a', 'i', 'i', 'a', 'i']
Example:
import re
txt = "That will be 59 dollars"
#Find all digit characters:
x = re.findall(r"\d", txt)
print(x)
Output:
['5', '9']
Example:
import re
txt = "hello planet"
#Check if the string starts with 'hello':
x = re.findall("^hello", txt)
if x:
    print("Yes, the string starts with 'hello'")
else:
    print("No match")
Output:
Yes, the string starts with 'hello'
Example:
import re
txt = "hello planet"
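#Check if the string ends with 'planet':
x = re.findall("planet$", txt)
if x:
    print("Yes, the string ends with 'planet'")
else:
    print("No match")
Output:
Yes, the string ends with 'planet'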
Example:
import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by 0 or more (any) characters, and an "o":
x = re.findall("he.*o", txt)
print(x)
Output:
['hello']
Example
import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by 1 or more (any) characters, and an "o":
x = re.findall("he.+o", txt)
print(x)
Output:
['hello']
Example
import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by 0 or 1 (any) character, and an "o":
x = re.findall("he.?o", txt)
print(x)
#This time we got no match, because there were not zero, not one, but two characters between "he" and the "o"
Output:
[]
Example
import re
txt = "The rain in Spain falls mainly in the plain!"
#Check if the string contains either "falls" or "stays":
x = re.findall("falls|stays", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
Output:
['falls']
Yes, there is at least one match!
The word "quantifier" originates from Latin: quantus means how much or how often.
A greedy match means that the regex engine (the component that tries to find your pattern in the string)
matches as many characters as possible.
For example, the regex 'a+' will match as many 'a's as possible in the string 'aaaa'. Although the
substrings 'a', 'aa' and 'aaa' all match the regex 'a+', that is not enough for the regex engine: it is
always hungry and tries to match even more.
>>> import re
>>> re.findall('a?', 'aaaa')
['a', 'a', 'a', 'a', '']
>>> re.findall('a*', 'aaaa')
['aaaa', '']
>>> re.findall('a+', 'aaaa')
['aaaa']
>>> re.findall('a{3}', 'aaaa')
['aaa']
>>> re.findall('a{1,2}', 'aaaa')
['aa', 'aa']
In all cases, a shorter match would also be valid, but because the regex engine is greedy by default, it
keeps going and returns the longest match.
A non-greedy match, by contrast, means that the engine matches as few characters as possible. For
example, the regex 'a+?' will match as few 'a's as possible in the string 'aaaa'. Thus, it matches the
first character 'a' and is done with it. Then it moves on to the second character (which is also a match),
and so on.
The non-greedy quantifiers give you the shortest possible match from a given position in the string.
You can make the default quantifiers ?, *, + and {m,n} non-greedy by appending a question mark '?'
to them: ??, *?, +? and {m,n}?. They "consume" or match as few characters as possible so that the
regex pattern is still satisfied.
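The contrast between greedy and non-greedy quantifiers can be sketched side by side. The tag example with the string "<b>bold</b>" is an added illustration, not taken from the handout.

```python
import re

# Greedy versus non-greedy on the handout's string 'aaaa'
greedy_plus = re.findall("a+", "aaaa")        # takes all four characters at once
lazy_plus = re.findall("a+?", "aaaa")         # takes one character per match
greedy_range = re.findall("a{2,3}", "aaaa")   # prefers three characters
lazy_range = re.findall("a{2,3}?", "aaaa")    # stops at two characters

# Illustrative example: matching tags in a small piece of markup
html = "<b>bold</b>"
greedy_tag = re.findall("<.+>", html)         # runs to the last '>'
lazy_tag = re.findall("<.+?>", html)          # stops at the first '>'

print(greedy_plus, lazy_plus)     # ['aaaa'] ['a', 'a', 'a', 'a']
print(greedy_range, lazy_range)   # ['aaa'] ['aa', 'aa']
print(greedy_tag, lazy_tag)       # ['<b>bold</b>'] ['<b>', '</b>']
```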
Backreferences
You can match a previously captured group later within the same regex using a special metacharacter
sequence called a backreference.
They are regular expression commands which refer to a previous part of the matched regular
expression.
Backreferences in a pattern allow you to specify that the contents of an earlier capturing group must also
be found at the current location in the string.
For example, \1 will succeed if the exact contents of group 1 can be found at the current position, and
fails otherwise.
Example 1:
In the example below we want to find all the duplicated words in the given text.
import re
txt = """
hello hello
how are you
bye bye
"""
p = re.compile("(\\w+) \\1")
print(p.findall(txt))
Output:
['hello', 'bye']
Note: Since Python's string literals also use a backslash followed by digits to include arbitrary
characters in a string, the backslash in a backreference needs to be escaped so that the regex engine
receives it in the proper form. We can also use raw strings to avoid the escaping.
import re
txt = """
hello hello
how are you
bye bye
"""
p = re.compile(r"(\w+) \1")
print(p.findall(txt))
Output
['hello', 'bye']
Example 2
Consider a scenario where we want to find all dates with the format dd/mm/yyyy and change them to
yyyy-mm-dd format.
import re
txt = """
today is 23/02/2019.
yesterday was 22/02/2019.
Output
today is 2019-02-23.
yesterday was 2019-02-22.
tomorrow is 2019-02-24.
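The listing for this example is incomplete in the handout. One way to obtain the output shown is sketched below, using three capturing groups and backreferences in the replacement string. The pattern, the replacement and the third input line ("tomorrow is 24/02/2019.") are assumptions inferred from the output.

```python
import re

txt = """
today is 23/02/2019.
yesterday was 22/02/2019.
tomorrow is 24/02/2019.
"""

# Group 1 = day, group 2 = month, group 3 = year
pattern = re.compile(r"(\d{2})/(\d{2})/(\d{4})")

# In the replacement, \3, \2 and \1 refer back to the captured groups
converted = pattern.sub(r"\3-\2-\1", txt)
print(converted)
```

Both the pattern and the replacement are written as raw strings so that the backslashes reach the regex engine unchanged.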
Note: Any time you use a regex in Python with a numbered backreference, it's a good idea to specify it
as a raw string. Otherwise, the interpreter may confuse the backreference with an octal value.
Example:
import re
print(re.search("([a-z])#\1", "d#d"))
Output
None
The regex ([a-z])#\1 matches a lowercase letter, followed by '#', followed by the same lowercase letter.
The string in this case is 'd#d', which should match. But the match fails because Python interprets \1
in an ordinary string literal as the character whose octal value is one, so the regex engine never sees
a backreference. Escaping the backslash fixes this:
import re
print(re.search("([a-z])#\\1", "d#d"))
Output:
<re.Match object; span=(0, 3), match='d#d'>
You'll also achieve the correct match if you specify the regex as a raw string:
import re
print(re.search(r'([a-z])#\1', 'd#d'))
Output
<re.Match object; span=(0, 3), match='d#d'>