Unit 2
1. NumPy
Data science libraries including SciPy, Matplotlib, Pandas, Scikit-Learn and Statsmodels are
built on top of NumPy.
2. Pandas
Developed by Wes McKinney, Pandas is used for data manipulation and analysis. It provides
fast, flexible and expressive data structures that help developers work with labelled and
relational data, along with features such as handling of missing data, fancy indexing and
data alignment. It is based on two main data structures: Series and DataFrame.
3. Seaborn
Seaborn is Python’s most commonly used library for statistical data visualisation, used for
heatmaps and visualisations that summarise data and depict distributions. It is based on
Matplotlib and can be used on both data frames and arrays.
Seaborn also covers basic plots such as bar graphs and line charts.
4. Plotly
Plotly is a collaborative, web-based analytics and graphing platform. It is one of the most
powerful libraries for ML, data science and AI-related operations. Plotly is publication-ready
and immersive and is used for data visualisation.
Plotly can easily import data to chart, allowing developers to make slide decks and
dashboards with ease. It is used for the development of tools like Dash and Chart Studio.
5. Matplotlib
Developed by John Hunter, Matplotlib is one of the most common libraries in the Python
community. It is used for creating static, animated and interactive data visualisations.
Matplotlib offers extensive customisation and a wide range of charts, from histograms to
scatter plots, and lets developers customise and configure each plot. The open-source
library offers an object-oriented API for integrating plots into applications.
6. SciPy
SciPy or Scientific Python is used for complex mathematics, science and engineering
problems. It is built on the NumPy extension and allows developers to manipulate and
visualise data.
SciPy provides user-friendly and efficient numerical routines for linear algebra, statistics,
integration and optimisation. Its applications include multidimensional image processing,
solving Fourier transforms and differential equations.
References:
Igual, L.; Seguí, S., “Introduction to Data Science: A Python Approach to Concepts,
Techniques and Applications”, Springer, ISBN: 978-3-319-50016-4.
David Taieb, “Data Analysis with Python: A Modern Approach”, Packt Publishing,
ISBN: 9781789950069.
Y. Daniel Liang, “Introduction to Programming Using Python”, Pearson, 2012.
Wes McKinney, “Python for Data Analysis: Data Wrangling with Pandas, NumPy, and
IPython”, O’Reilly, 2nd Edition, 2018.
Jake VanderPlas, “Python Data Science Handbook: Essential Tools for Working with
Data”, O’Reilly, 2017.
PyCharm
Kite
Spyder
IDLE
Sublime Text 3
Visual Studio Code
Atom
Python code editors are designed to help developers write and debug programs easily. Using
these Python IDEs (Integrated Development Environments), you can manage a large codebase
and achieve quick deployment.
Developers can use these editors to create desktop or web applications. The Python IDEs can
also be used by DevOps engineers for continuous integration.
The following is a list of popular Python code editors with their main features. The list
contains both open-source (free) and premium tools.
1) PyCharm
PyCharm is a cross-platform IDE used for Python programming. It is one of the best Python
IDEs and can be used on Windows, macOS, and Linux. It provides an API that developers can
use to write their own Python plugins and extend the basic functionality.
Price: Free/Paid
Features:
It is an intelligent Python code editor that also supports CoffeeScript, JavaScript, CSS, and
TypeScript.
Provides smart search to jump to any file, symbol, or class.
Smart code navigation.
This Python editor offers quick and safe refactoring of code.
It allows you to access PostgreSQL, Oracle, MySQL, SQL Server, and many other
databases from the IDE.
2) Kite
Kite is an IDE for Python that automatically completes multi-line code. This editor supports
more than 16 languages. It helps you to code faster with no hassle.
Price: Free
Features:
3) Spyder
Features:
It is one of the best Python IDEs for Windows and allows you to run Python code by
cell, line, or file.
Plot a histogram or time series, and make changes to a dataframe or NumPy array.
It offers automatic code completion and horizontal/vertical splitting.
Find and eliminate bottlenecks
An interactive way to trace each step of Python code execution.
4) IDLE
IDLE (Integrated Development and Learning Environment) is the default editor that comes
with Python. It is one of the best Python IDEs for helping beginners learn Python
easily. The IDLE package is optional on many Linux distributions. The tool can be used
on Windows, macOS, and Unix.
Price: Free
Features:
5) Sublime Text 3
Sublime Text 3 is a code editor which supports many languages including Python. It is one of
the best Python editors and has basic built-in support for Python. Sublime Text 3 can be
customised to create a full-fledged Python programming environment. The
editor supports OS X, Windows, and Linux operating systems.
Features:
6) Visual Studio Code
Price: Free
Features:
The editor provides smart code completion based on function definition, imported
modules, as well as variable types.
You can work with Git as well as other SCM providers.
Enables you to debug code from the editor.
Provides extensions to add new languages, debuggers, themes to gain the advantage
of additional services.
7) Atom
Atom is a useful code editor preferred by programmers for its simple interface
compared to other editors. Atom users can submit packages and themes for the software.
Price: Free
Features:
8) Jupyter
Jupyter is a tool for people who have just started with data science. It is an easy-to-use,
interactive data science IDE that works across many programming languages and serves not
just as an editor but also as an educational or presentation tool.
Price: Free
Features:
It is one of the best Python IDEs and supports numerical simulation, data cleaning,
machine learning, data visualization, and statistical modeling.
Combine code, text, and images.
Support for many programming languages.
Integrated data science libraries (matplotlib, NumPy, Pandas).
9) PyDev
PyDev is a third-party Python editor for Eclipse. It is one of the best IDEs for Python and
can be used not only for Python but also for IronPython and Jython development.
Price: Free
Features:
10) Thonny
Thonny is an IDE for learning and teaching programming, designed specifically for
beginner Python programmers. It is developed at the University of Tartu, and
you can download it for free from its Bitbucket repository for Windows, Linux, and Mac.
Price: Free
Features:
Allows developers to view how their code and shell commands affect Python
variables.
It has a simple debugger.
It is one of the best IDEs for Python and provides support for evaluating an expression.
A Python function call opens a new window with a separate local variables table as well
as a code pointer.
Automatically spots syntax errors.
11) Wing
Price: Wing Pro is a paid version with a free trial. Wing Personal and Wing 101 are free.
Features:
12) ActivePython
ActivePython is a secure and supported Python distribution aimed at software development
and data science. It consists of the Python implementation CPython and a set of
various extensions to facilitate installation.
Price: Free for the Community edition; the Coder, Team, Business, and Enterprise editions are paid.
Features:
It allows you to connect to your big data and databases, including Redis, MySQL,
Hadoop, and MongoDB.
Helps you to manage your data using SciPy, Pandas, NumPy, and Matplotlib.
Supports machine learning libraries like TensorFlow, Keras, and Theano.
Compatible with open-source Python so that you can avoid vendor lock-in.
Uses OpenSSL patch for security.
A file containing Python code, for example: example.py , is called a module, and its module
name would be example .
We use modules to break down large programs into small manageable and organized files.
Furthermore, modules provide reusability of code.
We can define our most used functions in a module and import it, instead of copying their
definitions into different programs.
def add(a, b):
    result = a + b
    return result
Here, we have defined a function add() inside a module named example . The function takes
in two numbers and returns their sum.
We can import the definitions inside a module to another module or the interactive interpreter
in Python.
We use the import keyword to do this. To import our previously defined module example ,
we type the following in the Python prompt.
>>> import example
>>> example.add(4, 5.5)
9.5
Python has a large collection of standard modules; the full list, with use cases, is in the
Python standard library documentation. Their source files are in the Lib directory inside
the location where you installed Python.
Standard modules can be imported the same way as we import our user-defined modules.
There are various ways to import modules. They are listed below.
We can import a module using the import statement and access the definitions inside it using
the dot operator as described above. Here is an example.
import math
print("The value of pi is", math.pi)
Modules provide us with a way to share reusable functions. A module is simply a “Python
file” which contains code we can reuse in multiple Python programs. A module may contain
functions, classes, lists, etc.
One of the many superpowers of Python is that it comes with a “rich standard library”. This
rich standard library contains lots of built-in modules. Hence, it provides a lot of reusable
code.
To name a few, Python contains modules like “os”, “sys”, “datetime”, “random”.
You can import and use any of the built-in modules whenever you like in your program.
Another superpower of Python is that it lets you take things in your own hands. You can
create your own functions and classes, put them inside modules and voila! You can now
include hundreds of lines of code into any program just by writing a simple import statement.
To create a module, just put the code inside a .py file. Let’s create one.
# my Python module
def greeting(x):
    print("Hello,", x)
Write this code in a file and save the file with the name mypymodule.py. Now we have
created our own module.
Half of our job is over, now let’s learn how to import these modules.
We use the import keyword to import both built-in and user-defined modules in Python.
Let’s import our user-defined module from the previous section into our Python shell:
>>> import mypymodule
To call the greeting function of mypymodule, we simply need to use the dot notation:
>>> mypymodule.greeting("Techvidvan")
Output
Hello, Techvidvan
Similarly, we can import mypymodule into any Python file and call the greeting function as
we did above.
To call the randint function of the built-in random module, we first import the module and
then use the dot notation:
>>> import random
>>> random.randint(20, 100)
Output (varies from run to run)
63
The randint function of the random module returns a random integer within a given range,
here 20 to 100, with both ends included.
We can import modules in several ways to make our code more Pythonic.
The import ... as form lets you give a shorter name to a module while using it in your program.
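Giving a module a shorter name can be sketched as follows. The alias m is an arbitrary choice for this illustration; any valid identifier would work.

```python
import math as m  # "m" is an arbitrary alias chosen for this sketch

# The module is now reached through the alias instead of its full name.
area = m.pi * 2 ** 2   # area of a circle with radius 2
root = m.sqrt(16)

print(area)
print(root)
```

After this import only the name m is bound; the name math itself is not available unless it is imported separately.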
You can import a specific function, class, or attribute from a module rather than importing
the entire module. Follow the syntax below:
>>> from random import randint
>>> randint(20, 100)
69
You can also import multiple attributes and functions from a module:
>>> from math import pi, sqrt
>>> print(3 * pi)
9.42477796076938
>>> print(sqrt(100))
10.0
Note that while importing from a module in this way, we don’t need to use the dot operator
while calling the function or using the attribute.
Importing everything from Python module
If we need to import everything from a module and we don’t want to use the dot operator, do
this:
>>> from math import *
>>> print(3 * pi)
9.42477796076938
>>> print(sqrt(100))
10.0
The dir() function returns the names of all the attributes and functions defined in a
module.
>>> import random
>>> dir(random)
If you have already imported a module but need to reload it, use the reload() function. This is
intended for cases where you edit a module's source file and need to test it
without leaving Python.
In Python 3, reload() lives in the importlib standard library module (the older imp module
is deprecated and was removed in Python 3.12).
>>> import importlib
>>> importlib.reload(random)
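The edit-and-reload cycle can be sketched with importlib.reload from the standard library. The module name demo_reload_mod and its VALUE attribute are made up for this illustration, and the module is written to a temporary directory so that no real file is touched.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caching so the short demo always re-reads the source file.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "demo_reload_mod.py"   # hypothetical module
    module_path.write_text("VALUE = 1\n")
    sys.path.insert(0, tmp)
    try:
        import demo_reload_mod
        first = demo_reload_mod.VALUE

        # Simulate editing the source file, then reload without restarting.
        module_path.write_text("VALUE = 2  # edited\n")
        importlib.invalidate_caches()
        importlib.reload(demo_reload_mod)
        second = demo_reload_mod.VALUE
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_reload_mod", None)

print(first, second)
```

Names bound earlier with from module import name keep pointing at the old objects after a reload, so reloading is mostly useful together with the plain import form.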
Python packages
A Python package is a collection of modules. Modules that are related to each other are
usually put in the same package. When a module from an external package is required in a
program, that package can be imported and its modules can be put to use.
Any Python file is a module; the module's name is the file name without the .py extension.
When you import a module or a package, the object created by Python is always of type
module.
When you import a package, only the names defined in the __init__.py file of that
package are directly visible.
For example, take the datetime module, which defines a class called date. After datetime
is imported, only the name datetime is bound, so using date directly results in an error;
the class has to be reached as datetime.date.
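A minimal sketch of this behaviour, using only the standard library: after a plain import datetime, the bare name date is unknown, while datetime.date works.

```python
import datetime

try:
    date.today()          # "date" was never bound in this namespace
except NameError as exc:
    message = str(exc)
    print("NameError:", message)

# Reaching the class through the imported module works as expected.
today = datetime.date.today()
print(isinstance(today, datetime.date))
```

The same rule applies to packages: a name becomes usable only once it has been imported or is reachable through something that has been imported.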
Python also supports file handling and allows users to read and write files, along with
many other file handling operations. File handling exists in many other languages, where
the implementation is often complicated or lengthy, but as with other Python concepts, it
is easy and short here. Python treats files differently depending on whether they are text
or binary, and this is important. A text file is a sequence of lines, each made up of a
sequence of characters. Each line is terminated with a special character called the EOL
(End of Line) character, usually the newline character. It ends the current line and
signals that a new one has begun. Let’s start with reading and writing files.
Working of open() function
We use the open() function in Python to open a file in read or write mode; open() returns
a file object. It is typically called with two arguments, the file name and the mode, which
says whether to read or write. So the syntax is open(filename, mode). Python provides the
following basic modes in which files can be opened:
“r”, for reading.
“w”, for writing.
“a”, for appending.
“r+”, for both reading and writing.
Keep in mind that the mode argument is not mandatory. If it is not passed, Python
assumes “r” by default. Let’s look at this program and analyze how the read mode works:
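file = open('geek.txt', 'r')
for each in file:
    print(each)
The open command opens the file in read mode and the for loop prints each line
present in the file.
Another way to read a file is to call read(), which returns the whole content as a string:
file = open('geek.txt', 'r')
print(file.read())
Passing a number to read() limits how many characters are read. In the following code the
interpreter reads the first five characters of the stored data and returns them as a string:
file = open('geek.txt', 'r')
print(file.read(5))
In write mode, the file is created if it does not exist and its previous content is discarded:
file = open('geek.txt', 'w')
file.write("This is the write command")
file.close()
The close() method releases the resources held by the file object once the program is done
with the file.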
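The four modes can be tried side by side in a small sketch. It works in a temporary directory so that no existing geek.txt is overwritten; the file content is made up for the illustration.

```python
import os
import tempfile

# A throwaway directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "geek.txt")

    f = open(path, "w")       # "w" creates the file (or empties an existing one)
    f.write("first line\n")
    f.close()

    f = open(path, "a")       # "a" adds to the end of the file
    f.write("second line\n")
    f.close()

    f = open(path)            # mode omitted, so "r" is assumed
    lines = f.readlines()
    f.close()

    f = open(path, "r+")      # "r+" allows reading and writing on one handle
    head = f.read(5)
    f.close()

print(lines)
print(head)
```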
Working of append mode
Let’s see how the append mode works:
# Python code to illustrate append mode
file = open('geek.txt', 'a')
file.write("This will add this line")
file.close()
There are also various other methods that are useful in file handling, such as:
rstrip(): This string method strips whitespace from the right-hand side of a line.
lstrip(): This string method strips whitespace from the left-hand side of a line.
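A short sketch of both methods applied to lines as they would come out of a file. The sample lines are made up; note that rstrip() also removes the trailing newline, because a newline counts as whitespace.

```python
# Made-up sample lines, shaped like what iterating over a text file yields.
raw_lines = ["  alpha  \n", "\tbeta\n"]

right = [line.rstrip() for line in raw_lines]   # trailing spaces and newline removed
left = [line.lstrip() for line in raw_lines]    # leading spaces and tabs removed

print(right)
print(left)
```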
The with statement is designed to provide much cleaner syntax and exception handling when
you are working with files. That explains why it is good practice to use the with statement
where applicable. It is helpful because any file opened this way is closed
automatically once the block is done, so cleanup is automatic.
Example:
with open('geek.txt') as file:
    data = file.read()
with open('geek.txt', 'w') as f:
    f.write("Hello World!!!")
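The automatic cleanup can be checked through the closed attribute of the file object. This sketch uses a temporary directory and a simulated failure (both assumptions made for the illustration) to show that the file is closed even when the block exits through an exception.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "geek.txt")

    with open(path, "w") as f:
        f.write("Hello World!!!")
    closed_after_write = f.closed        # the file is closed once the block ends

    try:
        with open(path) as f:
            data = f.read()
            raise RuntimeError("simulated failure")   # leave the block abnormally
    except RuntimeError:
        closed_after_error = f.closed    # still closed, no explicit close() needed

print(data, closed_after_write, closed_after_error)
```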
split() using file handling
We can also split lines while reading a file in Python. By default, split() breaks the line
wherever whitespace is encountered. You can also split on any other character. Here is the code:
with open('geek.txt', 'r') as file:
    data = file.readlines()
    for line in data:
        word = line.split()
        print(word)
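The difference between the default whitespace split and a split on a chosen character can be sketched on plain strings. The sample text is made up for the illustration.

```python
spaced = "a b   c"            # made-up line with irregular spacing
record = "name,age,city"      # made-up comma-separated line

words = spaced.split()        # no argument: split on runs of whitespace
fields = record.split(",")    # explicit separator: split on every comma

print(words)
print(fields)
```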
There are also various other functions that help to manipulate files and their contents. One
can explore them in the Python documentation.
Errors in Python can be of two types: syntax errors and exceptions. Errors are
problems in a program that cause it to stop execution. Exceptions, on the other
hand, are raised when some internal event occurs that changes the normal
flow of the program.
Difference between Syntax Error and Exceptions
Syntax Error: As the name suggests, this error is caused by wrong syntax in the code. It
leads to the termination of the program.
Example:
amount = 10000
if amount > 2999
    print("Eligible")
Output:
SyntaxError (the if statement is missing its colon)
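A syntax error is detected before any line of the program runs. One way to observe this from inside a running script is the built-in compile() function, as in this sketch; the source string is a made-up snippet with a missing colon.

```python
# Made-up snippet: the "if" line is missing its colon.
source = "amount = 10000\nif amount > 2999\n    print('eligible')\n"

try:
    compile(source, "<example>", "exec")   # parse only, nothing is executed
    caught = False
except SyntaxError as exc:
    caught = True
    line_no = exc.lineno                   # line on which the parser gave up
    print("SyntaxError on line", line_no)
```

The exact wording of the error message differs between Python versions, so the sketch only inspects the line number.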
Exceptions: Exceptions are raised when the program is syntactically correct but the code
results in an error at run time. An exception changes the normal flow of the program; if it
is not handled, the program terminates, but unlike a syntax error it can be caught and handled.
Example:
marks = 10000
a = marks / 0
print(a)
Output:
ZeroDivisionError: division by zero
The above example raises ZeroDivisionError because we are trying to divide a number by
0.
Note: Exception is the base class for almost all built-in exceptions in Python; it derives
in turn from BaseException.
Try and Except Statement – Catching Exceptions
Try and except statements are used to catch and handle exceptions in Python. Statements
that can raise exceptions are kept inside the try clause and the statements that handle the
exception are written inside except clause.
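Example: Let us try to access a list element whose index is out of bounds and handle the
corresponding exception.
a = [1, 2, 3]
try:
    print("Second element = %d" % (a[1]))
    # Raises an error since the list has only 3 elements
    print("Fourth element = %d" % (a[3]))
except:
    print("An error occurred")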
Output
Second element = 2
An error occurred
In the above example, the statements that can cause the error are placed inside the try
statement (second print statement in our case). The second print statement tries to access
the fourth element of the list which is not there and this throws an exception. This
exception is then caught by the except statement.
Catching Specific Exception
A try statement can have more than one except clause, to specify handlers for different
exceptions. Please note that at most one handler will be executed. For example, we can add
IndexError in the above code. The general syntax for adding specific exceptions is:
try:
    # statement(s)
except IndexError:
    # statement(s)
except ValueError:
    # statement(s)
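The syntax above can be filled in with a small sketch. The function name to_item and its messages are made up for the illustration; the point is that only the handler matching the raised exception runs.

```python
def to_item(items, index_text):
    """Return items[int(index_text)], or a message describing what went wrong."""
    try:
        index = int(index_text)     # may raise ValueError
        return items[index]         # may raise IndexError
    except IndexError:
        return "index out of range"
    except ValueError:
        return "not a number"


print(to_item([1, 2, 3], "1"))
print(to_item([1, 2, 3], "7"))
print(to_item([1, 2, 3], "x"))
```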
Example: Catching specific exception in Python
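# Program to handle multiple errors with one try statement
def fun(a):
    if a < 4:
        # raises ZeroDivisionError for a = 3
        b = a/(a-3)
    # raises NameError if a >= 4, since b was never assigned
    print("Value of b = ", b)

try:
    fun(3)
    fun(5)
except ZeroDivisionError:
    print("ZeroDivisionError occurred and handled")
except NameError:
    print("NameError occurred and handled")
Output
ZeroDivisionError occurred and handled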
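Try with Else Clause
The else clause after the except clauses runs only when the try block does not raise an
exception.
def AbyB(a, b):
    try:
        c = ((a+b) / (a-b))
    except ZeroDivisionError:
        print("a/b result in 0")
    else:
        print(c)

AbyB(2.0, 3.0)
AbyB(3.0, 3.0)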
Output:
-5.0
a/b result in 0
Finally Keyword in Python
Python provides a keyword finally, which is always executed after the try and except
blocks. The finally block always executes after normal termination of try block or after try
block terminates due to some exception.
Syntax:
try:
    # Some Code....
except:
    # optional block
    # Handling of exception (if required)
else:
    # execute if no exception
finally:
    # Some code .....(always executed)
Example:
try:
    k = 5//0  # raises ZeroDivisionError
    print(k)
except ZeroDivisionError:
    print("Can't divide by zero")
finally:
    print('This is always executed')
Output:
Can't divide by zero
This is always executed
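The order in which the clauses run can be traced with a small sketch. The function name divide and the steps list are made up for the illustration; the list records which clauses were executed.

```python
def divide(a, b):
    """Divide a by b and record which clauses of the try statement ran."""
    steps = []
    try:
        result = a / b
    except ZeroDivisionError:
        steps.append("except")      # runs only when the division fails
        result = None
    else:
        steps.append("else")        # runs only when no exception was raised
    finally:
        steps.append("finally")     # runs in both cases
    return result, steps


print(divide(6, 3))
print(divide(1, 0))
```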
Raising Exception
The raise statement allows the programmer to force a specific exception to occur. The sole
argument in raise indicates the exception to be raised. This must be either an exception
instance or an exception class (a class that derives from Exception).
try:
    raise NameError("Hi there")  # Raise Error
except NameError:
    print("An exception")
    raise  # re-raise the caught exception
The above code prints “An exception”, but a runtime error also occurs at the end because
the bare raise statement in the last line re-raises the exception. So, the output on your
command line will look like
Traceback (most recent call last):
File "/home/d6ec14ca595b97bff8d8034bbf212a9f.py", line 5, in <module>
raise NameError("Hi there") # Raise Error
NameError: Hi there
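A common use of raise is to reject invalid input. In this sketch the function name set_age and its message are made up for the illustration; the caller catches the exception instead of letting it end the program.

```python
def set_age(age):
    """Return age unchanged, or raise ValueError for a negative value."""
    if age < 0:
        # Force an exception, passing an instance that carries a message.
        raise ValueError("age must be non-negative")
    return age


try:
    set_age(-1)
except ValueError as exc:
    message = str(exc)
    print("Rejected:", message)
```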
interrelated variables and functions. These variables are often referred to as properties of the
object and functions are referred to as the behavior of the objects. These objects provide a
For example, a car can be an object. If we consider the car as an object then its properties
would be – its color, its model, its price, its brand, etc. And its behavior/function would be
The object-oriented paradigm designs a program using classes and objects. An object is
related to real-world entities such as a book, house, or pencil. The OOP concept focuses on
writing reusable code. It is a widespread technique for solving problems by creating
objects.
o Class
o Object
o Method
o Inheritance
o Polymorphism
o Data Abstraction
o Encapsulation
Class
The class can be defined as a collection of objects. It is a logical entity that has some specific
attributes and methods. For example, an employee class should contain attributes and
methods such as an email id, name, age, and salary.
Syntax
class ClassName:
    <statement-1>
    .
    .
    <statement-N>
Object
The object is an entity that has state and behavior. It may be any real-world object like the
mouse, keyboard, chair, table, pen, etc.
Everything in Python is an object, and almost everything has attributes and methods. All
functions have a built-in attribute __doc__, which returns the docstring defined in the
function source code.
When we define a class, no memory is allocated for its data until an object is created from
it. Consider the following example.
Example:
class car:
    def __init__(self, modelname, year):
        self.modelname = modelname
        self.year = year

    def display(self):
        print(self.modelname, self.year)

c1 = car("Toyota", 2016)
c1.display()
Output:
Toyota 2016
In the above example, we have created the class named car, and it has two attributes,
modelname and year. We have created a c1 object to access the class attributes. The c1 object
will allocate memory for these values.
Method
The method is a function that is associated with an object. In Python, a method is not unique
to class instances. Any object type can have methods.
Inheritance
Inheritance is the most important aspect of object-oriented programming, which simulates the
real-world concept of inheritance. It specifies that the child object acquires all the properties
and behaviors of the parent object.
By using inheritance, we can create a class which uses all the properties and behavior of
another class. The new class is known as a derived class or child class, and the one whose
properties are acquired is known as a base class or parent class.
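A minimal sketch of a base class and a derived class. The class names Vehicle and Car and their methods are made up for the illustration.

```python
class Vehicle:                       # base (parent) class
    def __init__(self, brand):
        self.brand = brand

    def describe(self):
        return f"{self.brand} vehicle"


class Car(Vehicle):                  # derived (child) class
    def honk(self):                  # behaviour added by the child only
        return "Beep"


c = Car("Toyota")
print(c.describe())   # inherited from Vehicle
print(c.honk())       # defined in Car
```

Car defines neither __init__ nor describe(), yet both are available on its objects because they are acquired from Vehicle.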
Polymorphism
Polymorphism contains two words "poly" and "morphs". Poly means many, and morph
means shape. By polymorphism, we understand that one task can be performed in different
ways. For example - you have a class animal, and all animals speak. But they speak
differently. Here, the "speak" behavior is polymorphic in a sense and depends on the animal.
So, the abstract "animal" concept does not actually "speak", but specific animals (like dogs
and cats) have a concrete implementation of the action "speak".
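The animal example can be sketched as follows. The class names and the sounds are illustrative; the abstract Animal deliberately has no concrete speak().

```python
class Animal:
    def speak(self):
        # The abstract concept has no sound of its own.
        raise NotImplementedError("each animal defines its own speak()")


class Dog(Animal):
    def speak(self):
        return "Woof"


class Cat(Animal):
    def speak(self):
        return "Meow"


# The same call behaves differently depending on the object it is made on.
sounds = [animal.speak() for animal in (Dog(), Cat())]
print(sounds)
```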
Encapsulation
Data Abstraction
Data abstraction and encapsulation both are often used as synonyms. Both are nearly
synonyms because data abstraction is achieved through encapsulation.
Abstraction is used to hide internal details and show only functionalities. Abstracting
something means to give names to things so that the name captures the core of what a
function or a whole program does.
2. Object-oriented programming: It makes development and maintenance easier.
   Procedural programming: It is not easy to maintain the code when the project becomes
   lengthy.
3. Object-oriented programming: It simulates real-world entities, so real-world problems
   can be easily solved through OOP.
   Procedural programming: It doesn't simulate the real world. It works on step-by-step
   instructions divided into small parts called functions.
4. Object-oriented programming: It provides data hiding, so it is more secure than
   procedural languages. You cannot access private data from anywhere.
   Procedural programming: It doesn't provide any proper way for data binding, so it is
   less secure.
Inheritance
Inheritance models what is called an "is a" relationship. This means that when you have
a Derived class that inherits from a Base class, you create a relationship where Derived is
a specialized version of Base.
Inheritance is represented using the Unified Modeling Language or UML in the following
way:
Classes are represented as boxes with the class name on top. The inheritance relationship is
represented by an arrow from the derived class pointing to the base class. The
word extends is usually added to the arrow.
Classes that inherit from another are called derived classes, subclasses, or subtypes.
Classes from which other classes are derived are called base classes or super classes.
A derived class is said to derive, inherit, or extend a base class.
Let’s say you have a base class Animal and you derive from it to create a Horse class. The
inheritance relationship states that a Horse is an Animal. This means that Horse inherits
the interface and implementation of Animal, and Horse objects can be used to
replace Animal objects in the application.
This is known as the Liskov substitution principle. The principle states that “in a computer
program, if S is a subtype of T, then objects of type T may be replaced with objects of
type S without altering any of the desired properties of the program”.
Everything in Python is an object. Modules are objects, class definitions and functions are
objects, and of course, objects created from classes are objects too.
Inheritance is a required feature of every object oriented programming language. This means
that Python supports inheritance, and as you’ll see later, it’s one of the few languages that
supports multiple inheritance.
>>> class MyClass:
...     pass
...
>>> c = MyClass()
>>> dir(c)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', '__weakref__']
dir() returns a list of all the members in the specified object. You have not declared any
members in MyClass, so where is the list coming from? You can find out using the
interactive interpreter:
>>> o = object()
>>> dir(o)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__']
As you can see, the two lists are nearly identical. There are some additional members
in MyClass like __dict__ and __weakref__, but every single member of the object class is
also present in MyClass.
This is because every class you create in Python implicitly derives from object. You could
be more explicit and write class MyClass(object):, but it’s redundant and unnecessary.
Note: In Python 2, you have to explicitly derive from object for reasons beyond the scope of
this article, but you can read about it in the New-style and classic classes section of the
Python 2 documentation.
Exceptions Are an Exception
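Every class that you create in Python will implicitly derive from object. The exception to
this rule is classes used to indicate errors by raising an exception.
You can see the problem using the Python interactive interpreter:
>>> # MyError is an illustrative class name
>>> class MyError:
...     pass
...
>>> raise MyError()
Traceback (most recent call last):
  ...
TypeError: exceptions must derive from BaseException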
BaseException is a base class provided for all error types. To create a new error type, you
must derive your class from BaseException or one of its derived classes. The convention in
Python is to derive your custom error types from Exception, which in turn derives
from BaseException.
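>>> # MyError is an illustrative class name
>>> class MyError(Exception):
...     pass
...
>>> raise MyError()
Traceback (most recent call last):
  ...
__main__.MyError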
In this section, you’ll start modeling an HR system. The example will demonstrate the use of
inheritance and how derived classes can provide a concrete implementation of the base class
interface.
The HR system needs to process payroll for the company’s employees, but there are different
types of employees depending on how their payroll is calculated.
# In hr.py
class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            print('')
The PayrollSystem implements a .calculate_payroll() method that takes a collection of
employees and prints their id, name, and check amount using
the .calculate_payroll() method exposed on each employee object.
Now, you implement a base class Employee that handles the common interface for every
employee type:
# In hr.py
class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name
Employee is the base class for all employee types. It is constructed with an id and a name.
What you are saying is that every Employee must have an id assigned as well as a name.
The HR system requires that every Employee processed must provide
a .calculate_payroll() interface that returns the weekly salary for the employee. The
implementation of that interface differs depending on the type of Employee.
For example, administrative workers have a fixed salary, so every week they get paid the
same amount:
# In hr.py
class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary
You create a derived class SalaryEmployee that inherits from Employee. The class is initialized
with the id and name required by the base class, and you use super() to initialize the
members of the base class.
The class provides the required .calculate_payroll() method used by the HR system. The
implementation just returns the amount stored in weekly_salary.
The company also employs manufacturing workers that are paid by the hour, so you add
an HourlyEmployee to the HR system:
# In hr.py
class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate
The HourlyEmployee class is initialized with id and name, like the base class, plus
the hours_worked and the hour_rate required to calculate the payroll.
The .calculate_payroll() method is implemented by returning the hours worked times
the hour rate.
Finally, the company employs sales associates that are paid through a fixed salary plus a
commission based on their sales, so you create a CommissionEmployee class:
# In hr.py
class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission
You derive CommissionEmployee from SalaryEmployee because both classes have
a weekly_salary to consider. At the same time, CommissionEmployee is initialized with
a commission value that is based on the sales for the employee.
The .calculate_payroll() method gets the fixed salary through super().calculate_payroll()
rather than by reading the weekly_salary property directly. The problem with accessing the
property directly is that if the implementation of SalaryEmployee.calculate_payroll()
changes, then you’ll have to also change the implementation of
CommissionEmployee.calculate_payroll(). It’s better to rely on the
already implemented method in the base class and extend the functionality as needed.
You created your first class hierarchy for the system. The UML diagram of the classes looks
like this:
The diagram shows the inheritance hierarchy of the classes. The derived classes implement
the IPayrollCalculator interface, which is required by the PayrollSystem.
The PayrollSystem.calculate_payroll() implementation requires that
the employee objects passed contain an id, name,
and calculate_payroll() implementation.
Interfaces are represented similarly to classes with the word interface above the interface
name. Interface names are usually prefixed with a capital I.
The application creates its employees and passes them to the payroll system to process
payroll:
# In program.py
import hr
$ python program.py
Calculating Payroll
===================
Payroll for: 1 - John Smith
- Check amount: 1500
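The whole hierarchy can be exercised in one self-contained sketch. The classes are condensed copies of the ones in hr.py; only the first employee (John Smith, 1500) comes from the output above, and the other two employees and their figures are assumptions made for the illustration.

```python
class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name


class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary


class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate


class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        # Reuse the base-class calculation and extend it.
        return super().calculate_payroll() + self.commission


employees = [
    SalaryEmployee(1, "John Smith", 1500),
    HourlyEmployee(2, "Jane Doe", 40, 15),             # assumed sample data
    CommissionEmployee(3, "Kevin Bacon", 1000, 250),   # assumed sample data
]

# Each object answers the same call with its own calculation.
amounts = {e.name: e.calculate_payroll() for e in employees}
for name, amount in amounts.items():
    print(f"Payroll for: {name} - Check amount: {amount}")
```

The loop never checks which kind of employee it is handling; it relies only on every object providing calculate_payroll(), which is the interface the payroll system requires.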