UNIT 3 (Chapter 2) Pandas
Data Manipulation with Pandas – Data indexing and selection, Operating on data,
Missing data, Hierarchical indexing, Combining Datasets, Aggregation and
Grouping, Pivot Tables.
Introduction to Pandas
What is Pandas?
Pandas is a Python library used for working with data sets.
It has functions for analyzing, cleaning, exploring, and manipulating data.
The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library
was created by Wes McKinney in 2008.
Why Use Pandas?
Pandas allows us to analyze large data sets and draw conclusions based on statistical theory.
Pandas can clean messy data sets, and make them readable and relevant.
Relevant data is very important in data science.
Installation of Pandas
If Python and pip are already installed on the system, installing Pandas is very easy.
Install it using this command:
C:\Users\Your Name>pip install pandas
If this command fails, use a Python distribution that already has Pandas installed, such as
Anaconda.
Import Pandas
Once Pandas is installed, import it in your applications by adding the import keyword:
import pandas
Example
import pandas
mydataset = {'cars': ["BMW", "Volvo", "Ford"], 'passings': [3, 7, 2]}
myvar = pandas.DataFrame(mydataset)
print(myvar)
UNIT-3 Python for Data Handling (Chapter-2 Pandas)
Pandas as pd
Pandas is usually imported under the pd alias.
alias: In Python, an alias is an alternate name for referring to the same thing.
Create an alias with the as keyword while importing:
import pandas as pd
Example
import pandas as pd
mydataset = {'cars': ["BMW", "Volvo", "Ford"], 'passings': [3, 7, 2]}
myvar = pd.DataFrame(mydataset)
print(myvar)
Checking Pandas Version
The version string is stored in the __version__ attribute.
Example
import pandas as pd
print(pd.__version__)
Pandas Objects
Pandas objects can be thought of as enhanced versions of NumPy structured arrays in which the
rows and columns are identified with labels rather than simple integer indices.
Three fundamental Pandas data structures:
1. Series
2. DataFrame
3. Index.
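A Series is a one-dimensional array of indexed data. As a minimal sketch, assuming the Series is built from a plain Python list (the variable name data matches the snippets that follow), it could be created and printed like this:

```python
import pandas as pd

# Build a Series from a list; pandas supplies the default integer index 0..3
data = pd.Series([0.25, 0.5, 0.75, 1.0])
print(data)
```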
Output:
0 0.25
1 0.50
2 0.75
3 1.00
The Series wraps both a sequence of values and a sequence of indices, which we can access with
the values and index attributes.
print(data.values)
print(data.index)
Output:
[0.25 0.5 0.75 1. ]
RangeIndex(start=0, stop=4, step=1)
The essential difference between a NumPy one-dimensional array and a Pandas Series is the
presence of the index: while the NumPy array has an implicitly defined integer index used to
access the values, the Pandas Series has an explicitly defined index associated with the values.
This explicit index definition gives the Series object additional capabilities. For example, the
index need not be an integer, but can consist of values of any desired type.
Example:
data = pd.Series([0.25, 0.5, 0.75, 1.0],index=['a', 'b', 'c', 'd'])
print(data)
Output:
a 0.25
b 0.50
c 0.75
d 1.00
We can even use non-contiguous or non-sequential indices:
Example:
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=[2, 5, 3, 7])
print(data)
Output:
2 0.25
5 0.50
3 0.75
7 1.00
Example:
import pandas as pd
import numpy as np
arr=np.arange(10,60,10)
li=[10,20,30,40,50]
s=10
dic={'Ten':10,'Twenty':20,'Thirty':30,'Forty':40,'Fifty':50}
ser1 = pd.Series(arr) # A one-dimensional ndarray
ser2 = pd.Series(li) # A Python list
ser3 = pd.Series(s) # A scalar value
ser4 = pd.Series(s,index=['a','b','c','d','e']) # A scalar value with an explicit index
ser5 = pd.Series(dic) # A Python dictionary
print('Numpy 1-D array is converted into Pandas Series:')
print(ser1)
print('--------------------------------------------------')
print('Python list is converted into Pandas Series:')
print(ser2)
print('--------------------------------------------------')
print('Scalar Value is converted into Pandas Series:')
print(ser3)
print('--------------------------------------------------')
print('Scalar Value is converted into Pandas Series with explicit indexing:')
print(ser4)
print('--------------------------------------------------')
print('Python dictionary is converted into Pandas Series with explicit indexing:')
print(ser5)
Output:
Numpy 1-D array is converted into Pandas Series:
0 10
1 20
2 30
3 40
4 50
dtype: int32
--------------------------------------------------
Python list is converted into Pandas Series:
0 10
1 20
2 30
3 40
4 50
dtype: int64
--------------------------------------------------
Scalar Value is converted into Pandas Series:
0 10
dtype: int64
--------------------------------------------------
Scalar Value is converted into Pandas Series with explicit indexing:
a 10
b 10
c 10
d 10
e 10
dtype: int64
--------------------------------------------------
Python dictionary is converted into Pandas Series with explicit indexing:
Ten 10
Twenty 20
Thirty 30
Forty 40
Fifty 50
dtype: int64
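Example:
#Pandas DataFrame
import pandas as pd
print('Data Frame:')
d=pd.DataFrame([[10,20],[30,40],[50,60]])
print(d)
d=pd.DataFrame([[10,20],[30,40],[50,60]],index=['row1','row2','row3'])
print('=============================================================')
print('Data Frame with explicit indexing for row:')
print(d)
d=pd.DataFrame([[10,20],[30,40],[50,60]],index=['row1','row2','row3'],columns=['col1','col2'])
print('=============================================================')
print('Data Frame with explicit indexing for rows and columns:')
print(d)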
Output:
Data Frame:
0 1
0 10 20
1 30 40
2 50 60
=============================================================
Data Frame with explicit indexing for row:
0 1
row1 10 20
row2 30 40
row3 50 60
=============================================================
Data Frame with explicit indexing for rows and columns:
col1 col2
row1 10 20
row2 30 40
row3 50 60
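A DataFrame can also be constructed from a single Series, in which case the Series index becomes the row index. The sketch below assumes the marks are held in a Series named marks (a name chosen here for illustration) and yields a one-column DataFrame like the one in the output that follows:

```python
import pandas as pd

# Assumed input: a Series of marks labelled by student name
marks = pd.Series([89, 78, 67, 96], index=['Kumar', 'Rao', 'Ali', 'Singh'])

# The dictionary key becomes the column name; the Series index becomes the row index
df = pd.DataFrame({'Marks': marks})
print(df)
```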
Output:
Marks
Kumar 89
Rao 78
Ali 67
Singh 96
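A dictionary that maps column names to values gives a multi-column DataFrame. This sketch assumes a dictionary of lists plus an explicit row index; a dictionary of Series objects would work in a similar way. It produces a table of the same shape as the Branch/Address output shown next:

```python
import pandas as pd

# Assumed construction: each dictionary key is a column, each list holds that column's values
df = pd.DataFrame(
    {'Branch': ['CSE', 'EEE', 'MECH'], 'Address': ['SAP', 'NRT', 'GNT']},
    index=['Sajid', 'Wahid', 'Hafeez'],
)
print(df)
```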
Output:
Branch Address
Sajid CSE SAP
Wahid EEE NRT
Hafeez MECH GNT
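A two-dimensional NumPy array can be wrapped in a DataFrame as well. In this sketch the array contents and row labels are assumptions chosen to match the output that follows; because no column names are given, pandas falls back to the integer labels 0, 1, 2:

```python
import numpy as np
import pandas as pd

# A 2 x 3 array holding the values 10..15
arr = np.arange(10, 16).reshape(2, 3)

# Only row labels are supplied, so the columns get default integer names
df = pd.DataFrame(arr, index=['row1', 'row2'])
print(df)
```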
Output:
0 1 2
row1 10 11 12
row2 13 14 15
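A NumPy structured array already carries field names and per-field types, so pandas can use the field names as column names. The following is a sketch under the assumption that the array has an integer field A and a float field B, which is consistent with the output shown next:

```python
import numpy as np
import pandas as pd

# Structured array with an 8-byte integer field 'A' and an 8-byte float field 'B'
arr = np.zeros(3, dtype=[('A', 'i8'), ('B', 'f8')])

df = pd.DataFrame(arr)                 # field names become column names
df.index = ['row1', 'row2', 'row3']    # replace the default integer row labels
print(df)
```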
Output:
A B
row1 0 0.0
row2 0 0.0
row3 0 0.0
Example:
import pandas as pd
rind = pd.Index(['row1','row2','row3','row4'])
cind =pd.Index(['col1'])
ser = pd.Series([100,200,300,400],index=rind)
df = pd.DataFrame(ser,columns=cind)
print(df)
Output:
col1
row1 100
row2 200
row3 300
row4 400
Example:
#Index Object
import pandas as pd
rind=pd.Index(['row1','row2','row3'])
cind=['col1','col2']
data1=pd.DataFrame([[10,20],[30,40],[50,60]],rind,cind)
data2=pd.DataFrame([[1,2],[3,4],[5,6]],rind,cind)
data3=pd.DataFrame([[100,200],[300,400],[500,600]],rind,cind)
print(data1)
print("--------------------------")
print(data2)
print("--------------------------")
print(data3)
Output:
col1 col2
row1 10 20
row2 30 40
row3 50 60
--------------------------
col1 col2
row1 1 2
row2 3 4
row3 5 6
--------------------------
col1 col2
row1 100 200
row2 300 400
row3 500 600
Index Preservation:
#Operating on Data in pandas
#index preservation in series and dataframe
import numpy as np
import pandas as pd
s=pd.Series([10,20,30,40])
print('Series:')
print(s)
df=pd.DataFrame(np.arange(1,13,1).reshape(3,4))
print('DataFrame:')
print(df)
print("===================================================")
print("Adding 5 to each element of the Series")
print(np.add(s,5))
print("===================================================")
print("Adding 10 to each element of the DataFrame")
print(np.add(df,10))
print('================================================')
print('Trigonometric function sin applied on Series:')
print(np.sin(s))
print('Logarithmic function applied on the first element of the DataFrame:')
print(np.log(df[0][0]))
Output:
Series:
0 10
1 20
2 30
3 40
dtype: int64
DataFrame:
0 1 2 3
0 1 2 3 4
1 5 6 7 8
2 9 10 11 12
===================================================
Adding 5 to each element of the Series
0 15
1 25
2 35
3 45
dtype: int64
===================================================
Adding 10 to each element of the DataFrame
0 1 2 3
0 11 12 13 14
1 15 16 17 18
2 19 20 21 22
================================================
Trigonometric function sin applied on Series:
0 -0.544021
1 0.912945
2 -0.988032
3 0.745113
dtype: float64
Logarithmic function applied on the first element of the DataFrame:
0.0
Example:
#Index Alignment in Series
import numpy as np
import pandas as pd
A=pd.Series([2,4,6],index=[0,1,2])
B=pd.Series([1,3,5],index=[1,2,3])
print(A.add(B))
print("===========================================================")
print("Fill value for any elements in A or B that might be missing")
print(A.add(B,fill_value=0))
Output:
0 NaN
1 5.0
2 9.0
3 NaN
dtype: float64
===========================================================
Fill value for any elements in A or B that might be missing
0 2.0
1 5.0
2 9.0
3 5.0
dtype: float64
Example:
#Index Alignment in DataFrame
import numpy as np
import pandas as pd
A=pd.DataFrame(np.arange(1,5,1).reshape(2,2), columns=list('AB'))
B=pd.DataFrame(np.arange(1,10,1).reshape(3,3), columns=list('BAC'))
print("DataFrame A:")
print("-------------------")
print(A)
print("DataFrame B:")
print("-------------------")
print(B)
print("Addition of DataFrame A and B:")
print("-----------------------------")
print(A.add(B))
Output:
DataFrame A:
-------------------
A B
0 1 2
1 3 4
DataFrame B:
-------------------
B A C
0 1 2 3
1 4 5 6
2 7 8 9
Addition of DataFrame A and B:
-----------------------------
A B C
0 3.0 3.0 NaN
1 8.0 8.0 NaN
2 NaN NaN NaN
Example:
#Operations between DataFrame and Series
import pandas as pd
s=pd.Series([10,20])
df=pd.DataFrame([[100,200],[300,400]])
print("Series:")
print('----------')
print(s)
print("\nDataFrame:")
print('-----------')
print(df)
print("\nSubtraction of DataFrame with Series:")
print("-------------------------------------")
print(df.subtract(s))
print("\nSubtraction of DataFrame with Series at Axis=0: ")
print("-------------------------------------")
print(df.subtract(s, axis=0))
Output:
Series:
----------
0 10
1 20
dtype: int64
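DataFrame:
-----------
0 1
0 100 200
1 300 400
Subtraction of DataFrame with Series:
-------------------------------------
0 1
0 90 180
1 290 380
Subtraction of DataFrame with Series at Axis=0:
-------------------------------------
0 1
0 90 190
1 280 380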
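Missing Data
Example:
#NaN in NumPy arrays
import numpy as np
x=np.array([1,2,np.nan,4])
print('x=',x,'\n')
print('Sum of elements in numpy array x:')
print('sum(x)=',np.nansum(x))
y=np.array([10,20,30,np.nan])
print('----------------------------------------')
print('y=',y,'\n')
print('Sum of elements in numpy array y:')
print('sum(y)=',np.nansum(y))
print('----------------------------------------')
print('Addition of Numpy Array x and y:')
print(x+y)
Output:
x= [ 1. 2. nan 4.]
Sum of elements in numpy array x:
sum(x)= 7.0
----------------------------------------
y= [10. 20. 30. nan]
Sum of elements in numpy array y:
sum(y)= 60.0
----------------------------------------
Addition of Numpy Array x and y:
[11. 22. nan nan]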
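Example:
import numpy as np
import pandas as pd
s_nan = pd.Series([np.nan, np.nan, np.nan]) # assumed input: any all-NaN data gives the output below
print(s_nan)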
Output:
0 NaN
1 NaN
2 NaN
dtype: float64
For numeric columns, None is converted to NaN when a DataFrame or Series containing None is
created, or when None is assigned to an element.
Example:
import pandas as pd
s_none_float = pd.Series([None, 10, 20])
print(s_none_float)
Output:
0 NaN
1 10.0
2 20.0
dtype: float64
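For non-numeric data held with the object dtype, None is kept as it is rather than being converted to NaN. The sketch below is an assumption about how the next output was produced: the variable name is hypothetical, and dtype=object is passed explicitly so the behaviour does not depend on the string dtype that a given pandas version infers by default.

```python
import pandas as pd

# With dtype=object, None stays a Python None instead of becoming NaN
s_none_object = pd.Series([None, 'abc', 'xyz'], dtype=object)
print(s_none_object)
```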
Output:
0 None
1 abc
2 xyz
dtype: object
Example:
import pandas as pd
s_none_float = pd.Series([None, 10, 20])
print(s_none_float)
print('--------------------------------')
print(s_none_float.isnull())
print('------------------------------------------')
print(s_none_float.notnull())
Output:
0 NaN
1 10.0
2 20.0
dtype: float64
--------------------------------
0 True
1 False
2 False
dtype: bool
------------------------------------------
0 False
1 True
2 True
dtype: bool
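Dropping null values: the dropna() method returns a new object with the null entries removed, leaving the original unchanged. A minimal sketch, assuming the same Series as in the previous example, that produces the output shown next:

```python
import pandas as pd

s_none_float = pd.Series([None, 10, 20])
print(s_none_float)
print('------------------------------------------')
print('Null Values dropped from the series:')

# dropna() keeps the original index labels of the surviving rows
dropped = s_none_float.dropna()
print(dropped)
```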
Output:
0 NaN
1 10.0
2 20.0
dtype: float64
------------------------------------------
Null Values dropped from the series:
1 10.0
2 20.0
dtype: float64
Filling null values:
The fillna() method replaces null (NaN) values with a specified value.
Example:
import numpy as np
import pandas as pd
ser = pd.Series([np.nan, 10, 20,30])
print(ser)
print('-------------------------------')
print('Series Null Values are filled with 0:')
print(ser.fillna(0))
Output:
0 NaN
1 10.0
2 20.0
3 30.0
dtype: float64
-------------------------------
Series Null Values are filled with 0:
0 0.0
1 10.0
2 20.0
3 30.0
dtype: float64
ffill():
The ‘ffill’ method fills the missing value with the last valid value before that missing value in
the data sequence.
Example:
import numpy as np
import pandas as pd
ex1 = pd.Series([1,3,np.nan,4])
print(ex1.ffill())
Older code may use ex1.fillna(method='ffill') for the same purpose; that form is deprecated in recent pandas versions, so ffill() is preferred.
Output:
0 1.0
1 3.0
2 3.0
3 4.0
dtype: float64
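bfill():
The counterpart of ffill is bfill, which fills a missing value with the next valid value after it in the sequence. A minimal sketch, assuming the same Series ex1 as above, that gives the output shown next:

```python
import numpy as np
import pandas as pd

ex1 = pd.Series([1, 3, np.nan, 4])

# The NaN at position 2 is replaced by the following valid value, 4.0
filled = ex1.bfill()
print(filled)
```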
Output:
0 1.0
1 3.0
2 4.0
3 4.0
dtype: float64
Hierarchical Indexing
Hierarchical indexing (also known as multi-indexing) is used to incorporate multiple index
levels within a single index.
In this way, higher-dimensional data can be compactly represented within the familiar
one-dimensional Series and two-dimensional DataFrame objects.
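One straightforward way to build such an index is to pass a list of index arrays, one per level. The sketch below assumes two levels (the level names index1 and index2 are taken from the output that follows; the variable names are chosen here for illustration):

```python
import pandas as pd

# Two parallel arrays: the outer level (1, 2) and the inner level (a, b, c)
ind = [[1, 1, 1, 2, 2, 2], ['a', 'b', 'c', 'a', 'b', 'c']]
ser = pd.Series([10, 20, 30, 40, 50, 60], index=ind)
print(ser)
print('----------------------------------------')

# Give each index level a name so it is shown in the printed header
ser.index.names = ['index1', 'index2']
print(ser)
```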
Output:
1 a 10
b 20
c 30
2 a 40
b 50
c 60
dtype: int64
----------------------------------------
index1 index2
1 a 10
b 20
c 30
2 a 40
b 50
c 60
dtype: int64
Example 2:
#Multiply Indexed DataFrame
import numpy as np
import pandas as pd
data = [[25,24],[28,26],[29,28],[27,26],[30,29],[28,27]]
ind = [['1201','1201','1264','1264','12C7','12C7'],['mid1','mid2','mid1','mid2','mid1','mid2']]
col = ['DS','DP']
df = pd.DataFrame(data,index=ind,columns=col)
print(df)
print('--------------------------------------------------')
df.index.names =['RollNo','Mid Result']
print(df)
Output:
DS DP
1201 mid1 25 24
mid2 28 26
1264 mid1 29 28
mid2 27 26
12C7 mid1 30 29
mid2 28 27
--------------------------------------------------
DS DP
RollNo Mid Result
1201 mid1 25 24
mid2 28 26
1264 mid1 29 28
mid2 27 26
12C7 mid1 30 29
mid2 28 27
Combining Datasets
Some of the most interesting studies of data come from combining different data sources.
These operations can involve anything from very straightforward concatenation of two
different datasets to more complicated database-style joins and merges that correctly handle
any overlaps between the datasets.
These operations can be:
simple concatenation of Series and DataFrames with the pd.concat function
in-memory merges and joins implemented in Pandas with the pd.merge function.
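Example-1: as a minimal sketch of simple concatenation, assuming two small Series of letters that share the same index labels (the names ser1 and ser2 are chosen here for illustration), pd.concat stacks them end to end and keeps the original labels, duplicates included:

```python
import pandas as pd

ser1 = pd.Series(['A', 'B', 'C'], index=[1, 2, 3])
ser2 = pd.Series(['D', 'E', 'F'], index=[1, 2, 3])

print('Concatenation of Series 1 and Series 2:')
# The index labels 1, 2, 3 appear twice in the result; they are not renumbered
result = pd.concat([ser1, ser2])
print(result)
```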
Output:
Concatenation of Series 1 and Series 2:
1 A
2 B
3 C
1 D
2 E
3 F
dtype: object
Example-2:
#Combining Datasets
#Concatenation in DataFrame
import pandas as pd
df1 =pd.DataFrame([[10,20],[30,40]],index=[1,2],columns=['A','B'])
df2 =pd.DataFrame([[50,60],[70,80]],index=[1,2],columns=['A','B'])
print(df1)
print('---------------------------')
print(df2)
print('---------------------------')
print(pd.concat([df1, df2]))
Output:
A B
1 10 20
2 30 40
---------------------------
A B
1 50 60
2 70 80
---------------------------
A B
1 10 20
2 30 40
1 50 60
2 70 80
By default, the concatenation takes place row-wise within the DataFrame (i.e., axis=0). Like
np.concatenate, pd.concat allows specification of an axis along which concatenation will take
place.
Example-3:
#Axis wise Concatenation in DataFrame
import pandas as pd
df1 =pd.DataFrame([[10,20],[30,40]],index=[1,2],columns=['A','B'])
df2 =pd.DataFrame([[50,60],[70,80]],index=[1,2],columns=['A','B'])
print(df1)
print('-------------------------------------')
print(df2)
print('-------------------------------------')
print(pd.concat([df1, df2],axis=1))
Output:
A B
1 10 20
2 30 40
-------------------------------------
A B
1 50 60
2 70 80
-------------------------------------
A B A B
1 10 20 50 60
2 30 40 70 80
Example-4:
#Axis wise Concatenation in DataFrame
import pandas as pd
df1 =pd.DataFrame([[10,20],[30,40]],index=[1,2],columns=['A','B'])
df2 =pd.DataFrame([[50,60],[70,80]],index=[3,4],columns=['C','D'])
print(df1)
print('-------------------------------------')
print(df2)
print('-------------------------------------')
print(pd.concat([df1, df2],axis=1))
Output:
A B
1 10 20
2 30 40
-------------------------------------
C D
3 50 60
4 70 80
-------------------------------------
A B C D
1 10.0 20.0 NaN NaN
2 30.0 40.0 NaN NaN
3 NaN NaN 50.0 60.0
4 NaN NaN 70.0 80.0
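append()
In pandas versions before 2.0, Series and DataFrame objects had an append method that accomplished
the same concatenation in fewer keystrokes: rather than calling pd.concat([df1, df2]), we could call
df1.append(df2). The method was deprecated in pandas 1.4 and removed in pandas 2.0, so current code
should use pd.concat instead:
print(df1)
print(df2)
print(pd.concat([df1, df2]))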
The pd.merge function implements several types of joins:
One-to-one joins
Many-to-one joins
Many-to-many joins
One-to-one joins
The simplest type of merge expression is the one-to-one join, which is in many ways very similar
to the column-wise concatenation.
Example:
#Merging Data Frames
#one to one join
import pandas as pd
df1 = pd.DataFrame({'employee': ['Bob', 'Jake', 'Lisa', 'Sue'], 'group': ['Accounting',
'Engineering', 'Engineering', 'HR']})
df2 = pd.DataFrame({'employee': ['Lisa', 'Bob', 'Jake', 'Sue'], 'hire_date': [2004, 2008, 2012,
2014]})
print(df1)
print('-------------------------------')
print(df2)
print('-------------------------------')
df3=pd.merge(df1,df2)
print(df3)
Output:
employee group
0 Bob Accounting
1 Jake Engineering
2 Lisa Engineering
3 Sue HR
-------------------------------
employee hire_date
0 Lisa 2004
1 Bob 2008
2 Jake 2012
3 Sue 2014
-------------------------------
employee group hire_date
0 Bob Accounting 2008
1 Jake Engineering 2012
2 Lisa Engineering 2004
3 Sue HR 2014
Many-to-one joins
Many-to-one joins are joins in which one of the two key columns contains duplicate entries. For
the many-to-one case, the resulting DataFrame will preserve those duplicate entries as
appropriate.
Example:
#Many to one join
import pandas as pd
df1 = pd.DataFrame({'employee': ['Bob', 'Jake', 'Lisa', 'Sue'], 'group': ['Accounting',
'Engineering', 'Engineering', 'HR']})
df2 = pd.DataFrame({'employee': ['Lisa', 'Bob', 'Jake', 'Sue'], 'hire_date': [2004, 2008, 2012,
2014]})
df3=pd.merge(df1,df2)
print(df3)
print('-------------------------------')
df4 = pd.DataFrame({'group': ['Accounting', 'Engineering', 'HR'], 'supervisor': ['Carly', 'Guido',
'Steve']})
print(pd.merge(df3,df4))
Output:
employee group hire_date
0 Bob Accounting 2008
1 Jake Engineering 2012
2 Lisa Engineering 2004
3 Sue HR 2014
-------------------------------
employee group hire_date supervisor
0 Bob Accounting 2008 Carly
1 Jake Engineering 2012 Guido
2 Lisa Engineering 2004 Guido
3 Sue HR 2014 Steve
The resulting DataFrame has an additional column with the “supervisor” information, where the
information is repeated in one or more locations as required by the inputs.
Many-to-many joins
Many-to-many joins are a bit confusing conceptually, but are nevertheless well defined. If the
key column in both the left and right DataFrames contains duplicates, then the result is a
many-to-many merge.
Example:
import pandas as pd
df1 = pd.DataFrame({'employee': ['Bob', 'Jake', 'Lisa', 'Sue'], 'group': ['Accounting',
'Engineering', 'Engineering', 'HR']})
df5 = pd.DataFrame({'group': ['Accounting', 'Accounting', 'Engineering', 'Engineering', 'HR',
'HR'], 'skills': ['math', 'spreadsheets', 'coding', 'linux', 'spreadsheets', 'organization']})
df6=pd.merge(df1,df5)
print(df6)
Output:
employee group skills
0 Bob Accounting math
1 Bob Accounting spreadsheets
2 Jake Engineering coding
3 Jake Engineering linux
4 Lisa Engineering coding
5 Lisa Engineering linux
6 Sue HR spreadsheets
7 Sue HR organization
Aggregation and Grouping
Example:
#Aggregation in pandas
import pandas as pd
ser1=pd.Series([1,2,3,4,5])
print('Mean Value of Series:')
print(ser1.mean())
print('-----------------------')
print('Minimum Value of the Series:')
print(ser1.min())
print('-----------------------')
print('Maximum Value of the Series:')
print(ser1.max())
df=pd.DataFrame([[1,2,3],[4,5,6]])
print('-----------------------')
print('Maximum Value of the DataFrame:')
print(df.max())
Output:
Mean Value of Series:
3.0
-----------------------
Minimum Value of the Series:
1
-----------------------
Maximum Value of the Series:
5
-----------------------
Maximum Value of the DataFrame:
0 4
1 5
2 6
dtype: int64
Pandas Series and DataFrames include all of the common aggregates. In addition, there is a
convenience method describe() that computes several common aggregates for each column and
returns the result.
Example:
#Describe function
import pandas as pd
df=pd.DataFrame([[1,2,3],[4,5,6]])
print(df.describe())
Output:
0 1 2
count 2.00000 2.00000 2.00000
mean 2.50000 3.50000 4.50000
std 2.12132 2.12132 2.12132
min 1.00000 2.00000 3.00000
25% 1.75000 2.75000 3.75000
50% 2.50000 3.50000 4.50000
75% 3.25000 4.25000 5.25000
max 4.00000 5.00000 6.00000
Example:
#group by function
import pandas as pd
import numpy as np
df = pd.DataFrame({'key':['A','B','C','A','B','C'],'data':np.arange(1,7)},columns=['key','data'])
print(df)
print('----------------------------------------')
print('Applying group by function on data frame:')
print(df.groupby('key').sum())
Output:
key data
0 A 1
1 B 2
2 C 3
3 A 4
4 B 5
5 C 6
----------------------------------------
Applying group by function on data frame:
data
key
A 5
B 7
C 9
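To see what groupby does behind the scenes, the same result can be built by hand: split the rows by key, apply the aggregate to each piece, and combine the pieces. The sketch below is only an illustration of that idea on the same data (the names manual and grouped are chosen here), not a description of how pandas implements it internally:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'], 'data': np.arange(1, 7)})

# Iterating over a GroupBy object yields (key, sub-DataFrame) pairs: the "split" step.
# Summing each piece is the "apply" step; collecting into a Series is the "combine" step.
manual = pd.Series({key: group['data'].sum() for key, group in df.groupby('key')})

# The built-in aggregation performs all three steps in one call
grouped = df.groupby('key')['data'].sum()

print(manual)
print(grouped)
```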
Pivot Tables
A pivot table is an operation similar to GroupBy that is commonly seen in spreadsheets and other
programs that operate on tabular data.
The pivot table takes simple column-wise data as input and groups the entries into a
two-dimensional table that provides a multidimensional summarization of the data.
We can think of pivot tables as essentially a multidimensional version of GroupBy aggregation:
we still split-apply-combine, but both the split and the combine happen across a
two-dimensional grid rather than a one-dimensional index.
Pivot Table Syntax: The main arguments of the pd.pivot_table function are as follows (the
DataFrame.pivot_table method takes the same arguments except data):
pd.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean',
fill_value=None, margins=False, dropna=True, margins_name='All')
where
data : the pandas DataFrame to summarize
index : column(s) whose values define the row groups
values : column(s) to aggregate
columns : column(s) whose values are spread horizontally across the top of the resulting table
aggfunc : the type of aggregation applied, which is the mean by default
fill_value, dropna : control how missing data is handled
margins : if True, computes totals along each grouping
margins_name : the label used for the totals row and column when margins=True
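The effect of fill_value, margins and margins_name is easiest to see on a tiny table. The following sketch uses a hypothetical sales DataFrame invented purely for illustration (it is not part of the course data) and sums the amounts per region and product:

```python
import pandas as pd

# Hypothetical sales records, invented for illustration only
sales = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South'],
    'product': ['pen', 'ink', 'pen', 'pen'],
    'amount': [10, 20, 30, 50],
})

table = pd.pivot_table(
    data=sales,
    values='amount',        # column to aggregate
    index='region',         # row groups
    columns='product',      # column groups
    aggfunc='sum',          # override the default mean
    fill_value=0,           # South sold no ink, so that cell is filled with 0
    margins=True,           # add a totals row and a totals column
    margins_name='Total',   # label for the totals
)
print(table)
```

Without fill_value, the empty South/ink cell would appear as NaN; without margins=True, the Total row and column would be absent.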
Example:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Name':['Kumar','Rao','Ali','Singh'],
'Job':['FullTimeEmployee','Intern','PartTime Employee','FullTimeEmployee'],
'Dept':['Admin','Tech','Admin','management'],
'YOJ':[2018,2019,2018,2010],'Sal':[20000,50000,10000,20000]})
print(df.to_string())
output = pd.pivot_table(data=df,index=['Job'],columns = ['Dept'],values ='Sal',aggfunc ='mean')
print('\n-------------------------------------------------------\n')
print(output.to_string())
Output:
Name Job Dept YOJ Sal
0 Kumar FullTimeEmployee Admin 2018 20000
1 Rao Intern Tech 2019 50000
2 Ali PartTime Employee Admin 2018 10000
3 Singh FullTimeEmployee management 2010 20000
-------------------------------------------------------
Dept Admin Tech management
Job
FullTimeEmployee 20000.0 NaN 20000.0
Intern NaN 50000.0 NaN
PartTime Employee 10000.0 NaN NaN