Pandas
After installing pandas on your system, you'll need to import the library.
This module is typically imported under the alias pd, using import pandas as pd.
5. Introducing Pandas Objects
6. What is a Series?
A Pandas Series is a labelled one-dimensional array that can hold any type of data
(integer, string, float, Python objects, and so on).
A Series can be thought of as a single column in an Excel spreadsheet.
Using the Series() constructor, we can easily convert a list, tuple, or dictionary into a
Series.
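As a minimal sketch of the conversions described above, the snippet below builds a Series from a list, a tuple, and a dictionary. The variable names, values, and labels are illustrative assumptions and do not come from the original text.

```python
import pandas as pd

# From a list: pandas assigns the default integer labels 0, 1, 2
from_list = pd.Series([10, 20, 30])

# From a tuple, with explicit labels (the labels here are illustrative)
from_tuple = pd.Series((1.5, 2.5, 3.5), index=["a", "b", "c"])

# From a dictionary: the keys become the labels
from_dict = pd.Series({"x": 100, "y": 200})

print(from_list)
print(from_tuple["b"])           # look up a value by its label
print(from_dict.index.tolist())  # labels taken from the dictionary keys
```

In each case the result is one-dimensional and labelled, which is what makes it comparable to a single spreadsheet column.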
import pandas as pd
# Create the DataFrame
info = pd.DataFrame({"P": [4, 7, 1, 8, 9],
                     "Q": [6, 8, 10, 15, 11],
                     "R": [17, 13, 12, 16, 14],
                     "S": [15, 19, 7, 21, 9]},
                    index=["Parker", "William", "Smith", "Terry", "Phill"])
# Print the DataFrame
print(info)
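Now, we can use the DataFrame.reindex() method to conform the DataFrame to a new set of
row or column labels. Labels that are not present in the original index are filled with NaN.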
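One possible way to apply reindex() to the DataFrame above is sketched here. The new label "Jones" and the chosen row and column orders are assumptions made for illustration only.

```python
import pandas as pd

info = pd.DataFrame({"P": [4, 7, 1, 8, 9],
                     "Q": [6, 8, 10, 15, 11],
                     "R": [17, 13, 12, 16, 14],
                     "S": [15, 19, 7, 21, 9]},
                    index=["Parker", "William", "Smith", "Terry", "Phill"])

# Reorder the rows and request a label that does not exist.
# "Jones" is a hypothetical label; its row is filled with NaN.
reindexed = info.reindex(["Smith", "Parker", "Jones"])
print(reindexed)

# Reindex along the columns instead, keeping only S and P in that order
by_columns = info.reindex(columns=["S", "P"])
print(by_columns)
```

Note that reindex() returns a new DataFrame and leaves info unchanged.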
10. Pandas Sort
There are two kinds of sorting available in Pandas:
By label
By actual value
By Label - The sort_index() method sorts a DataFrame by its labels. The axis argument
selects row labels or column labels, and the ascending argument sets the sorting order.
Row labels are sorted in ascending order by default.
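The two kinds of sorting can be sketched as follows. The small DataFrame is made up for this example; sort_index() covers sorting by label, and sort_values() is the usual pandas method for sorting by actual value.

```python
import pandas as pd

# A small, deliberately unsorted DataFrame (values are illustrative)
df = pd.DataFrame({"Q": [6, 8, 10], "P": [4, 7, 1]},
                  index=["Smith", "Parker", "William"])

# By label: row labels, ascending (the default)
by_row_label = df.sort_index()

# By label: row labels, descending
by_row_label_desc = df.sort_index(ascending=False)

# By label: column labels (axis=1)
by_col_label = df.sort_index(axis=1)

# By actual value: order the rows by the values in column P
by_value = df.sort_values(by="P")

print(by_row_label)
print(by_col_label)
print(by_value)
```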
11. Working with Text Data
Working with string data is made simple by a set of string functions that are part of
Pandas.
Most importantly, these functions ignore (or exclude) missing/NaN values.
Let us look at each operation to see how it behaves.
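A brief sketch of a few string functions, reached through the .str accessor of a Series, is given below. The sample names are assumptions; the point is that the missing entry stays missing and does not raise an error.

```python
import numpy as np
import pandas as pd

# Sample text data with one missing value (the names are illustrative)
names = pd.Series(["Parker", "william", np.nan, " Terry "])

upper = names.str.upper()     # convert each string to upper case
lengths = names.str.len()     # number of characters in each string
stripped = names.str.strip()  # remove leading and trailing whitespace

print(upper)
print(lengths)
print(stripped)
```

In every result the third entry is still missing, because the string functions skip NaN values instead of failing on them.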
import pandas as pd
# Dataset
data = {
    'Maths': [90, 85, 98, 80, 55, 78],
    'Science': [92, 87, 59, 64, 87, 96],
    'English': [95, 94, 84, 75, 67, 65]
}
# DataFrame
df = pd.DataFrame(data)
# Display the DataFrame
print("DataFrame = \n",df)
# Display the Sum of Marks in each column
print("\nSum = \n",df.sum())
print("\nCount of non-empty values = \n", df.count())
print("\nMaximum Marks = \n", df.max())
print("\nMinimum Marks = \n", df.min())
print("\nMedian = \n",df.median())
import pandas as pd
# Dataset
data = {
'Maths': [90, 85, 98, None, 55, 78],
'Science': [92, 87, 59, None, None, 96],
'English': [95, None, 84, 75, 67, None]
}
# DataFrame
df = pd.DataFrame(data)
# Display the DataFrame
print("DataFrame = \n", df)
# Display the Count of non-empty values in each column
print("\nCount of non-empty values = \n", df.count())
13. Indexing and Selecting Data
In Pandas, indexing means selecting specific rows and columns of data from a DataFrame.
The selection may cover all the rows and some of the columns, some of the rows and all
the columns, or a portion of both the rows and the columns.
Another term for indexing is subset selection.
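The three selection patterns just described can be sketched as below. The DataFrame reuses the earlier sample values, and the use of square brackets, .loc (label-based), and .iloc (position-based) is one common way to write these selections, chosen here for illustration.

```python
import pandas as pd

info = pd.DataFrame({"P": [4, 7, 1, 8, 9],
                     "Q": [6, 8, 10, 15, 11],
                     "R": [17, 13, 12, 16, 14],
                     "S": [15, 19, 7, 21, 9]},
                    index=["Parker", "William", "Smith", "Terry", "Phill"])

# All the rows, some of the columns
all_rows_some_cols = info[["P", "R"]]

# Some of the rows, all the columns (selected by label)
some_rows_all_cols = info.loc[["Parker", "Smith"]]

# A portion of the rows and of the columns (selected by label)
subset = info.loc[["Parker", "Smith"], ["P", "R"]]

# The same idea by integer position: first two rows, first two columns
by_position = info.iloc[0:2, 0:2]

print(all_rows_some_cols)
print(some_rows_all_cols)
print(subset)
print(by_position)
```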
Pandas supports three types of multi-axis indexing