Pandas Functions (1)
Loc and iloc
loc and iloc are used to select rows and columns from a DataFrame, but they differ in how they
reference and access data:
loc:
Stands for "location" and is primarily label-based.
It is used for selecting data by specifying row and column labels or boolean conditions.
iloc:
Stands for "integer location" and is primarily integer-based.
It is used for selecting data by specifying the integer positions of rows and columns.
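Example:
import pandas as pd
# Sample DataFrame with row labels 'x', 'y', 'z' (illustrative data)
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['x', 'y', 'z'])
result_loc = df.loc['x', 'A'] # Selects the value at row 'x' and column 'A'
result_iloc = df.iloc[0, 0] # Selects the value at the first row and first column
print(result_loc) # Output: 1
print(result_iloc) # Output: 1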
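The boolean-condition use of loc, and the way slices behave in each indexer, can be sketched as follows. The DataFrame here is made-up sample data, not taken from the notes. One difference worth knowing: a label slice with loc includes the end label, while a position slice with iloc excludes the end position.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

# Boolean condition with loc: values of column 'B' where column 'A' is greater than 1
filtered = df.loc[df["A"] > 1, "B"]

# Label slice with loc: the end label 'y' is included
loc_slice = df.loc["x":"y", "A"]

# Position slice with iloc: the end position 1 is excluded
iloc_slice = df.iloc[0:1, 0]

print(filtered.tolist())    # [5, 6]
print(loc_slice.tolist())   # [1, 2]
print(iloc_slice.tolist())  # [1]
```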
Bfill and ffill
bfill and ffill are methods in pandas used for filling missing values in a DataFrame or Series with
values from nearby rows. They are often used in data preprocessing when dealing with missing
data.
bfill stands for "backward fill." It fills each missing value with the next valid value below it
(i.e., from a later row in the DataFrame), so values propagate backward into the gap.
ffill stands for "forward fill." It fills each missing value with the last valid value above it (i.e., from
an earlier row in the DataFrame), so values propagate forward into the gap.
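A minimal sketch of both methods on a small Series; the values are invented for illustration. Note that bfill leaves a trailing gap unfilled, because there is no later value to copy from (ffill behaves the same way for a leading gap).

```python
import numpy as np
import pandas as pd

# Hypothetical Series with gaps in the middle and at the end
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan])

# Forward fill: each gap takes the last valid value above it
forward = s.ffill()

# Backward fill: each gap takes the next valid value below it
backward = s.bfill()

print(forward.tolist())   # [1.0, 1.0, 1.0, 4.0, 4.0]
print(backward.tolist())  # [1.0, 4.0, 4.0, 4.0, nan]
```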
Pandas functions
For Data Inspection
1. df.head(n): Display the first n rows of a DataFrame.
2. df.tail(n): Display the last n rows of a DataFrame.
3. df.shape: Get the number of rows and columns in the DataFrame.
4. df.info(): Display a summary of the DataFrame, including column data types and non-null
counts (from which missing values can be inferred).
5. df.describe(): Generate descriptive statistics for numeric columns.
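The inspection calls above can be tried on a small frame like this one. The column names and values are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing score
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "score": [10.0, 20.0, np.nan, 40.0],
})

first_two = df.head(2)      # first 2 rows
last_one = df.tail(1)       # last row
n_rows, n_cols = df.shape   # shape is an attribute, not a method
df.info()                   # prints dtypes and non-null counts per column
summary = df.describe()     # by default, statistics for numeric columns only

print(n_rows, n_cols)                  # 4 2
print(summary.loc["count", "score"])   # 3.0 (the missing value is not counted)
```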
Data Visualization:
1. df.plot(): Create basic plots using Matplotlib.
I/O Operations:
pd.read_csv('file.csv'): Read data from a CSV file.
df.to_csv('file.csv'): Write DataFrame to a CSV file.
Similar functions exist for other file formats like Excel, SQL databases, etc.
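A short round-trip sketch of the CSV functions. To keep it self-contained it writes to an in-memory text buffer instead of 'file.csv'; the data is invented, and index=False is one possible choice that keeps the row index out of the output.

```python
import io

import pandas as pd

# Hypothetical data for the round trip
df = pd.DataFrame({"id": [1, 2], "city": ["Oslo", "Lima"]})

# Write to an in-memory buffer; a file path such as 'file.csv' works the same way
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and read the CSV text back into a new DataFrame
buffer.seek(0)
restored = pd.read_csv(buffer)

print(list(restored.columns))     # ['id', 'city']
print(restored["city"].tolist())  # ['Oslo', 'Lima']
```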
Statistical Functions:
df.mean(), df.median(), df.std(), etc.: Calculate basic statistics for columns.
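As a small illustration with made-up numbers, each of these calls returns one value per column. By default df.std() computes the sample standard deviation (ddof=1).

```python
import pandas as pd

# Hypothetical numeric data
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0],
    "weight": [50.0, 60.0, 100.0],
})

means = df.mean()      # column-wise mean
medians = df.median()  # column-wise median
stds = df.std()        # column-wise sample standard deviation (ddof=1)

print(means["weight"])    # 70.0
print(medians["weight"])  # 60.0
print(stds["height"])     # 10.0
```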