0% found this document useful (0 votes)
16 views

Pandas_Interview_Questions_PowerBI_SQL

The document presents a series of Pandas interview questions with real-time scenarios relevant to Power BI and SQL. Each scenario includes a question and a corresponding code solution using Pandas for tasks such as data preprocessing, SQL query replication, dataset merging, time intelligence calculations, and handling large datasets. It also covers automating data transformation from Excel to SQL, demonstrating practical applications of Pandas in data analysis and manipulation.

Uploaded by

verma.anil0509
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views

Pandas_Interview_Questions_PowerBI_SQL

The document presents a series of Pandas interview questions with real-time scenarios relevant to Power BI and SQL. Each scenario includes a question and a corresponding code solution using Pandas for tasks such as data preprocessing, SQL query replication, dataset merging, time intelligence calculations, and handling large datasets. It also covers automating data transformation from Excel to SQL, demonstrating practical applications of Pandas in data analysis and manipulation.

Uploaded by

verma.anil0509
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

Pandas Interview Questions with Real-time Scenarios for Power BI and SQL

1. Scenario: Data Preprocessing before Power BI Import

Q: You're working with a CSV dataset that contains sales data. Some rows have missing values in the ProductName

and Revenue columns. Before importing into Power BI, how would you handle this using Pandas?

A:

import pandas as pd

df = pd.read_csv('sales.csv')

df = df[df['ProductName'].notna()]

df['Revenue'] = df['Revenue'].fillna(0)

2. Scenario: SQL to Pandas Conversion

Q: You have this SQL query:

SELECT Region, SUM(SalesAmount) FROM Sales GROUP BY Region HAVING SUM(SalesAmount) > 10000

How would you replicate it in Pandas?

A:

df_grouped = df.groupby('Region')['SalesAmount'].sum().reset_index()

df_filtered = df_grouped[df_grouped['SalesAmount'] > 10000]

3. Scenario: Merging Datasets like SQL Joins

Q: You have two dataframes: orders and customers. How would you perform a LEFT JOIN in Pandas?

A:

df_joined = pd.merge(orders, customers, how='left', on='CustomerID')

4. Scenario: Time Intelligence like DAX

Q: How would you calculate YoY Sales using Pandas?


A:

df['OrderDate'] = pd.to_datetime(df['OrderDate'])

df['Year'] = df['OrderDate'].dt.year

df['Month'] = df['OrderDate'].dt.month

monthly_sales = df.groupby(['Year', 'Month'])['Sales'].sum().reset_index()

monthly_sales['Sales_PY'] = monthly_sales['Sales'].shift(12)

monthly_sales['YoY_Growth'] = ((monthly_sales['Sales'] - monthly_sales['Sales_PY']) / monthly_sales['Sales_PY']) * 100

5. Scenario: Handling Large Datasets like Fact Tables

Q: You have a large dataset (10 million rows) in CSV format. What Pandas techniques would you use?

A:

chunks = pd.read_csv('large_sales.csv', chunksize=500000)

filtered_chunks = []

for chunk in chunks:

chunk = chunk[chunk['Sales'] > 1000]

filtered_chunks.append(chunk)

df_filtered = pd.concat(filtered_chunks)

6. Scenario: Replace Calculated Columns in Power BI with Pandas

Q: How would you calculate Profit Margin?

A:

df['ProfitMargin'] = (df['Revenue'] - df['Cost']) / df['Revenue']

7. Scenario: Automating Excel/CSV to Power BI Pipeline

Q: How would you clean and transform data from Excel sheets and push to SQL?

A:

all_sheets = pd.read_excel('sales_data.xlsx', sheet_name=None)


combined = pd.concat(all_sheets.values(), ignore_index=True)

combined['Revenue'] = combined['Revenue'].fillna(0)

combined['Date'] = pd.to_datetime(combined['Date'])

from sqlalchemy import create_engine

engine = create_engine('mssql+pyodbc://server/db?driver=SQL+Server')

combined.to_sql('SalesCleaned', con=engine, if_exists='replace')

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy