Flipkart Business Analyst Interview Questions
SQL
1. What are window functions, and how do they differ from aggregate
functions? Can you give a use case?
Explanation:
• Window Functions perform calculations across a set of table rows that are related
to the current row, without collapsing the rows into a single summary result.
• Aggregate Functions, on the other hand, return a single result for a group of rows,
reducing the number of rows in the result set.
Use Case:
If you want to calculate a running total or rank without losing the row-level granularity,
window functions are useful.
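The contrast can be sketched in Python with the standard-library sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the release that added window functions). The sample rows and the RunningTotal alias are made up for illustration.

```python
import sqlite3

# In-memory database with a few hypothetical sales rows (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (SalesID INT, SalesPerson TEXT, SaleAmount INT, SaleDate TEXT)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", 100, "2024-01-01"),
        (2, "Asha", 50, "2024-01-02"),
        (3, "Ravi", 70, "2024-01-01"),
    ],
)

# Aggregate: GROUP BY collapses the rows to one per salesperson.
aggregate_rows = conn.execute(
    "SELECT SalesPerson, SUM(SaleAmount) FROM Sales "
    "GROUP BY SalesPerson ORDER BY SalesPerson"
).fetchall()

# Window: every sale is kept, with a running total next to it.
window_rows = conn.execute(
    "SELECT SalesPerson, SaleDate, SaleAmount, "
    "SUM(SaleAmount) OVER (PARTITION BY SalesPerson ORDER BY SaleDate) AS RunningTotal "
    "FROM Sales ORDER BY SalesPerson, SaleDate"
).fetchall()

print(aggregate_rows)  # two rows, one per salesperson
print(window_rows)     # three rows, one per sale
conn.close()
```

The aggregate query returns fewer rows than the table holds, while the window query returns exactly as many rows as the table.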
Example Query:
Use Case: Calculate the running total of sales for each salesperson.
Schema:
CREATE TABLE Sales (
    SalesID INT,
    SalesPerson VARCHAR(50),
    SaleAmount INT,
    SaleDate DATE
);
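Query:
SELECT
    SalesPerson,
    SaleAmount,
    SaleDate,
    -- Running total per salesperson, ordered by date (RunningTotal is an assumed alias)
    SUM(SaleAmount) OVER (PARTITION BY SalesPerson ORDER BY SaleDate) AS RunningTotal
FROM Sales;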
Result:
• An index improves the speed of data retrieval operations by creating a structure (like
a B-tree) for faster lookups.
• Downside of Indexing:
o They slow down write operations (INSERT, UPDATE, DELETE) because the
index needs to be updated.
• Small Tables: Full table scans are often faster than index lookups.
• Frequent Updates: When the table has frequent write operations, maintaining
indexes adds overhead.
4. Regular Monitoring: Use tools like EXPLAIN or Query Analyzer to ensure indexes are
effective.
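One way to see the effect of an index is to compare the query plan before and after creating it. The sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN. The Orders table, its columns, and the index name idx_orders_customer are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only to demonstrate the plan change.
conn.execute("CREATE TABLE Orders (OrderID INT, CustomerID INT, Amount INT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(i, i % 100, i * 10) for i in range(1000)],
)


def query_plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


lookup = "SELECT * FROM Orders WHERE CustomerID = 42"

plan_before = query_plan(lookup)  # no index yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON Orders (CustomerID)")
plan_after = query_plan(lookup)   # the planner can now search the index

print("Before index:", plan_before)
print("After index: ", plan_after)
```

The same check applies in reverse: if the plan still shows a scan after the index exists, the index is not helping that query and only adds write overhead.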
Example:
Schema:
CREATE TABLE Purchases (
    PurchaseID INT,
    CustomerID INT,
    PurchaseDate DATE
);
Query:
WITH RecentPurchases AS (
SELECT CustomerID
FROM Purchases
),
PreviousPurchases AS (
SELECT CustomerID
FROM Purchases
)
SELECT DISTINCT CustomerID
FROM RecentPurchases
Schema:
CREATE TABLE Transactions (
    TransactionID INT,
    ProductName VARCHAR(50),
    Category VARCHAR(50),
    Quantity INT
);
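Query:
WITH RankedProducts AS (
    SELECT
        Category,
        ProductName,
        Quantity,
        -- Assumed: the ranking expression is missing from the source; RANK() is one common choice
        RANK() OVER (PARTITION BY Category ORDER BY Quantity DESC) AS QuantityRank
    FROM Transactions
)
SELECT
    Category,
    ProductName,
    Quantity
FROM RankedProducts
ORDER BY Category, QuantityRank;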
Result:
Category     ProductName   Quantity
Category1    ProductC      20
Category1    ProductA      10
Category1    ProductB      5
Category2    ProductF      25
Category2    ProductD      15
Category2    ProductE      10
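A per-category ranking that yields this ordering can be sketched with Python's sqlite3 module (SQLite 3.25+ assumed). The rows are taken from the result above. The TransactionID values, the QuantityRank alias, and the choice of RANK() are assumptions, since the ranking expression is not shown in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Transactions "
    "(TransactionID INT, ProductName TEXT, Category TEXT, Quantity INT)"
)
# Rows mirror the result table; TransactionID values are made up.
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?, ?, ?)",
    [
        (1, "ProductA", "Category1", 10),
        (2, "ProductB", "Category1", 5),
        (3, "ProductC", "Category1", 20),
        (4, "ProductD", "Category2", 15),
        (5, "ProductE", "Category2", 10),
        (6, "ProductF", "Category2", 25),
    ],
)

# Rank products inside each category by quantity, highest first.
ranked = conn.execute(
    """
    SELECT Category, ProductName, Quantity,
           RANK() OVER (PARTITION BY Category ORDER BY Quantity DESC) AS QuantityRank
    FROM Transactions
    ORDER BY Category, QuantityRank
    """
).fetchall()

for row in ranked:
    print(row)
conn.close()
```

Filtering on QuantityRank in an outer query (for example, keeping ranks up to some N) would turn this into a top-N-per-category query.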
5. How would you identify duplicate records in a large dataset, and how
would you remove only the duplicates, retaining the first occurrence?
Example Query:
Schema:
CREATE TABLE Employees (
    EmployeeID INT,
    Name VARCHAR(50),
    Department VARCHAR(50)
);
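-- Identify duplicates (assumed reconstruction; only the FROM clause survives in the source)
SELECT Name, Department, COUNT(*) AS DuplicateCount
FROM Employees
GROUP BY Name, Department
HAVING COUNT(*) > 1;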
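-- Remove duplicates, keeping the row with the lowest EmployeeID in each group
WITH CTE AS (
    SELECT
        EmployeeID,
        Name,
        Department,
        ROW_NUMBER() OVER (PARTITION BY Name, Department ORDER BY EmployeeID) AS RowNum
    FROM Employees
)
DELETE FROM Employees
WHERE EmployeeID IN (
    SELECT EmployeeID
    FROM CTE
    WHERE RowNum > 1
);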
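The identify-then-delete approach can be tried end to end with Python's sqlite3 module (SQLite 3.25+ assumed). The employee rows are hypothetical, and "first occurrence" is taken here to mean the lowest EmployeeID within each (Name, Department) group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INT, Name TEXT, Department TEXT)")
# Hypothetical rows: Alice/HR and Bob/IT each appear twice.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        (1, "Alice", "HR"),
        (2, "Bob", "IT"),
        (3, "Alice", "HR"),
        (4, "Bob", "IT"),
        (5, "Cara", "IT"),
    ],
)

# Step 1: identify groups that occur more than once.
duplicates = conn.execute(
    "SELECT Name, Department, COUNT(*) FROM Employees "
    "GROUP BY Name, Department HAVING COUNT(*) > 1 ORDER BY Name"
).fetchall()

# Step 2: number the rows in each group and delete everything after the first.
conn.execute(
    """
    WITH Ranked AS (
        SELECT EmployeeID,
               ROW_NUMBER() OVER (
                   PARTITION BY Name, Department ORDER BY EmployeeID
               ) AS RowNum
        FROM Employees
    )
    DELETE FROM Employees
    WHERE EmployeeID IN (SELECT EmployeeID FROM Ranked WHERE RowNum > 1)
    """
)

remaining = conn.execute(
    "SELECT EmployeeID, Name, Department FROM Employees ORDER BY EmployeeID"
).fetchall()
print(duplicates)
print(remaining)
conn.close()
```

On a large table it is usually safer to run the SELECT from step 1 first and review the counts before running the DELETE.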
PYTHON
1. Write a Python function to find the longest consecutive sequence of
unique numbers in a list.
Explanation:
The problem is to find the longest contiguous subarray in which every element is unique.
This can be solved using a sliding window technique:
1. Use a set to track the elements currently inside the window.
2. Use two pointers (start and end) to expand and contract the window as needed.
3. Update the maximum length of the subarray whenever the current window is longer than the best seen so far.
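Code:
def longest_consecutive_sequence(nums):
    """Return (length, subarray) for the longest run of contiguous unique values."""
    if not nums:
        return 0, []
    unique_set = set()  # values currently inside the window
    start = 0
    max_length = 0
    longest_seq = []
    for end in range(len(nums)):
        # Shrink the window from the left until nums[end] is no longer a duplicate
        while nums[end] in unique_set:
            unique_set.remove(nums[start])
            start += 1
        unique_set.add(nums[end])
        current_length = end - start + 1
        if current_length > max_length:
            max_length = current_length
            longest_seq = nums[start:end + 1]
    return max_length, longest_seq

# Example usage
nums = [1, 2, 3, 1, 4, 5, 6, 2, 7, 8]
length, sequence = longest_consecutive_sequence(nums)
print("Longest Length:", length)
print("Longest Sequence:", sequence)
Example Output:
Longest Length: 8
Longest Sequence: [3, 1, 4, 5, 6, 2, 7, 8]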
1. pandas:
o Provides powerful tools like fillna(), dropna(), and isnull() to handle missing
data effectively.
2. numpy:
3. scikit-learn:
4. pyjanitor (optional):
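As a minimal sketch of the pandas calls named above, the snippet below runs isnull(), dropna(), and fillna() on a small dataset. The dataset is hypothetical; only the column names Name, Age, and Salary follow the examples in this section.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with one missing value in each column.
df = pd.DataFrame(
    {
        "Name": ["Asha", None, "Ravi", "Meera"],
        "Age": [25.0, 30.0, np.nan, 35.0],
        "Salary": [50000.0, np.nan, 60000.0, 70000.0],
    }
)

# Count missing values per column.
missing_counts = df.isnull().sum()

# Option 1: drop every row that contains at least one missing value.
df_dropped = df.dropna()

# Option 2: fill each column with a column-specific replacement.
df_filled = df.fillna(
    {
        "Name": "Unknown",
        "Age": df["Age"].mean(),
        "Salary": df["Salary"].median(),
    }
)

print(missing_counts)
print(df_dropped)
print(df_filled)
```

Dropping rows is simple but discards data, while filling keeps every row at the cost of introducing estimated values, so the choice depends on how much data is missing and how the column is used downstream.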
Examples:
Example Dataset:
import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer
data = {
df = pd.DataFrame(data)
df_dropped = df.dropna()
df_filled = df.fillna({
"Name": "Unknown",
"Age": df["Age"].mean(),
"Salary": df["Salary"].median()
})
imputer = SimpleImputer(strategy="mean")
Example Output:
Original Dataset:
Guesstimates
1. Estimate the number of online food delivery orders in a large
metropolitan city over a month:
• Assume population:
o Large metropolitan city population ≈ 10 million.
• Estimate orders:
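The arithmetic behind such a top-down estimate can be sketched as follows. Only the 10 million population figure comes from the text. The share of residents who order online (20%) and the orders per user per month (4) are placeholder assumptions to be replaced with your own reasoning in an interview.

```python
def estimate_monthly_orders(population, ordering_share_pct, orders_per_user):
    """Top-down estimate: population -> ordering users -> monthly orders."""
    ordering_users = population * ordering_share_pct // 100
    return ordering_users * orders_per_user


population = 10_000_000  # from the stated assumption above

# Both inputs below are hypothetical placeholders, not sourced figures.
monthly_orders = estimate_monthly_orders(
    population,
    ordering_share_pct=20,  # assumed share of residents who order online
    orders_per_user=4,      # assumed orders per ordering user per month
)

print(f"Estimated monthly orders: {monthly_orders:,}")
```

Keeping each assumption as a named parameter makes it easy to show the interviewer how the estimate moves when one input changes.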
• Breakdown of issues:
Case Studies
1. A sudden decrease in conversion rate is observed in a popular product
category. How would you investigate the cause and propose solutions?
• Data Analysis:
• Operational Factors:
• Technical Investigation:
• Solutions:
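For the data-analysis step, one possible starting point is to compute the conversion rate per segment for the periods before and after the drop and see where the change is concentrated. The sketch below uses entirely hypothetical segment names and (sessions, orders) counts.

```python
# Hypothetical (sessions, orders) per segment; not real data.
previous_period = {
    "mobile_app": (10_000, 500),
    "mobile_web": (8_000, 320),
    "desktop": (5_000, 250),
}
current_period = {
    "mobile_app": (10_000, 490),
    "mobile_web": (8_000, 160),
    "desktop": (5_000, 245),
}


def conversion_rate(sessions, orders):
    """Orders divided by sessions, or 0.0 when there were no sessions."""
    return orders / sessions if sessions else 0.0


def rate_changes(previous, current):
    """Change in conversion rate (current minus previous) for each segment."""
    return {
        segment: conversion_rate(*current[segment]) - conversion_rate(*previous[segment])
        for segment in previous
    }


changes = rate_changes(previous_period, current_period)
worst_segment = min(changes, key=changes.get)  # most negative change

for segment, change in changes.items():
    print(f"{segment}: {change:+.2%}")
print("Largest drop:", worst_segment)
```

If one segment accounts for most of the decline, the technical and operational checks can focus there first. If the drop is uniform, a category-wide cause such as pricing or stock availability becomes more likely.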
• Revenue Impact:
• CLV Analysis:
• Implementation Feasibility:
Managerial Questions
1. Describe a time when you faced conflicting priorities on a project. How
did you manage your workload to meet deadlines?
Managing conflicting priorities on a project:
• Time management:
• Communicate effectively:
• Promote collaboration:
• Escalate if needed: