EXL Data Analyst Interview Questions

The document contains a set of interview questions and SQL queries for a Data Analyst position at EXL, with a CTC of 12-14 LPA and 4-5 years of experience. It includes various SQL tasks such as retrieving top revenue-generating products, identifying products above average revenue, analyzing month-over-month spending trends, and marking first and last transactions for users. Additionally, it features Python coding challenges focused on list manipulation and string conversion without using built-in methods.


EXL DATA ANALYST INTERVIEW QUESTIONS

CTC- 12-14 LPA


YOE : 4-5

SQL
Question 1: Write a query to retrieve the top 3 revenue-generating products within each category.
Input Table: Products

ProductID  ProductName       Category     Revenue
101        Laptop Pro        Electronics  150000
102        Gaming PC         Electronics  180000
103        Smartphone        Electronics  120000
104        Tablet Air        Electronics  90000
105        Wireless Earbuds  Electronics  60000
201        Office Chair      Furniture    80000
202        Dining Table      Furniture    110000
203        Bookshelf         Furniture    50000
204        Sofa Set          Furniture    130000
301        T-Shirt           Apparel      30000
302        Jeans             Apparel      45000
303        Jacket            Apparel      70000
MySQL Query:

SELECT
    Category,
    ProductName,
    Revenue,
    ProductRank
FROM (
    SELECT
        Category,
        ProductName,
        Revenue,
        ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Revenue DESC) AS ProductRank
    FROM Products
) AS RankedProducts
WHERE ProductRank <= 3
ORDER BY Category, ProductRank;

Output Table:

Category     ProductName   Revenue  ProductRank
Apparel      Jacket        70000    1
Apparel      Jeans         45000    2
Apparel      T-Shirt       30000    3
Electronics  Gaming PC     180000   1
Electronics  Laptop Pro    150000   2
Electronics  Smartphone    120000   3
Furniture    Sofa Set      130000   1
Furniture    Dining Table  110000   2
Furniture    Office Chair  80000    3
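The top-N-per-group pattern is not SQL-specific; the same logic can be sketched in plain Python. This is an illustrative sketch (the function name and dict-based rows are assumptions, using a subset of the Products data above), not part of the interview answer:

```python
from collections import defaultdict

def top_n_per_group(rows, group_key, order_key, n=3):
    """Return the top-n rows per group, ranked by order_key descending."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    result = []
    for key in sorted(groups):  # one group per Category, like PARTITION BY
        ranked = sorted(groups[key], key=lambda r: r[order_key], reverse=True)
        for rank, row in enumerate(ranked[:n], start=1):  # like ROW_NUMBER()
            result.append({**row, "ProductRank": rank})
    return result

products = [
    {"ProductName": "Laptop Pro", "Category": "Electronics", "Revenue": 150000},
    {"ProductName": "Gaming PC",  "Category": "Electronics", "Revenue": 180000},
    {"ProductName": "Smartphone", "Category": "Electronics", "Revenue": 120000},
    {"ProductName": "Tablet Air", "Category": "Electronics", "Revenue": 90000},
    {"ProductName": "T-Shirt",    "Category": "Apparel",     "Revenue": 30000},
    {"ProductName": "Jeans",      "Category": "Apparel",     "Revenue": 45000},
    {"ProductName": "Jacket",     "Category": "Apparel",     "Revenue": 70000},
]

for row in top_n_per_group(products, "Category", "Revenue"):
    print(row["Category"], row["ProductName"], row["ProductRank"])
```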

Question 2: Identify products whose total revenue exceeds the overall average revenue across all products.
Input Table: Products

ProductID  ProductName   Revenue
101        Laptop Pro    150000
102        Gaming PC     180000
103        Smartphone    120000
104        Tablet Air    90000
201        Office Chair  80000
202        Dining Table  110000
301        T-Shirt       30000
302        Jeans         45000

MySQL Query:

SELECT
    ProductName,
    Revenue AS TotalProductRevenue
FROM Products
WHERE Revenue > (SELECT AVG(Revenue) FROM Products);

Output Table:

ProductName   TotalProductRevenue
Laptop Pro    150000
Gaming PC     180000
Smartphone    120000
Dining Table  110000

Note: The overall average revenue for the given Products table is 805000 / 8 = 100625.
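The same above-average filter can be verified outside SQL. A minimal pure-Python sketch (function name is illustrative), using the Products data above:

```python
def above_average(rows, value_key="Revenue"):
    """Return rows whose value exceeds the mean of all values, plus the mean."""
    avg = sum(r[value_key] for r in rows) / len(rows)  # mirrors SELECT AVG(Revenue)
    return [r for r in rows if r[value_key] > avg], avg

products = [
    {"ProductName": "Laptop Pro",   "Revenue": 150000},
    {"ProductName": "Gaming PC",    "Revenue": 180000},
    {"ProductName": "Smartphone",   "Revenue": 120000},
    {"ProductName": "Tablet Air",   "Revenue": 90000},
    {"ProductName": "Office Chair", "Revenue": 80000},
    {"ProductName": "Dining Table", "Revenue": 110000},
    {"ProductName": "T-Shirt",      "Revenue": 30000},
    {"ProductName": "Jeans",        "Revenue": 45000},
]

winners, avg = above_average(products)
print(f"Average revenue: {avg}")  # 100625.0
for r in winners:
    print(r["ProductName"], r["Revenue"])
```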

Question 3: Use LAG() and CASE to find customers with a month-over-month increase in spending.
Input Table: Transactions

TransactionID  CustomerID  TransactionDate  Amount
1              C1          2023-01-10       100
2              C1          2023-01-25       50
3              C2          2023-01-15       200
4              C1          2023-02-05       180
5              C2          2023-02-20       150
6              C3          2023-02-10       300
7              C1          2023-03-01       250
8              C2          2023-03-10       220
9              C3          2023-03-15       280

MySQL Query:

WITH MonthlySpending AS (
    SELECT
        CustomerID,
        DATE_FORMAT(TransactionDate, '%Y-%m') AS TransactionMonth,
        SUM(Amount) AS MonthlyTotalSpending
    FROM Transactions
    GROUP BY CustomerID, DATE_FORMAT(TransactionDate, '%Y-%m')
),
LaggedSpending AS (
    SELECT
        CustomerID,
        TransactionMonth,
        MonthlyTotalSpending,
        LAG(MonthlyTotalSpending, 1, 0) OVER (PARTITION BY CustomerID ORDER BY TransactionMonth) AS PreviousMonthSpending
    FROM MonthlySpending
)
SELECT
    CustomerID,
    TransactionMonth AS Month,
    MonthlyTotalSpending AS CurrentMonthSpending,
    PreviousMonthSpending,
    CASE
        WHEN MonthlyTotalSpending > PreviousMonthSpending AND PreviousMonthSpending > 0 THEN 'Increase'
        WHEN MonthlyTotalSpending < PreviousMonthSpending AND PreviousMonthSpending > 0 THEN 'Decrease'
        WHEN MonthlyTotalSpending = PreviousMonthSpending AND PreviousMonthSpending > 0 THEN 'No Change'
        ELSE 'New/First Month'
    END AS SpendingTrend
FROM LaggedSpending
WHERE MonthlyTotalSpending > PreviousMonthSpending AND PreviousMonthSpending > 0
ORDER BY CustomerID, TransactionMonth;

Output Table:

CustomerID  Month    CurrentMonthSpending  PreviousMonthSpending  SpendingTrend
C1          2023-02  180                   150                    Increase
C1          2023-03  250                   180                    Increase
C2          2023-03  220                   150                    Increase

Note that C1's January total is 100 + 50 = 150, so C1's February spending of 180 also qualifies as an increase.
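The LAG-based comparison can be mirrored in plain Python by sorting each customer's monthly totals and comparing consecutive months on record. This is an illustrative sketch (function and variable names are assumptions), using the Transactions data above:

```python
from collections import defaultdict

def monthly_increases(transactions):
    """Return (customer, month, current, previous) for month-over-month increases."""
    totals = defaultdict(float)
    for t in transactions:
        month = t["TransactionDate"][:7]  # 'YYYY-MM', like DATE_FORMAT(..., '%Y-%m')
        totals[(t["CustomerID"], month)] += t["Amount"]
    by_customer = defaultdict(list)
    for (cust, month), amount in totals.items():
        by_customer[cust].append((month, amount))
    increases = []
    for cust in sorted(by_customer):
        series = sorted(by_customer[cust])  # ORDER BY TransactionMonth per customer
        # zip pairs each month with the previous one on record, like LAG(...)
        for (prev_m, prev_amt), (cur_m, cur_amt) in zip(series, series[1:]):
            if cur_amt > prev_amt:
                increases.append((cust, cur_m, cur_amt, prev_amt))
    return increases

transactions = [
    {"CustomerID": "C1", "TransactionDate": "2023-01-10", "Amount": 100},
    {"CustomerID": "C1", "TransactionDate": "2023-01-25", "Amount": 50},
    {"CustomerID": "C2", "TransactionDate": "2023-01-15", "Amount": 200},
    {"CustomerID": "C1", "TransactionDate": "2023-02-05", "Amount": 180},
    {"CustomerID": "C2", "TransactionDate": "2023-02-20", "Amount": 150},
    {"CustomerID": "C3", "TransactionDate": "2023-02-10", "Amount": 300},
    {"CustomerID": "C1", "TransactionDate": "2023-03-01", "Amount": 250},
    {"CustomerID": "C2", "TransactionDate": "2023-03-10", "Amount": 220},
    {"CustomerID": "C3", "TransactionDate": "2023-03-15", "Amount": 280},
]

for row in monthly_increases(transactions):
    print(row)
```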

Question 4: Write a query to mark the first and last transaction for every user.
Input Table: UserTransactions

TransactionID  UserID  TransactionDate  Amount
101            U1      2023-01-05       50
102            U2      2023-01-08       120
103            U1      2023-01-15       75
104            U3      2023-01-20       200
105            U2      2023-01-22       90
106            U1      2023-02-01       100
107            U3      2023-02-10       150
108            U2      2023-02-15       110
109            U1      2023-02-25       80

MySQL Query:

SELECT
    TransactionID,
    UserID,
    TransactionDate,
    Amount,
    CASE WHEN rn_asc = 1 THEN 'Yes' ELSE 'No' END AS IsFirstTransaction,
    CASE WHEN rn_desc = 1 THEN 'Yes' ELSE 'No' END AS IsLastTransaction
FROM (
    SELECT
        TransactionID,
        UserID,
        TransactionDate,
        Amount,
        ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY TransactionDate ASC) AS rn_asc,
        ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY TransactionDate DESC) AS rn_desc
    FROM UserTransactions
) AS RankedTransactions
ORDER BY UserID, TransactionDate;

Output Table:

TransactionID  UserID  TransactionDate  Amount  IsFirstTransaction  IsLastTransaction
101            U1      2023-01-05       50      Yes                 No
103            U1      2023-01-15       75      No                  No
106            U1      2023-02-01       100     No                  No
109            U1      2023-02-25       80      No                  Yes
102            U2      2023-01-08       120     Yes                 No
105            U2      2023-01-22       90      No                  No
108            U2      2023-02-15       110     No                  Yes
104            U3      2023-01-20       200     Yes                 No
107            U3      2023-02-10       150     No                  Yes
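The first/last marking can be sketched in plain Python by sorting each user's rows by date and flagging the endpoints. An illustrative sketch (names are assumptions), using a subset of the UserTransactions data:

```python
from collections import defaultdict

def mark_first_last(transactions):
    """Flag each user's earliest and latest transactions by date."""
    by_user = defaultdict(list)
    for t in transactions:
        by_user[t["UserID"]].append(t)
    marked = []
    for user in sorted(by_user):
        rows = sorted(by_user[user], key=lambda r: r["TransactionDate"])
        for i, row in enumerate(rows):
            marked.append({**row,
                           "IsFirstTransaction": "Yes" if i == 0 else "No",
                           "IsLastTransaction": "Yes" if i == len(rows) - 1 else "No"})
    return marked

user_transactions = [
    {"TransactionID": 101, "UserID": "U1", "TransactionDate": "2023-01-05", "Amount": 50},
    {"TransactionID": 103, "UserID": "U1", "TransactionDate": "2023-01-15", "Amount": 75},
    {"TransactionID": 109, "UserID": "U1", "TransactionDate": "2023-02-25", "Amount": 80},
    {"TransactionID": 102, "UserID": "U2", "TransactionDate": "2023-01-08", "Amount": 120},
    {"TransactionID": 108, "UserID": "U2", "TransactionDate": "2023-02-15", "Amount": 110},
]

for row in mark_first_last(user_transactions):
    print(row["TransactionID"], row["UserID"],
          row["IsFirstTransaction"], row["IsLastTransaction"])
```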

Question 5: Find employees who share the same manager and also earn the same salary.
Input Table: Employees

EmployeeID  EmployeeName  ManagerID  Salary
1           Alice         10         60000
2           Bob           10         70000
3           Charlie       11         65000
4           David         10         60000
5           Eve           11         65000
6           Frank         12         80000
7           Grace         10         70000
8           Heidi         11         60000

MySQL Query:

SELECT
    e1.EmployeeName AS Employee1,
    e2.EmployeeName AS Employee2,
    e1.ManagerID,
    e1.Salary
FROM Employees e1
JOIN Employees e2
    ON e1.ManagerID = e2.ManagerID
    AND e1.Salary = e2.Salary
    AND e1.EmployeeID < e2.EmployeeID  -- avoids self-joins and duplicate pairs
ORDER BY e1.ManagerID, e1.Salary, e1.EmployeeName;

Output Table:

Employee1  Employee2  ManagerID  Salary
Alice      David      10         60000
Bob        Grace      10         70000
Charlie    Eve        11         65000
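The self-join with `e1.EmployeeID < e2.EmployeeID` is just "each unordered pair once", which `itertools.combinations` gives directly. An illustrative pure-Python sketch (function name is an assumption), using the Employees data above:

```python
from itertools import combinations

def matching_pairs(employees):
    """Pairs of employees sharing both ManagerID and Salary, each pair listed once."""
    # Sorting by EmployeeID mirrors the e1.EmployeeID < e2.EmployeeID condition
    ordered = sorted(employees, key=lambda e: e["EmployeeID"])
    return [(e1["EmployeeName"], e2["EmployeeName"], e1["ManagerID"], e1["Salary"])
            for e1, e2 in combinations(ordered, 2)
            if e1["ManagerID"] == e2["ManagerID"] and e1["Salary"] == e2["Salary"]]

employees = [
    {"EmployeeID": 1, "EmployeeName": "Alice",   "ManagerID": 10, "Salary": 60000},
    {"EmployeeID": 2, "EmployeeName": "Bob",     "ManagerID": 10, "Salary": 70000},
    {"EmployeeID": 3, "EmployeeName": "Charlie", "ManagerID": 11, "Salary": 65000},
    {"EmployeeID": 4, "EmployeeName": "David",   "ManagerID": 10, "Salary": 60000},
    {"EmployeeID": 5, "EmployeeName": "Eve",     "ManagerID": 11, "Salary": 65000},
    {"EmployeeID": 6, "EmployeeName": "Frank",   "ManagerID": 12, "Salary": 80000},
    {"EmployeeID": 7, "EmployeeName": "Grace",   "ManagerID": 10, "Salary": 70000},
    {"EmployeeID": 8, "EmployeeName": "Heidi",   "ManagerID": 11, "Salary": 60000},
]

for pair in matching_pairs(employees):
    print(pair)
```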

PYTHON

Question 1: Reverse a list without using built-in methods like reverse() or slicing.
Approach: We can reverse a list in-place using a two-pointer approach. We'll use two
pointers, left starting at the beginning of the list (index 0) and right starting at the end of the
list (index len(list) - 1). We iterate as long as left is less than right. In each iteration, we swap
the elements at the left and right indices, then increment left and decrement right. This
way, we effectively swap elements from the outer ends inwards until the middle is reached.

Python Code:

def reverse_list_manual(input_list):
    """
    Reverses a list in-place without using built-in reverse() or slicing.

    Args:
        input_list: The list to be reversed.
    """
    left = 0
    right = len(input_list) - 1
    while left < right:
        # Swap elements at left and right pointers
        input_list[left], input_list[right] = input_list[right], input_list[left]
        # Move pointers towards the center
        left += 1
        right -= 1
    return input_list  # Return the modified list for demonstration

# Example Usage:
my_list = [1, 2, 3, 4, 5, 6]
print(f"Original list: {my_list}")
reversed_list = reverse_list_manual(my_list)
print(f"Reversed list: {reversed_list}")

my_string_list = ['a', 'b', 'c', 'd']
print(f"Original string list: {my_string_list}")
reversed_string_list = reverse_list_manual(my_string_list)
print(f"Reversed string list: {reversed_string_list}")

empty_list = []
print(f"Original empty list: {empty_list}")
reversed_empty_list = reverse_list_manual(empty_list)
print(f"Reversed empty list: {reversed_empty_list}")

single_element_list = [10]
print(f"Original single element list: {single_element_list}")
reversed_single_element_list = reverse_list_manual(single_element_list)
print(f"Reversed single element list: {reversed_single_element_list}")

Output:

Original list: [1, 2, 3, 4, 5, 6]
Reversed list: [6, 5, 4, 3, 2, 1]
Original string list: ['a', 'b', 'c', 'd']
Reversed string list: ['d', 'c', 'b', 'a']
Original empty list: []
Reversed empty list: []
Original single element list: [10]
Reversed single element list: [10]


Question 2: Convert the string "abc123xyz" to "ABC123XYZ"
using a loop only (no .upper()).
Approach: Characters in Python (and most programming languages) have corresponding
ASCII (or Unicode) values. Lowercase letters ('a' through 'z') have specific ASCII values, and
their uppercase counterparts ('A' through 'Z') are exactly 32 less than their lowercase
equivalents (e.g., ord('a') is 97, ord('A') is 65). We can iterate through each character in the
input string. For each character, we check if it falls within the ASCII range of lowercase
letters. If it does, we convert it to its uppercase equivalent by subtracting 32 from its ASCII
value and then converting it back to a character using chr(). If it's not a lowercase letter, we
keep it as is. We then concatenate all processed characters to form the new string.

Python Code:

def to_uppercase_manual(input_string):
    """
    Converts a string to uppercase using a loop and ASCII values,
    without using the .upper() method.

    Args:
        input_string: The string to convert.

    Returns:
        The uppercase version of the string.
    """
    result_string = ""
    for char in input_string:
        # Check if the character is a lowercase letter (ASCII 'a' to 'z')
        if 'a' <= char <= 'z':
            # Convert to ASCII value, subtract 32, and convert back to character
            uppercase_char = chr(ord(char) - 32)
            result_string += uppercase_char
        else:
            # If not a lowercase letter, keep it as is
            result_string += char
    return result_string

# Example Usage:
s1 = "abc123xyz"
print(f"Original string: '{s1}'")
converted_s1 = to_uppercase_manual(s1)
print(f"Converted string: '{converted_s1}'")

s2 = "Hello World! 456"
print(f"Original string: '{s2}'")
converted_s2 = to_uppercase_manual(s2)
print(f"Converted string: '{converted_s2}'")

s3 = ""
print(f"Original string: '{s3}'")
converted_s3 = to_uppercase_manual(s3)
print(f"Converted string: '{converted_s3}'")

Output:

Original string: 'abc123xyz'
Converted string: 'ABC123XYZ'
Original string: 'Hello World! 456'
Converted string: 'HELLO WORLD! 456'
Original string: ''
Converted string: ''

Question 3: From a dictionary, print all keys associated with even-numbered values.
Approach: We iterate through the key-value pairs of the dictionary using the .items()
method. For each pair, we check two conditions:

1. Is the value an integer? This is important to avoid errors if values are strings, floats,
etc.

2. Is the integer value even? We can check this using the modulo operator (%). If value
% 2 == 0, it's an even number. If both conditions are true, we print the corresponding
key.

Python Code:

def print_keys_with_even_values(input_dict):
    """
    Prints all keys from a dictionary that are associated with even-numbered values.

    Args:
        input_dict: The dictionary to process.
    """
    print("Keys with even-numbered values:")
    found_even_value_key = False
    for key, value in input_dict.items():
        # Ensure the value is an integer before checking for evenness
        if isinstance(value, int) and value % 2 == 0:
            print(key)
            found_even_value_key = True
    if not found_even_value_key:
        print("No keys found with even-numbered values.")

# Example Usage:
my_dict1 = {
    'apple': 1,
    'banana': 2,
    'cherry': 3,
    'date': 4,
    'elderberry': 5,
    'fig': 6
}
print(f"Dictionary 1: {my_dict1}")
print_keys_with_even_values(my_dict1)
print("-" * 30)

my_dict2 = {
    'itemA': 100,
    'itemB': 201,
    'itemC': 300,
    'itemD': 403,
    'itemE': 500
}
print(f"Dictionary 2: {my_dict2}")
print_keys_with_even_values(my_dict2)
print("-" * 30)

my_dict3 = {
    'name': 'Alice',
    'age': 30,
    'score': 95,
    'year': 2024
}
print(f"Dictionary 3 (mixed types): {my_dict3}")
print_keys_with_even_values(my_dict3)
print("-" * 30)

my_dict4 = {
    'one': 1,
    'three': 3,
    'five': 'not_a_number'
}
print(f"Dictionary 4 (no even numbers): {my_dict4}")
print_keys_with_even_values(my_dict4)
print("-" * 30)

empty_dict = {}
print(f"Empty dictionary: {empty_dict}")
print_keys_with_even_values(empty_dict)
print("-" * 30)
Output:

Dictionary 1: {'apple': 1, 'banana': 2, 'cherry': 3, 'date': 4, 'elderberry': 5, 'fig': 6}

Keys with even-numbered values:

banana

date

fig

------------------------------

Dictionary 2: {'itemA': 100, 'itemB': 201, 'itemC': 300, 'itemD': 403, 'itemE': 500}

Keys with even-numbered values:

itemA

itemC

itemE

------------------------------

Dictionary 3 (mixed types): {'name': 'Alice', 'age': 30, 'score': 95, 'year': 2024}

Keys with even-numbered values:

age

year

------------------------------

Dictionary 4 (no even numbers): {'one': 1, 'three': 3, 'five': 'not_a_number'}

Keys with even-numbered values:

No keys found with even-numbered values.

------------------------------

Empty dictionary: {}

Keys with even-numbered values:

No keys found with even-numbered values.

------------------------------
Question 4: Define a function to check if two strings are
anagrams of each other.
Approach: Two strings are anagrams if they contain the same characters with the same
frequencies, regardless of order. A robust way to check this is to:

1. Normalize Strings: Convert both strings to lowercase and remove any non-
alphabetic characters (like spaces, punctuation) to ensure a fair comparison.

2. Frequency Counting: Create a frequency dictionary (or counter) for each
normalized string. This dictionary will store each character as a key and its count as
the value.

3. Compare Frequencies: If the two frequency dictionaries are identical, then the
strings are anagrams.

Python Code:

from collections import Counter

def are_anagrams(str1, str2):
    """
    Checks if two strings are anagrams of each other.
    Ignores case and non-alphabetic characters.

    Args:
        str1: The first string.
        str2: The second string.

    Returns:
        True if the strings are anagrams, False otherwise.
    """
    # Normalize strings: convert to lowercase and filter out non-alphabetic characters
    def normalize_string(s):
        return ''.join(sorted(c for c in s.lower() if c.isalpha()))

    normalized_str1 = normalize_string(str1)
    normalized_str2 = normalize_string(str2)

    # Anagrams must have the same length after normalization
    if len(normalized_str1) != len(normalized_str2):
        return False

    # Option 1: Compare the sorted, normalized strings directly (simpler, but
    # potentially less efficient for very long strings)
    # return normalized_str1 == normalized_str2

    # Option 2: Use frequency counters (more explicit for "same characters with
    # same frequencies")
    return Counter(normalized_str1) == Counter(normalized_str2)

# Example Usage:
print(f"'listen' and 'silent': {are_anagrams('listen', 'silent')}")
print(f"'Debit card' and 'Bad credit': {are_anagrams('Debit card', 'Bad credit')}")
print(f"'anagram' and 'nagaram': {are_anagrams('anagram', 'nagaram')}")
print(f"'hello' and 'world': {are_anagrams('hello', 'world')}")
print(f"'A gentleman' and 'Elegant man': {are_anagrams('A gentleman', 'Elegant man')}")
print(f"'rail safety' and 'fairy tales': {are_anagrams('rail safety', 'fairy tales')}")
print(f"'test' and 'tess': {are_anagrams('test', 'tess')}")
print(f"'' and '': {are_anagrams('', '')}")
print(f"'a' and 'A': {are_anagrams('a', 'A')}")
print(f"'restful' and 'fluster': {are_anagrams('restful', 'fluster')}")

Output:

'listen' and 'silent': True

'Debit card' and 'Bad credit': True

'anagram' and 'nagaram': True

'hello' and 'world': False

'A gentleman' and 'Elegant man': True

'rail safety' and 'fairy tales': True

'test' and 'tess': False

'' and '': True

'a' and 'A': True

'restful' and 'fluster': True

Question 5: Build a frequency dictionary to count the occurrence of each character in a string.
Approach: We initialize an empty dictionary, which will store character counts. Then, we
iterate through each character in the input string. For every character encountered:

• If the character is already a key in our dictionary, we increment its corresponding
value (count) by 1.

• If the character is not yet a key, we add it to the dictionary with a value (count) of 1.

Python's collections.Counter provides a highly optimized way to do this, but the question
implies a manual implementation.

Python Code:

def build_frequency_dictionary(input_string):
    """
    Builds a frequency dictionary to count the occurrence of each character in a string.

    Args:
        input_string: The string to analyze.

    Returns:
        A dictionary where keys are characters and values are their frequencies.
    """
    frequency_dict = {}
    for char in input_string:
        if char in frequency_dict:
            frequency_dict[char] += 1
        else:
            frequency_dict[char] = 1
    return frequency_dict

# Example Usage:
s1 = "hello world"
freq_dict1 = build_frequency_dictionary(s1)
print(f"String: '{s1}'")
print(f"Frequency Dictionary: {freq_dict1}")
print("-" * 30)

s2 = "programming"
freq_dict2 = build_frequency_dictionary(s2)
print(f"String: '{s2}'")
print(f"Frequency Dictionary: {freq_dict2}")
print("-" * 30)

s3 = "Python is fun!"
freq_dict3 = build_frequency_dictionary(s3)
print(f"String: '{s3}'")
print(f"Frequency Dictionary: {freq_dict3}")
print("-" * 30)

s4 = ""
freq_dict4 = build_frequency_dictionary(s4)
print(f"String: '{s4}'")
print(f"Frequency Dictionary: {freq_dict4}")
print("-" * 30)

s5 = "aaaaabbbccc"
freq_dict5 = build_frequency_dictionary(s5)
print(f"String: '{s5}'")
print(f"Frequency Dictionary: {freq_dict5}")
print("-" * 30)

Output:

String: 'hello world'
Frequency Dictionary: {'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}
------------------------------
String: 'programming'
Frequency Dictionary: {'p': 1, 'r': 2, 'o': 1, 'g': 2, 'a': 1, 'm': 2, 'i': 1, 'n': 1}
------------------------------
String: 'Python is fun!'
Frequency Dictionary: {'P': 1, 'y': 1, 't': 1, 'h': 1, 'o': 1, 'n': 2, ' ': 2, 'i': 1, 's': 1, 'f': 1, 'u': 1, '!': 1}
------------------------------
String: ''
Frequency Dictionary: {}
------------------------------
String: 'aaaaabbbccc'
Frequency Dictionary: {'a': 5, 'b': 3, 'c': 3}

MS EXCEL
Question 1: Explain the difference between COUNT, COUNTA,
and COUNTBLANK with practical examples.
These three functions are used to count cells in a range, but they differ in what types of
cells they count.

• COUNT: Counts the number of cells in a range that contain numbers. It ignores
blank cells, text, error values, and boolean values.

• COUNTA: Counts the number of cells in a range that are not empty. It counts
numbers, text, logical values (TRUE/FALSE), and error values. It only ignores truly
empty cells.

• COUNTBLANK: Counts the number of empty cells in a range. A cell containing an
empty string ("") returned by a formula is also treated as blank by COUNTBLANK.

Practical Example:

Let's assume you have the following data in cells A1:A7:

Cell  Content
A1    100
A2    Hello
A3    (empty cell)
A4    50
A5    TRUE
A6    #N/A
A7    (empty cell)

Formulas and Outputs:

• COUNT:
  o Formula: =COUNT(A1:A7)
  o Output: 2 (counts A1 and A4, which contain numbers)

• COUNTA:
  o Formula: =COUNTA(A1:A7)
  o Output: 5 (counts A1, A2, A4, A5, A6 – all non-empty cells)

• COUNTBLANK:
  o Formula: =COUNTBLANK(A1:A7)
  o Output: 2 (counts A3 and A7, which are both empty)
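The three counting rules can be expressed outside Excel too. A rough Python sketch (an analogy, not Excel's exact semantics: here None stands in for an empty cell and the string "#N/A" stands in for an error value):

```python
def count_numbers(cells):
    """Like COUNT: numeric cells only (Excel's COUNT skips booleans in ranges)."""
    return sum(isinstance(c, (int, float)) and not isinstance(c, bool) for c in cells)

def count_a(cells):
    """Like COUNTA: every non-empty cell (numbers, text, booleans, errors)."""
    return sum(c is not None for c in cells)

def count_blank(cells):
    """Like COUNTBLANK: empty cells (None models a blank cell here)."""
    return sum(c is None for c in cells)

# A1:A7 from the example; None models the empty cells A3 and A7
cells = [100, "Hello", None, 50, True, "#N/A", None]
print(count_numbers(cells), count_a(cells), count_blank(cells))  # 2 5 2
```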

Question 2: How do you use INDEX + MATCH to look up a value to the left of the reference column?
VLOOKUP can only look up values to the right of the lookup column. INDEX and MATCH
together provide a powerful and flexible alternative that can look up in any direction,
including to the left.
• MATCH(lookup_value, lookup_array, [match_type]): This function finds the
position (row number or column number) of a lookup_value within a lookup_array.

• INDEX(array, row_num, [column_num]): This function returns the value at the
intersection of a specified row_num and column_num within a given array.

By nesting MATCH inside INDEX, MATCH determines the row (or column) number, and
INDEX then retrieves the value from that position in a specified column (or row).

Practical Example:

Assume you have the following data in cells A1:C5:

EmployeeID  EmployeeName  Department
101         Alice         HR
102         Bob           Sales
103         Charlie       IT
104         David         Marketing

You want to find the EmployeeName (column B) given an EmployeeID (column A). Here,
EmployeeName is to the right of EmployeeID, but if you wanted to find EmployeeID given
EmployeeName, it would be a left lookup. Let's demonstrate finding EmployeeName
(Column B) for a given EmployeeID (Column A) which is a simple INDEX+MATCH scenario,
and then how to do a "left" lookup for EmployeeID based on Department.

Scenario 1: Find Employee Name (Column B) for Employee ID (Column A) - Standard INDEX/MATCH

• Lookup Value: 103 (in cell E1)

• Desired Output: Charlie

Formula:

=INDEX(B:B, MATCH(E1, A:A, 0))

• Explanation:
o MATCH(E1, A:A, 0): Looks for 103 in column A and returns its position (e.g., 4
if 103 is in A4).

o INDEX(B:B, 4): Retrieves the value from the 4th row of column B.

• Output: Charlie

Scenario 2: Find Employee ID (Column A) for Department (Column C) - "Left" Lookup

• Lookup Value: IT (in cell E2)

• Desired Output: 103

Formula:

=INDEX(A:A, MATCH(E2, C:C, 0))

• Explanation:

o MATCH(E2, C:C, 0): Looks for IT in column C and returns its position (e.g., 4 if
IT is in C4).

o INDEX(A:A, 4): Retrieves the value from the 4th row of column A.

• Output: 103

This demonstrates that INDEX and MATCH can retrieve values from any column within
the INDEX range, based on a lookup in any other column within the MATCH range,
regardless of their relative positions.
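The INDEX + MATCH mechanics map cleanly onto two list operations: find a position in one column, then read that position from another column. An illustrative Python sketch (the function name and column lists are assumptions based on the table above):

```python
def index_match(lookup_value, lookup_col, return_col):
    """INDEX+MATCH analogy: find the row in one column, return it from another."""
    position = lookup_col.index(lookup_value)  # MATCH(lookup_value, lookup_col, 0)
    return return_col[position]                # INDEX(return_col, position)

employee_ids = [101, 102, 103, 104]
names = ["Alice", "Bob", "Charlie", "David"]
departments = ["HR", "Sales", "IT", "Marketing"]

print(index_match(103, employee_ids, names))         # Charlie (rightward lookup)
print(index_match("IT", departments, employee_ids))  # 103 ("left" lookup)
```

Because the return column is chosen independently of the lookup column, direction simply never enters into it, which is exactly why INDEX + MATCH can look left where VLOOKUP cannot.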

Question 3: Demonstrate how IFERROR is used in complex nested formulas.
The IFERROR function allows you to gracefully handle errors that might occur within a
formula. Instead of displaying a standard error message like #N/A, #DIV/0!, or #VALUE!,
you can specify a custom value or message to display.

• IFERROR(value, value_if_error):

o value: The formula or expression that you want to check for an error.

o value_if_error: The value to return if the value argument evaluates to an error.

Practical Example:
Let's combine IFERROR with VLOOKUP. Assume you have a list of Product IDs and Prices,
and you want to look up a product. If the product ID is not found, VLOOKUP would return
#N/A!. We can use IFERROR to display a more user-friendly message.

Data in A1:B4 (Product Data):

ProductID  Price
P001       150
P002       200
P003       120

Scenario:

• In cell D1, you enter P002. In cell E1, you want to find its price.

• In cell D2, you enter P005 (which does not exist). In cell E2, you want to find its price.

Formulas and Outputs:

• Formula in E1 (for D1 = 'P002'):
  =IFERROR(VLOOKUP(D1, A1:B4, 2, FALSE), "Product Not Found")
  o Explanation: VLOOKUP(D1, A1:B4, 2, FALSE) successfully finds 'P002' and
    returns 200. IFERROR sees no error and returns 200.
  o Output: 200

• Formula in E2 (for D2 = 'P005'):
  =IFERROR(VLOOKUP(D2, A1:B4, 2, FALSE), "Product Not Found")
  o Explanation: VLOOKUP(D2, A1:B4, 2, FALSE) tries to find 'P005', but it's not
    in the list, so it returns #N/A. IFERROR detects this error and returns the
    value_if_error, which is "Product Not Found".
  o Output: Product Not Found

IFERROR makes your spreadsheets more robust and user-friendly by preventing unsightly
error messages from appearing.
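The IFERROR pattern is essentially try/except around a lookup. A hypothetical Python sketch of the same fallback behavior (names are illustrative, using the product table above):

```python
def iferror(compute, value_if_error):
    """IFERROR analogy: evaluate compute(); return a fallback if it raises."""
    try:
        return compute()
    except (KeyError, ZeroDivisionError, ValueError):
        return value_if_error

prices = {"P001": 150, "P002": 200, "P003": 120}  # stands in for the VLOOKUP table

print(iferror(lambda: prices["P002"], "Product Not Found"))  # 200
print(iferror(lambda: prices["P005"], "Product Not Found"))  # Product Not Found
```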
Question 4: Create a formula to highlight duplicate values
excluding their first instance.
This task is typically done using Excel's Conditional Formatting with a custom formula. The
idea is to apply a format (like a fill color) to cells that are duplicates but are not the first
occurrence of that value in the list.

Approach: We will use the COUNTIF function combined with a specific range reference
that expands as the formula is applied down the column.

• COUNTIF(range, criteria): Counts the number of cells within a range that meet the
specified criteria.

Practical Example:

Assume you have a list of names in column A, starting from A1.

Cell  Content
A1    Apple
A2    Banana
A3    Apple
A4    Cherry
A5    Banana
A6    Apple

Steps for Conditional Formatting:

1. Select the range you want to apply the formatting to (e.g., A1:A6).

2. Go to the Home tab, click on Conditional Formatting, then New Rule...

3. Select "Use a formula to determine which cells to format".

4. Enter the following formula in the "Format values where this formula is true:" box:

5. =COUNTIF($A$1:A1,A1)>1

o Note on the COUNTIF range:

  ▪ $A$1: The starting point is absolute, so it always refers to the top of
    your selection.

  ▪ A1: The ending point is relative to the current cell in the selection.
    When Excel applies this formula to A2, it becomes
    COUNTIF($A$1:A2,A2). For A3, it's COUNTIF($A$1:A3,A3), and so on.

o Explanation:

  ▪ For cell A1 (Apple): COUNTIF($A$1:A1,A1) is 1. 1 > 1 is FALSE, so A1 is
    not highlighted.

  ▪ For cell A2 (Banana): COUNTIF($A$1:A2,A2) is 1. 1 > 1 is FALSE, so A2
    is not highlighted.

  ▪ For cell A3 (Apple): COUNTIF($A$1:A3,A3) is 2 (because 'Apple'
    appears twice in A1:A3). 2 > 1 is TRUE, so A3 is highlighted.

  ▪ For cell A5 (Banana): COUNTIF($A$1:A5,A5) is 2. 2 > 1 is TRUE, so A5
    is highlighted.

  ▪ For cell A6 (Apple): COUNTIF($A$1:A6,A6) is 3. 3 > 1 is TRUE, so A6 is
    highlighted.

6. Click the Format... button to choose your desired formatting (e.g., a light red fill).

7. Click OK twice.

Output:

Cells A3, A5, and A6 would be highlighted, as they are duplicate occurrences of 'Apple' and
'Banana' after their first instance.
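The expanding-range COUNTIF trick amounts to "have I seen this value before?". A short Python sketch of the same flagging logic (function name is an assumption, using the column A values above):

```python
def flag_duplicates(values):
    """True where a value repeats an earlier occurrence; first instance stays False."""
    seen = set()
    flags = []
    for v in values:
        flags.append(v in seen)  # mirrors COUNTIF($A$1:A1, A1) > 1
        seen.add(v)
    return flags

names = ["Apple", "Banana", "Apple", "Cherry", "Banana", "Apple"]
print(flag_duplicates(names))  # [False, False, True, False, True, True]
```

The positions flagged True correspond to cells A3, A5, and A6 in the example.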

Question 5: Use SUMIFS to calculate total sales where Region = "West" and Month = "Jan".
The SUMIFS function is used to sum values in a range that meet multiple criteria. It's much
more powerful than SUMIF which only handles a single criterion.

• SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...):

o sum_range: The range of cells to sum.

o criteria_range1: The range of cells that contains the first criterion.

o criteria1: The criterion that defines which cells in criteria_range1 will be
summed.

o [criteria_range2, criteria2]: Optional additional ranges and their criteria.

Practical Example:

Assume you have the following sales data in cells A1:C7 (headers in row 1):

Region  Month  Sales
North   Jan    100
West    Jan    150
East    Feb    200
West    Jan    120
South   Mar    80
West    Feb    180

Formula:

=SUMIFS(C:C, A:A, "West", B:B, "Jan")

• Explanation:

o C:C: This is the sum_range. Excel will sum values from column C.

o A:A: This is criteria_range1 (the Region column).

o "West": This is criteria1. Excel will only consider rows where the value in
column A is "West".

o B:B: This is criteria_range2 (the Month column).

o "Jan": This is criteria2. Excel will only consider rows where the value in
column B is "Jan".

The function will look for rows where BOTH Region is "West" AND Month is "Jan", and then
sum the corresponding "Sales" values.

Output:
• In the example data, the rows that match both criteria are:

o Row 3: West, Jan, 150

o Row 5: West, Jan, 120

• The SUMIFS function will add 150 + 120.

• Output: 270
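The multi-criteria sum is just a filtered aggregation. An illustrative Python sketch (the `sumifs` helper is an assumption, mirroring the AND-combined criteria of the formula above):

```python
def sumifs(rows, sum_key, **criteria):
    """Sum sum_key over rows matching every keyword criterion (AND semantics)."""
    return sum(r[sum_key] for r in rows
               if all(r[k] == v for k, v in criteria.items()))

sales = [
    {"Region": "North", "Month": "Jan", "Sales": 100},
    {"Region": "West",  "Month": "Jan", "Sales": 150},
    {"Region": "East",  "Month": "Feb", "Sales": 200},
    {"Region": "West",  "Month": "Jan", "Sales": 120},
    {"Region": "South", "Month": "Mar", "Sales": 80},
    {"Region": "West",  "Month": "Feb", "Sales": 180},
]

print(sumifs(sales, "Sales", Region="West", Month="Jan"))  # 270
print(sumifs(sales, "Sales", Region="West"))               # 450 (single criterion, like SUMIF)
```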

POWER BI

Question 1: What's the key difference between a calculated column and a measure in Power BI?
The distinction between calculated columns and measures is fundamental to Power BI and
DAX (Data Analysis Expressions).

Calculated Column:

• Definition: A new column added to an existing table in your data model. Its values
are computed row by row based on a DAX expression.

• Calculation Time: Values are calculated and stored in the data model at the time of
data refresh.

• Storage: Consumes memory and increases the size of your data model, as the
calculated values are physically stored.

• Context: Calculated in row context. This means the calculation for each row is
independent of filters applied to the report; it only considers the values in that
specific row.

• Usage: Can be used like any other column in your tables – for filtering, slicing,
grouping, or as part of other calculations. They are suitable for segmenting data or
adding new attributes to rows.

• Example Use Case: Creating a FullName column by concatenating FirstName and
LastName, or a ProfitMargin column, ([Sales] - [Cost]) / [Sales], on a transaction
table.

  o FullName = [FirstName] & " " & [LastName]

  o OrderYear = YEAR([OrderDate])

Measure:

• Definition: A dynamic calculation that aggregates data based on the current filter
context in a report. Measures are not stored physically in the data model.

• Calculation Time: Values are computed on-the-fly when they are used in a visual
(e.g., a table, chart, or card) and respond to interactions (like slicers or cross-
filtering).

• Storage: Does not consume significant memory, as only the DAX expression is
stored, not the results of the calculation.

• Context: Primarily evaluated in filter context. This means the calculation considers
all filters applied to the visual, page, or report (e.g., selections from slicers,
interactions with other visuals).

• Usage: Used for aggregation and performing calculations across multiple rows,
such as sums, averages, counts, ratios, or complex business logic. They are the
backbone of analytical reports.

• Example Use Case: Calculating Total Sales, Average Order Value, or Total Profit.

o Total Sales = SUM(Sales[Amount])

o Average Order Value = AVERAGEX(Sales, Sales[Amount])

Key Differences Summary:

Feature      Calculated Column                   Measure
Calculation  Per row, stored in model            On-the-fly, aggregated based on filter context
Storage      Consumes memory (physical storage)  Minimal memory (only expression stored)
Context      Row context                         Filter context (primarily)
Usage        Slicing, filtering, grouping,       Aggregations, KPIs, complex
             new attributes                      calculations
Output       Adds a new column to a table        Returns a single aggregated value

When to Use Which:

• Use a Calculated Column when you need a new column that adds characteristics
to individual rows, which can then be used for filtering, grouping, or as input for
other calculations.

• Use a Measure when you need to perform aggregations or calculations that change
based on user interactions, filters, or the visual's context. Measures are for
analytical results, not for adding new dimensions to your data.
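The row-context vs. filter-context distinction can be illustrated outside Power BI. A hypothetical Python analogy (all names are illustrative): the "calculated column" is computed once per row and stored with the data, while the "measure" is a function re-evaluated against whatever rows the current filter leaves:

```python
# "Calculated column": computed per row once and stored (row context)
sales = [
    {"Product": "A", "Amount": 100, "Cost": 60},
    {"Product": "B", "Amount": 200, "Cost": 150},
    {"Product": "A", "Amount": 50,  "Cost": 20},
]
for row in sales:
    row["Profit"] = row["Amount"] - row["Cost"]  # new stored attribute per row

# "Measure": aggregated on demand over the rows the current filter keeps
def total_sales(rows, **filters):
    kept = [r for r in rows if all(r[k] == v for k, v in filters.items())]
    return sum(r["Amount"] for r in kept)

print(total_sales(sales))               # 350 (no filter)
print(total_sales(sales, Product="A"))  # 150 (filter context: Product = "A")
```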

Question 2: How do you manage many-to-many relationships in your data model?
A many-to-many relationship occurs when multiple records in one table can relate to
multiple records in another table. For example, a student can enroll in multiple courses,
and a course can have multiple students. Power BI's direct many-to-many relationships
can sometimes lead to ambiguous filter propagation and incorrect results.

The recommended and most robust way to manage many-to-many relationships in Power
BI is by using a bridge table (also known as a "junction table" or "factless fact table").

Approach with a Bridge Table:

1. Identify the two "many" tables: These are the tables that have a many-to-many
relationship (e.g., Students and Courses).

2. Create a Bridge Table: This table acts as an intermediary. It typically contains only
the unique keys from the two "many" tables, representing each unique combination
that connects them.

o Example: For Students and Courses, the bridge table would be Enrollment,
containing StudentID and CourseID as foreign keys.

3. Establish One-to-Many Relationships:

o Create a one-to-many relationship from the first "many" table (e.g., Students) to the bridge table (e.g., Enrollment) based on their common key (StudentID).

o Create another one-to-many relationship from the second "many" table (e.g., Courses) to the bridge table (e.g., Enrollment) based on their common key (CourseID).
o Direction: The filter direction should typically be from the "one" side to the
"many" side (i.e., from Students to Enrollment, and from Courses to
Enrollment).

4. Use the Bridge Table for Filtering and Aggregation: When you want to filter one
"many" table based on values from the other, or to aggregate data that spans both,
the filter context flows through the bridge table.

Example Scenario (Students and Courses):

• Table 1: Students

o StudentID (PK)

o StudentName

• Table 2: Courses

o CourseID (PK)

o CourseName

• Bridge Table: Enrollment (Created manually or imported if it exists in source)

o EnrollmentID (PK, optional)

o StudentID (FK to Students)

o CourseID (FK to Courses)

o EnrollmentDate (optional, can be a factless fact table)

Relationships in Power BI:

• Students[StudentID] (1) --- (*) Enrollment[StudentID] (Single cross-filter direction)

• Courses[CourseID] (1) --- (*) Enrollment[CourseID] (Single cross-filter direction)

Why this works: When you select a StudentName from the Students table, the filter
propagates to the Enrollment table, showing only the courses that student is enrolled in.
Similarly, when you select a CourseName from the Courses table, the filter propagates to
the Enrollment table, showing only the students enrolled in that course. Measures can then
correctly aggregate data based on these filters without ambiguity.

This bridge table pattern ensures that relationships are explicit and filter propagation is
clear, leading to accurate calculations and analyses.
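With the two one-to-many relationships in place, measures defined against the bridge table aggregate correctly from either side. A sketch using the example table names above:

```dax
-- Distinct students in the current filter context. Selecting a course
-- filters Enrollment through the Courses -> Enrollment relationship,
-- so this returns the number of students enrolled in that course.
Enrolled Students = DISTINCTCOUNT ( Enrollment[StudentID] )

-- Symmetrically, the number of courses for the currently filtered student(s).
Courses Taken = DISTINCTCOUNT ( Enrollment[CourseID] )
```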
Question 3: Write a DAX measure to compute the cumulative sales
(running total) per customer.
To calculate the cumulative sales (running total) per customer, we need to sum sales up to
the current date for each customer independently. This involves using CALCULATE to
modify the filter context and FILTER with ALL to consider all dates up to the current one for
each customer.

Assume you have a Sales table with [CustomerKey], [OrderDate], and [SalesAmount], and
a DimDate table marked as a date table, connected to Sales[OrderDate].

DAX Measure:

Cumulative Sales Per Customer =
CALCULATE (
    SUM ( Sales[SalesAmount] ),                  -- 1. Base measure: sum of SalesAmount
    FILTER (                                     -- 2. Modify the date filter context
        ALL ( DimDate[Date] ),                   --    a. Clear all date filters from the date table
        DimDate[Date] <= MAX ( DimDate[Date] )   --    b. Keep only dates up to the latest visible date
    ),
    ALLEXCEPT (                                  -- 3. Preserve the customer context while
        Sales,                                   --    clearing filters on all other Sales
        Sales[CustomerKey]                       --    columns except CustomerKey
    )
)
Explanation of the DAX Formula:

1. SUM(Sales[SalesAmount]): This is the basic aggregation. It sums up the sales amount from the Sales table.
2. FILTER(ALL(DimDate[Date]), DimDate[Date] <= MAX(DimDate[Date])): This is the core of the running-total logic.

o ALL(DimDate[Date]): Removes any existing date filters from the DimDate[Date] column. This is crucial because we want to consider all dates, not just those currently in context (e.g., a specific month or day in a visual).

o DimDate[Date] <= MAX(DimDate[Date]): Re-applies a filter. MAX(DimDate[Date]) returns the latest date currently visible in the filter context of the current row/cell of the visual. The FILTER expression then includes all dates from the DimDate table that are less than or equal to this MAX date, creating the "running" part of the total.

3. ALLEXCEPT(Sales, Sales[CustomerKey]): This is key for "per customer" cumulative sales.

o It clears all filters from the Sales table except those on the Sales[CustomerKey] column. This ensures that when the measure is evaluated for Customer A, only Customer A's sales are considered, while filters on other Sales columns (like ProductID or Region) that might otherwise constrain the running total are removed. It keeps the CustomerKey context intact while allowing the date FILTER to work across the entire date range for that customer.

How it works in a Visual:

When you place CustomerKey and OrderDate (e.g., by month) in a table visual along with
this measure, for each customer and each month, the measure will calculate the sum of
sales for that customer from their first sale up to the end of that specific month.
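The same measure can also be written with a variable, which some find easier to read and debug. This sketch uses the same table names and relies on CALCULATE's boolean filter syntax, which implicitly wraps the predicate in FILTER(ALL(DimDate[Date]), ...):

```dax
Cumulative Sales Per Customer (VAR) =
VAR CurrentMaxDate = MAX ( DimDate[Date] )   -- latest date visible in this cell
RETURN
    CALCULATE (
        SUM ( Sales[SalesAmount] ),
        DimDate[Date] <= CurrentMaxDate,     -- all dates up to that date
        ALLEXCEPT ( Sales, Sales[CustomerKey] )
    )
```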

Question 4: Describe how to apply dynamic filters using slicers effectively.
Slicers are interactive visual components in Power BI reports that allow users to filter data
on the report canvas. They provide a user-friendly way to apply dynamic filters, enabling
users to explore data subsets without needing to interact directly with report filters or filter
panes.

How Slicers Work:


1. Selection: When a user selects one or more items in a slicer (e.g., a specific Region
from a Region slicer), that selection creates a filter context.

2. Propagation: This filter context is then propagated throughout the data model,
influencing all visuals on the report page that are connected to the slicer's
underlying data.

3. Cross-Filtering: Visuals connected to the filtered data update dynamically to reflect the selected criteria.

Steps to Apply Dynamic Filters Using Slicers:

1. Add a Slicer Visual: From the Visualizations pane, select the "Slicer" icon.

2. Drag a Field to the Slicer: Drag the field you want to filter by (e.g., Region, Year,
Product Category) from the Fields pane to the "Field" well of the slicer visual.

3. Configure Slicer Settings (Format Pane):

o Selection:

▪ Multi-select: Allow users to select multiple items (Ctrl+Click or checkbox mode).

▪ Single Select: Restrict selection to one item at a time (radio buttons).

▪ "Select All" option: Provides a convenient way to clear all filters.

o Orientation: Vertical (default list) or Horizontal (tiles/buttons).

o Search Box: Enable for large lists to quickly find items.

o Header/Items formatting: Adjust font, color, size for aesthetics.

o Responsiveness: Ensure slicers adapt well to different screen sizes.

Managing Interactions Between Multiple Slicers:

When you have multiple slicers on a page, they interact with each other and with other
visuals. Power BI allows you to control these interactions:

1. Default Behavior: By default, all slicers on a page will cross-filter each other and all
other visuals on that page.

2. Edit Interactions:

o Select a slicer (or any visual) on your report page.


o Go to the Format tab in the ribbon.

o Click on "Edit interactions".

o Small icons will appear on all other visuals and slicers:

▪ Filter Icon: (Default) The selected slicer will filter the other visual.

▪ Highlight Icon: The selected slicer will highlight (rather than filter)
data in the other visual (e.g., showing a subset of bars in a bar chart
without removing others).

▪ None Icon: The selected slicer will have no effect on the other visual.

o Click the desired icon on each other visual/slicer to customize the interaction.

3. Slicer Sync (Sync Slicers Pane):

o To apply a slicer's selection across multiple report pages, use the Sync
Slicers pane (View tab > Show panes > Sync slicers).

o When you select a slicer, the Sync Slicers pane shows which pages it's visible
on and which pages it's synced to.

o You can set a slicer to:

▪ Be visible on specific pages.

▪ Sync its selections across specific pages (meaning if you filter on Page
1, the same filter is applied to Page 2).

o This is highly effective for maintaining a consistent filtering experience throughout a multi-page report.

Effective Use of Slicers:

• Placement: Place slicers prominently where users can easily find and interact with
them (e.g., top, left sidebar).

• Logical Grouping: Group related slicers together.

• Hierarchy Slicers: Use hierarchical slicers (e.g., Year > Quarter > Month) for drill-
down filtering.

• Clear Labels: Ensure slicer headers are clear and indicate what they filter.

• "Select All" Option: Always consider including this for convenience.


• Reset Button: Consider adding a button with a bookmark to reset all slicers to their
default state.

Question 5: How can you design a KPI visual to compare current vs. previous month’s sales?
A KPI (Key Performance Indicator) visual in Power BI is designed to display a key metric, its
target, and how it's performing over time. To compare current vs. previous month's sales
using a KPI visual, you'll need three main DAX measures: one for total sales (the base
value), one for current month's sales (the indicator), and one for previous month's sales
(the target/goal).

Assume you have a Sales table with a [SalesAmount] column and a DimDate table marked
as a date table, connected to your Sales table on a date column.

Step 1: Create a Base Measure for Total Sales

This is a simple aggregation of your sales amount.

Total Sales = SUM(Sales[SalesAmount])

Step 2: Create a Measure for Current Month's Sales (Indicator)

This measure will show the sales for the latest month selected (or the current month if no
filter is applied).

Current Month Sales =

CALCULATE (

[Total Sales], -- Use the base Total Sales measure

LASTNONBLANK ( -- Find the last non-blank date in the current filter context

DimDate[Date],

[Total Sales] -- Check for sales on that date

),

ALL ( DimDate ) -- Clear all filters from the DimDate table

)
Refined Current Month Sales (more robust if filtering to single month):

A simpler way to ensure "current month" is truly the selected month:

Current Month Sales =
VAR MaxDateSelected = MAX ( DimDate[Date] )
RETURN
    CALCULATE (
        [Total Sales],
        FILTER (
            ALL ( DimDate[Date] ),
            YEAR ( DimDate[Date] ) = YEAR ( MaxDateSelected )
                && MONTH ( DimDate[Date] ) = MONTH ( MaxDateSelected )
        )
    )

Step 3: Create a Measure for Previous Month's Sales (Target/Goal)

This measure calculates the sales for the month immediately preceding the current month.

Previous Month Sales =
CALCULATE (
    [Total Sales],             -- Use the base Total Sales measure
    DATEADD (                  -- Shift the date context
        LASTNONBLANK (         -- Find the last non-blank date of the current context
            DimDate[Date],
            [Total Sales]
        ),
        -1,                    -- Move back 1 unit
        MONTH                  -- Unit is Month
    )
)

Alternative Previous Month Sales (if Current Month Sales uses VAR MaxDateSelected):

Previous Month Sales =
VAR MaxDateSelected = MAX ( DimDate[Date] )
VAR PrevMonthStart = EOMONTH ( MaxDateSelected, -2 ) + 1   -- first day of previous month
VAR PrevMonthEnd = EOMONTH ( MaxDateSelected, -1 )         -- last day of previous month
RETURN
    CALCULATE (
        [Total Sales],
        FILTER (
            ALL ( DimDate[Date] ),
            DimDate[Date] >= PrevMonthStart && DimDate[Date] <= PrevMonthEnd
        )
    )

(The DATEADD version is generally more robust, as it uses DAX's built-in time-intelligence functions.)

Step 4: Design the KPI Visual

1. Add a KPI Visual: From the Visualizations pane, select the "KPI" icon.

2. Assign Fields:

o Indicator: Drag the Current Month Sales measure to the "Indicator" field
well.

o Trend axis: Drag the Date hierarchy (e.g., Year, Month) from your DimDate
table to the "Trend axis" field well. This allows the visual to show the trend of
sales over time.

o Target goals: Drag the Previous Month Sales measure to the "Target goals"
field well.
3. Format the KPI Visual (Format Pane):

o Indicator: Customize font size, color.

o Trend axis: Adjust colors.

o Goal: Enable "Show goals" and customize goal labels and colors.

o Color-coding: Power BI automatically applies color-coding (e.g., green for good, red for bad) based on whether the indicator is above or below the target. You can reverse this if a lower value is better.

o Target goal formatting: Choose whether a higher or lower value than the goal counts as "good" performance.

o Display Units: Format the display units for sales amounts (e.g., 'K' for
thousands).

By following these steps, your KPI visual will effectively compare the current month's sales
against the previous month's sales, providing a clear visual representation of performance.
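A natural companion to this KPI is a month-over-month change measure for a card or tooltip, building on the two measures above:

```dax
-- Relative change vs. the previous month. DIVIDE returns BLANK()
-- instead of an error when Previous Month Sales is zero or blank.
MoM Sales Change % =
DIVIDE (
    [Current Month Sales] - [Previous Month Sales],
    [Previous Month Sales]
)
```

Format this measure as a percentage in the model view so it displays cleanly in visuals.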
