GE02 (DAVP) VIVA Questionnaire
[Figure: a 4x4 array holding the integers 1 to 16, repeated for each sub-part up to (f); the cell highlighting is not recoverable.]
33. Create a random integer 1D array of 8 elements and split it into 4 equal-sized partitions. Use these partitions to
create two 2x2 arrays, namely a1 and a2. Perform floor division between a1 and a2 and store the result in
a3. Scale up the values of a2 by a power of 2 and store the result in a4. Find the correlation between
arrays a3 and a4.
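One possible reading of question 33 can be sketched as follows. The seed, the value range 1 to 9 (chosen so that floor division never divides by zero), and the interpretation of "scale up by a power of 2" as squaring are all assumptions, not part of the question.

```python
import numpy as np

# Assumption: a fixed seed, only so the run is repeatable.
rng = np.random.default_rng(seed=0)

# Assumption: values drawn from 1..9 so that a2 never contains zero.
arr = rng.integers(1, 10, size=8)

# Split the 8 elements into 4 partitions of 2 elements each.
parts = np.split(arr, 4)

# Stack two partitions per array to obtain two 2x2 arrays.
a1 = np.array(parts[:2])
a2 = np.array(parts[2:])

# Element-wise floor division.
a3 = a1 // a2

# Assumption: "scale up by a power of 2" is read as raising to the power 2.
a4 = a2 ** 2

# Correlation matrix of the flattened arrays; the off-diagonal entry is the
# correlation between a3 and a4. It is NaN if either array is constant, so
# the resulting floating-point warning is silenced here.
with np.errstate(invalid="ignore", divide="ignore"):
    corr = np.corrcoef(a3.ravel(), a4.ravel())

print(a1, a2, a3, a4, corr, sep="\n")
```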
Questions based on Pandas library
10. If a Series is an immutable object, how can we still perform data manipulation operations on it? Justify.
11. Once a Series object has been created, how can we add elements to it or remove elements from it?
12. Differentiate between reindexing and renaming the indexes of a Series object.
13. To change the size of a Series object, which of the two concepts (reindexing or renaming of
indexes) can be used?
14. What do you mean by the forward fill and backward fill methods? Where do we use them?
15. What will be the output of the following code snippet: (forward filling & backward filling)
16. What will be the output of the following statements: (Arithmetic operations between different-sized Series
objects)
17. Is it possible to sort a Series object based on its indexes? If yes, then explain how.
18. What do you mean by ranking a series object? What are the various methods available through which we
can rank a series object?
19. What will be the output of the following statements:
20. State the method of Pandas library that can be used to:
a. Obtain only unique values present in a Series.
b. Obtain the frequency of each distinct element of a Series.
c. Inspect the membership of a given value in a Series.
22. What will the following statements generate? (Aliasing vs. copying)
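The Series questions above can be explored with a small sketch like the one below. The sample data and variable names are hypothetical, since the original snippets are not reproduced in this questionnaire.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Reindexing conforms the data to a new label set and can change the size;
# labels without an existing value ("d") become NaN.
reindexed = s.reindex(["a", "b", "c", "d"])

# Renaming only relabels existing entries; the size is unchanged.
renamed = s.rename({"a": "x"})

# Forward fill copies the last valid value forward, backward fill copies
# the next valid value backward.
gappy = pd.Series([1.0, None, None, 4.0])
ffilled = gappy.ffill()
bfilled = gappy.bfill()

# Arithmetic between different-sized Series aligns on index labels;
# labels present in only one operand give NaN.
t = pd.Series([1, 2], index=["a", "b"])
total = s + t

# Unique values, frequency of each value, and membership test.
uniques = s.unique()
counts = pd.Series(["p", "q", "p"]).value_counts()
member = s.isin([20])

# Aliasing vs. copying: an alias is the same object, a copy is independent.
alias = s
independent = s.copy()
alias["a"] = 99  # also visible through s, but not through independent

print(reindexed, renamed, ffilled, bfilled, total, sep="\n")
```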
35. State the functions used to read from and write to csv files using dataframe objects of Pandas library.
36. Briefly explain the purpose of the following parameters used in the to_csv() method:
a. sep
b. index
c. header
d. columns
e. na_rep
37. Briefly explain the purpose of the following parameters used in the read_csv() method:
a. sep
b. index_col
c. header
d. names
e. skiprows
38. Write a Python program that reads n rows from a csv file into a dataframe.
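For questions 35 to 38, a minimal round-trip sketch is given below. It writes to an in-memory buffer instead of a real file so that it runs anywhere; the sample dataframe, the ";" separator and the value n = 2 are illustrative assumptions.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 4], "B": [0.5, np.nan, 2.5, 3.5], "C": list("wxyz")}
)

# Write: to_csv() with the parameters listed in question 36.
buf = io.StringIO()
df.to_csv(
    buf,
    sep=";",             # field separator
    index=False,         # do not write the row labels
    header=True,         # write the column names as the first line
    columns=["A", "B"],  # write only these columns
    na_rep="NA",         # text used for missing values
)
text = buf.getvalue()

# Read only the first n rows (question 38); here n = 2.
n = 2
first_n = pd.read_csv(io.StringIO(text), sep=";", nrows=n)

# Read: read_csv() with the parameters listed in question 37.
renamed = pd.read_csv(
    io.StringIO(text),
    sep=";",
    skiprows=1,        # skip the header line that was written above
    header=None,       # the remaining lines carry no header
    names=["a", "b"],  # supply column names explicitly
    index_col="a",     # use column "a" as the row index
)

print(text)
print(first_n)
print(renamed)
```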
Questions on handling missing values, duplicates & outliers; mapping and binning
39. State the methods of Pandas library that can be used to check whether a series or dataframe contains missing
values or not.
40. Differentiate between isnull() & notnull() methods.
41. What are the various ways to deal with the missing values present in a series or dataframe?
42. How can we delete only those rows of a dataframe which contain only null values?
43. Is it possible to delete only those rows of a dataframe which do not contain at least a minimum number of
non-null values? If yes, explain how.
44. Is it possible to delete those columns from a dataframe which contain any null values?
45. In how many ways can we specify a fill value to replace the missing values of a dataframe?
46. What do you mean by forward filling & backward filling in context of treating missing values in a
dataframe?
47. How to check for duplicate rows present in a dataframe?
48. How to get rid of duplicate rows present in a dataframe?
49. Differentiate between df.duplicated() and df.drop_duplicates().
50. Differentiate between df.drop_duplicates() and df.drop_duplicates(["A","B"]), provided df has four
columns, namely A, B, C and D.
51. Differentiate between df.drop_duplicates() and df.drop_duplicates(keep="last").
52. What will be the output of the following statements:
a. print(s1.replace([-999,-1000,-100],0))
b. print(s1.replace([-999,-1000,-100],[9,3,2]))
c. print(s1.replace({-999:0,-1000:1,-100:2}))
53. How does the map() method work? State its one use-case scenario.
54. What do you mean by binning? State one use-case scenario of binning.
55. Differentiate between equal-width and equal-depth binning.
56. Differentiate between cut() and qcut() methods of pandas library.
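The questions in this section can be tried out with a sketch along the following lines. The dataframe, the Series s1 and the binning data are hypothetical examples, because the original data for question 52 is not shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": [1.0, np.nan, 1.0, np.nan],
        "B": [2.0, np.nan, 2.0, 5.0],
        "C": [3.0, np.nan, 3.0, 6.0],
    }
)

# Detecting missing values.
has_missing = df.isnull().any().any()

# Dropping rows and columns.
no_empty_rows = df.dropna(how="all")           # drop rows that are entirely null
at_least_three = df.dropna(thresh=3)           # keep rows with >= 3 non-null values
complete_cols = no_empty_rows.dropna(axis=1)   # drop columns containing any null

# Filling: a scalar, a per-column dict, or forward/backward fill.
filled_scalar = df.fillna(0)
filled_per_col = df.fillna({"A": 0, "B": -1})
filled_forward = df.ffill()

# Duplicates: duplicated() returns a boolean mask, drop_duplicates() a frame.
dup_mask = df.duplicated()
keep_last = df.drop_duplicates(keep="last")
by_ab = df.drop_duplicates(["A", "B"])         # compare only columns A and B

# replace(): many-to-one, list-to-list, and dict forms.
s1 = pd.Series([1, -999, 2, -1000, -100])      # hypothetical sentinel values
to_zero = s1.replace([-999, -1000, -100], 0)
to_each = s1.replace({-999: 0, -1000: 1, -100: 2})

# map(): element-wise translation through a dict (or function).
fruit = pd.Series(["a", "b"]).map({"a": "apple", "b": "banana"})

# cut() makes equal-width bins; qcut() makes equal-depth (quantile) bins.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 100])
width_counts = pd.cut(values, bins=2).value_counts(sort=False)
depth_counts = pd.qcut(values, q=2).value_counts(sort=False)

print(width_counts, depth_counts, sep="\n")
```

With the skewed sample used here, the equal-width bins hold very different numbers of values, while the equal-depth bins hold the same number.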
Questions on merging of data frames & Hierarchical indexing in Series & dataframes
70. What do you mean by the split-apply-combine paradigm for Series & dataframe objects?
71. What will be the output of the following statements for the given dataframe:
grouped_df = df1.groupby(df1["key1"])
a. print(grouped_df["data1"].agg(['sum','mean']))
b. print(grouped_df["data1"].agg([('SUM','sum'),('MEAN','mean')]))
c. print(grouped_df.agg({'data1':['count','sum'], 'data2':'mean'}))
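The dataframe for question 71 is not reproduced here, so the sketch below uses a hypothetical df1 with the column names that the statements refer to (key1, data1, data2). It only illustrates the shape of each result.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe given in the question.
df1 = pd.DataFrame(
    {
        "key1": ["a", "a", "b", "b", "a"],
        "data1": [1, 2, 3, 4, 5],
        "data2": [10.0, 20.0, 30.0, 40.0, 50.0],
    }
)

# Split: one group per distinct value of key1.
grouped_df = df1.groupby(df1["key1"])

# a. A list of function names gives one result column per function.
basic = grouped_df["data1"].agg(["sum", "mean"])

# b. (name, function) tuples give the same results under custom column names.
named = grouped_df["data1"].agg([("SUM", "sum"), ("MEAN", "mean")])

# c. A dict applies different functions to different columns; the result
#    has two-level column labels such as ("data1", "count").
mixed = grouped_df.agg({"data1": ["count", "sum"], "data2": "mean"})

print(basic, named, mixed, sep="\n\n")
```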
Questions based on Matplotlib
8. While plotting a line graph, explain the different parameters of the plot() method of an Axes (subplot) object.
9. What are the various ways to specify a color to the color parameter?
10. What are the various values of the linestyle parameter?
11. What are the various values of the linewidth parameter?
12. What are the various values of the marker parameter?
13. What are the various values of the markersize parameter?
14. How can we add a legend to a graph?
15. In the statement splots[0][0].plot(np.random.rand(10), "r--^", linewidth=2, markersize=4), what does "r--^" indicate?
16. What do you mean by Annotations? How can you add annotations to a graph?
17. What is the purpose of adding shapes to a graph? How can we do so?
18. What is the method that can be used to export a graph from a Jupyter notebook to a PDF file?
19. Which methods are available in Matplotlib and seaborn through which we can plot bar charts for
Series/DataFrame objects? (df.plot.bar(), df.plot.barh(), sns.barplot())
20. Differentiate between histogram and density plot.
21. What is KDE?
22. Which methods are available in Matplotlib and seaborn through which we can plot histogram, density and
KDE (kernel density estimation) curves for Series/DataFrame objects?
(s1.plot.hist(), df.plot.hist(), df.plot.density(), df.plot.kde(), sns.histplot())
23. What is the purpose of scatter plot? How does a pairplot relate to a scatterplot?
24. Which methods are available in Matplotlib and seaborn through which we can plot scatter plots & pairplots
for Series/DataFrame objects? (df.plot.scatter(), sns.pairplot(df))
25. What is a heatmap?
26. Which methods are available in Matplotlib and seaborn through which we can plot a heatmap for dataframe
objects? (plt.imshow(df1, cmap='seismic'))
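Several of the Matplotlib questions above (format strings, legends, annotations, shapes, exporting to PDF, histograms, scatter plots and heatmaps) can be tried in one small figure. The sketch below is only an illustration: the data is random, the non-interactive "Agg" backend is an assumption made so that it runs without a display, and the PDF is written to an in-memory buffer rather than to a file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: no display is available
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Circle

fig, splots = plt.subplots(2, 2, figsize=(8, 6))

# Line plot: "r--^" means red colour, dashed line style, triangle-up markers.
x = np.arange(10)
y = np.random.rand(10)
(line,) = splots[0][0].plot(
    x, y, "r--^", linewidth=2, markersize=4, label="random"
)
splots[0][0].legend()  # the legend uses the label given above

# Annotation: text plus an arrow pointing at the highest data point.
peak = int(y.argmax())
splots[0][0].annotate(
    "peak",
    xy=(x[peak], y[peak]),
    xytext=(x[peak], y[peak] + 0.2),
    arrowprops=dict(arrowstyle="->"),
)

# Histogram of normally distributed samples.
splots[0][1].hist(np.random.randn(200), bins=20)

# Scatter plot, with a shape (circle patch) added on top.
splots[1][0].scatter(np.random.rand(30), np.random.rand(30))
splots[1][0].add_patch(Circle((0.5, 0.5), 0.2, fill=False))

# Heatmap of a small matrix using the "seismic" colour map.
image = splots[1][1].imshow(np.random.rand(4, 4), cmap="seismic")
fig.colorbar(image, ax=splots[1][1])

# Export: savefig() writes a PDF when format="pdf" (or a .pdf file name) is used.
pdf_buffer = io.BytesIO()
fig.savefig(pdf_buffer, format="pdf")
plt.close(fig)
```

Passing a file name such as "figure.pdf" to savefig() instead of the buffer writes the figure to disk.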