D. Prescriptive Analytics

The document consists of a series of assignments focused on various aspects of analytics in education, including types of analytics, data privacy, performance metrics, and visualization techniques. Each assignment contains multiple-choice questions with provided answers, covering topics such as predictive analytics, educational data mining, and data processing. The assignments aim to assess understanding of analytics concepts and their application in educational settings.

Uploaded by

kirubachinni123
Copyright
© © All Rights Reserved

Assignment-1

LAT-2023

Q.1 A teacher decided to conduct extra remedial lectures one week before the final exam for the
students who failed in the mid-semester. What type of analytics is she doing?
a. Descriptive Analytics
b. Diagnostic Analytics
c. Predictive Analytics
d. Prescriptive Analytics

Ans- d

Q. 2 Using the mid-sem performance of the students in the class to anticipate students’
performance in the final exam is______
a. Descriptive Analytics
b. Diagnostic Analytics
c. Predictive Analytics
d. Prescriptive Analytics
Ans- c

Q.3 Which of the following sentences is NOT true about Educational Data Mining?
a. It is the process of getting useful insight from Data.
b. It includes validating a learner model or a hypothesis about learning from learners'
data.
c. It helps to make recommendations to the learner by analyzing the data.
d. It is only a branch of Artificial intelligence.

Ans- d
Q.4 Which of the statements below is least important for academic analytics?
a. Attendance of teachers
b. Pass percentage of students in a course-X
c. Performance of School A in a city B
d. The graduation rate of students in a particular university.

Ans- b

Q.5 Which of the following sentences is NOT true about Academic analytics?

a. It provides support for operational and financial decision-making for the stakeholders
b. The focus is on the business of the institution
c. The focus is on the student’s learning
d. Stake-holders are management or executives of the institute
Ans- c
Q.6 Mid-semester exams are over, and the course teacher plotted the histogram of students'
scores to check how normally the scores are distributed. What kind of analytics is this?
a. Demonstrative analytics
b. Descriptive analytics
c. Predictive analytics
d. Diagnostic analytics
Ans- b
Q.7 Pattern mining technique is used for_____
a. Descriptive Analytics
b. Diagnostic Analytics
c. Predictive Analytics
d. Prescriptive Analytics
Ans-b

Q.8 Which of the following questions do not belong to the Predictive Analytics type?
a. What is the average score of all students in the Maths Exam?
b. What will be the performance of a student in the end-semester exam?
c. In which courses will the student enroll in the next semester?
d. What is the average performance of the class over the semester?
Ans- a,d

Q.9 A maths teacher has data about students’ mid-semester scores in his course. He then
correlates the mid-semester scores with their final exam scores. He realizes that students who
failed in mid-semester exams also failed in the final exam. What type of analytics is he doing?
a. Descriptive Analytics
b. Diagnostic Analytics
c. Predictive Analytics
d. Prescriptive Analytics
Ans- b

Q.10 “Scaffolding students to achieve their learning goal” and “Personalization in an Intelligent
Tutoring Systems” are examples of-
a. Descriptive Analytics
b. Diagnostic Analytics
c. Predictive Analytics
d. Prescriptive Analytics
Ans- d
Assignment-2

Q.1 What is/are true about Technology Enhanced Open-Ended Learning Environments?
a. Technology acts as a scaffold.
b. Shift the learning process from student to teacher-centric
c. Shift the learning process from teacher to learner-centric
d. There is no support provided by the tutor at all.
Ans- a,d

Q.2 What information must be provided to participants before conducting research studies?
a. What data is collected?
b. Why and how is it collected?
c. How the data is stored?
d. In which journal will it be published?
Ans- a, b, c

Q.3 As a researcher/teacher, which of the following data will be collected in a classroom
environment about students’ performance?
a. Students’ mid-sem exam
b. Score in Course- project
c. Assignments completion
d. Students’ extra co-curricular activities
Ans -a, b, c

Q.4 The stage where raw data is converted into actions/events is called
a. Data pre-processing
b. Data processing
c. Data analysis
d. Data reporting
Ans - a

Q. 5 Suppose you are a researcher and want to collect data from a learning
environment. What are the primary steps to protect the ethics and data privacy of
the learner?
a. Get consent from the participant
b. Anonymize the data and classify it
c. Store the data in a secure place
d. Share the data with participants
Ans- a, b, c

Q.6 Which of the following features can be extracted from the MOOC platform-
a. Time spent on Discussion Forum
b. Score in in-video Quizzes
c. Number of videos watched
d. Average session time
Ans- a, b,c,d

Q.7 In MOOC data, what field is used to identify user location?


a. UserID
b. Timestamp
c. IP address
d. Session ID
Ans- c

Q.8 Which of the following is part of Data Preprocessing?


a. Identifying other sources of data required for analysis.
b. Flagging erroneous data.
c. Dealing with missing data.
d. Removing irrelevant attributes.
Ans- b, c, d

Q.9 Which of the following is true about privacy?

a. It is someone’s right to keep information about themselves secret.
b. It is a basic human right
c. Privacy has social benefits
d. A person’s privacy can be exploited for the sake of scientific advancement
Ans- a, b, c

Q.10 Arrange the following steps in the correct sequence:

i. Analyze data
ii. Pre-process data
iii. Collect data
iv. Get approval from the Ethics Committee to conduct research and collect data
v. Get participants' consent
a. iv, v, iii, ii, i
b. v, iv, iii, ii, i
c. iv, iii, v, ii, i
d. iii, iv, v, ii, i

Ans: a
Assignment-3

Q.1 What is 5-fold cross-validation?


a. The original sample is randomly partitioned into 5 equal sub-samples. Out of the 5 sub-
samples, a single randomly chosen subsample is retained for testing the model and the
remaining 4 are used as training data.
b. The original data set is randomly split into 20% testing data and 80% training data.
This process is repeated exactly 5 times, and the average is calculated to obtain an
estimate.
c. The original sample is randomly partitioned into 5 equal sub-samples. Out of the 5
sub-samples, a single subsample is retained for testing the model and the remaining 4 are
used as training data. This process is repeated 5 times with each sub-sample used exactly
once as testing data. The results are then averaged.
d. The original data set is randomly split into 20% testing data and 80% training data.

Ans- c
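
The procedure described in option c can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the function names k_fold_indices and cross_validate, the fixed shuffle seed, and the evaluate callback are all assumptions introduced here.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is used for testing exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random partition of the sample
    folds = [indices[i::k] for i in range(k)]   # k near-equal sub-samples
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, evaluate, k=5):
    """Average the score returned by evaluate(train_idx, test_idx) over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

In practice evaluate would train a model on the training indices and return its score on the held-out fold; the averaged score is the cross-validation estimate.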

Note- Use the below content to answer Q. 2, Q. 3, and Q. 4

The below table provides the actual values (Ya) and the predicted values (Yp) for students who
obtained marks in the end-semester examination. The values 1 and 0 represent whether the
students scored greater than 50% or less than 50%, respectively.

Roll no.  Ya  Yp
1         1   1
2         1   0
3         0   0
4         1   0
5         1   1
6         0   1
7         0   0
8         0   1
9         1   0
10        1   1
11        0   0
12        1   0
13        1   1
14        0   1
15        1   0

Q.2 For the above table calculate Precision Value.


a. 0.4
b. 0.75
c. 0.57
d. 0.25

Ans- c

Q3. For the above table, calculate the value of Recall.

a. 0.25
b. 0.50
c. 0.44
d. 0.80
Ans- c

Q4. For the above table, calculate the value of accuracy.

a. 0.25
b. 0.46
c. 0.75
d. 0.80
Ans- b
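
The three answers for Q.2 to Q.4 can be checked by counting the confusion-matrix cells directly from the table. The sketch below is only a verification aid; the variable names are chosen here, and the value 1 is treated as the positive class.

```python
# Actual (Ya) and predicted (Yp) labels for roll numbers 1-15, copied from the table.
ya = [1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1]
yp = [1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0]

pairs = list(zip(ya, yp))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives

precision = tp / (tp + fp)          # 4 / 7
recall = tp / (tp + fn)             # 4 / 9
accuracy = (tp + tn) / len(pairs)   # 7 / 15

print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```

Accuracy is 7/15, about 0.467, so the listed option 0.46 appears to be a truncated rather than a rounded value.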

Q.5 While conducting a study, a researcher wants to classify engaged and unengaged behavior
from the collected data. He uses two classification techniques, namely logistic regression (LR) and a decision tree (DT)
classifier. The values of the characteristic parameters for these two classifiers are as follows:

No.  Classifier                Accuracy  TPR  TNR
1    Logistic Regression (LR)  0.68      0.6  0.9
2    Decision Tree (DT)        0.69      0.7  0.2

From the above data, we can conclude that:

a. Classifier 1 (LR) is better
b. Classifier 2 (DT) is better
c. Both will give the same results
d. Data Insufficient

Ans- a

Q.6 Two raters, 1 and 2, rate the engagement of students while they are working in a
TELE. The following table depicts their observations:

                     Rater-1 Engaged  Rater-1 not Engaged
Rater-2 Engaged      25               10
Rater-2 not Engaged  15               20

Note - Report your answer correct to 2 decimal places.
Calculate the value of kappa:
a. 0.28
b. 0.43
c. 0.50
d. 0.17
Ans- a
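
Cohen's kappa compares the observed agreement with the agreement expected by chance from the raters' marginal totals. The following sketch works through the arithmetic for the 2x2 table in Q.6; the variable names are illustrative.

```python
# Rows: Rater-2 (engaged, not engaged); columns: Rater-1 (engaged, not engaged).
table = [[25, 10],
         [15, 20]]

total = sum(sum(row) for row in table)                        # 70 observations
observed = (table[0][0] + table[1][1]) / total                # both raters agree

row_totals = [sum(row) for row in table]                      # Rater-2 marginals
col_totals = [table[0][j] + table[1][j] for j in range(2)]    # Rater-1 marginals
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

kappa = (observed - expected) / (1 - expected)
print(f"kappa = {kappa:.4f}")
```

The exact value is 2/7, about 0.2857, which corresponds to option a (0.28 when truncated to two decimals).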

Q.7 The harmonic mean of Precision and Recall is another performance metric known as
a. Cohen’s Kappa
b. F1-score
c. Sensitivity
d. Specificity
Ans- b

Q.8 ML is not required to be implemented in situations where


a. A personalized learning solution is required
b. Human expertise exists and automation is not required.
c. Rules are difficult to extract
d. Learners’ emotions must be predicted.
Ans- b

Q.9 For the confusion matrix below, the Accuracy value is ______
(Report your answer correct to two decimal places)

              Actual A  Actual !A
Predicted A   15        30
Predicted !A  20        10

a. 33%
b. 20%
c. 60%
d. 30%
Ans: a
Q.10 For imbalanced datasets, which of the following performance metrics is generally used?
a. Precision
b. F-Score
c. Recall
d. Accuracy

Ans: b
Assignment 4

Q.1 Which of the following is true in a Histogram?


a. Only used for the numerical type of data
b. Can be used for numerical, nominal, or categorical types.
c. Bin size should be equal.
d. Bin size can be unequal.
Ans- a, c

Q.2. Consider the following box plot,

The above box plot represents the distribution of marks in the mid-semester examination.
Consider the dark vertical line denoted by Y lying between 25 and 30 in the box plot.
Then, what is the approximate percentage of students having marks between 27.5 -31?

a. 25%
b. 28%
c. 17%
d. 50%

Ans- d

Q.3 A researcher can use different charts or plots to build the visualizations in a
dashboard. To choose the right chart, which of the following statements about
visualization charts are correct?
a. 3D pie charts do not provide any additional advantage as compared to 2D pie
charts.
b. Histogram is generally used for quantitative data whereas bar charts are good for
qualitative data
c. Histogram is generally used for qualitative data whereas a bar chart is used for
quantitative data
d. Change in the values of a variable is best represented using a pie chart.
Ans- a, b

Q.4 Dr. Arnab is a faculty member at ITM. He wishes to visualize the difference in the frequency of actions
performed per week (on the LMS) by male and female students across the semester. Which
visualization chart should he use?
a. bar chart
b. pie Chart
c. Scatter Plot
d. Heat maps

Ans- a
Q.5 Match the following

Type of graph    Used for
1. Bar chart     A. Correlation
2. Histogram     B. Part to whole
3. Pie Chart     C. Comparison
4. Scatter plot  D. Distribution

a. 1-B, 2-C, 3-A, 4-D
b. 1-C, 2-D, 3-B, 4-A
c. 1-D, 2-A, 3-C, 4-B
d. 1-C, 2-D, 3-A, 4-B

Ans: b

Q.6 Which of the following statements is correct?

a. The interquartile range is the difference between the 2nd quartile and the
1st quartile.
b. The interquartile range is the difference between the 3rd quartile and the
2nd quartile.
c. The interquartile range is the difference between the 3rd quartile and
the 1st quartile.
d. The median is equal to the 2nd quartile value.
Ans- c, d

Q.7 Which of the following statements are correct?


a. A scatter plot shows the relationship between two variables
b. A scatter plot is used to show the distribution of numeric values
c. Line charts show the variation of a variable only with time
d. Line charts show the variation of a variable with any other parameter

Ans- a, d
Q.8 Which of the following statements are correct for the Bar chart?
a. Bar charts are used to represent the correlation.
b. The bars can be plotted vertically or horizontally.
c. Bar graphs are usually used to represent 'continuous data’.
d. Bar graphs are usually used to represent 'categorical data'

Ans: b, d

Q.9 What are the considerations you need to make when using a visualization technique?
a. What is the information you want to convey?
b. The kind of data you have
c. The opinion of the participants in your study
d. Easy to interpret
Ans- a, b, d
Q.10 In the below figure, what is the relation between the two variables?

a. Linear
b. Negative
c. Positive
d. Non-linear
Ans- a,c
Assignment 5

1. Which of the following is NOT true about correlation coefficient(r)?


a. High correlation implies causation.
b. The sign of the r indicates the direction of the association.
c. r varies between -1 and 1.
d. It is the measure of the strength of linear association between two numerical
variables.
Ans: High correlation implies causation.

2. What is iSAT?
a. An interactive visual representation to highlight transitions in data.
b. A learning environment for teaching engineering estimation problems.
c. An algorithm for process mining
d. A pattern mining tool
Ans: An interactive visual representation to highlight transitions in data.

3. For correlation between variables, which of the following visualizations would suit?
a. Bar chart
b. Histogram
c. Scatter plot
d. Pie chart
Ans: Scatter plot

4. The class teacher of class 8 in school wants to find the correlation between the marks
scored in maths and physics in the mid-term examination. The marks in respective
subjects are given in the following table.
Student ID  Maths  Physics
1           35     30
2           23     33
3           47     45
4           17     23
5           10     8
6           43     49
7           9      12
8           6      4
9           28     31
Find the Spearman Rank Correlation Coefficient up to one decimal place.
a. 0.6
b. 0.4
c. 0.9
d. 0.3
Ans: 0.9
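
The answer to question 4 can be reproduced with the rank-difference formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). The sketch below assumes there are no tied marks, which holds for this table; the helper name ranks is introduced here for illustration.

```python
maths = [35, 23, 47, 17, 10, 43, 9, 6, 28]
physics = [30, 33, 45, 23, 8, 49, 12, 4, 31]


def ranks(values):
    """Rank values from 1 (smallest) to n; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


rank_m = ranks(maths)
rank_p = ranks(physics)
d_squared = sum((m - p) ** 2 for m, p in zip(rank_m, rank_p))

n = len(maths)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(f"sum of d^2 = {d_squared}, rho = {rho:.1f}")
```

The squared rank differences sum to 12, giving rho = 1 - 72/720 = 0.9.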

5. Choose the correct match between the given two columns from the options given below

Column A Column B
1. Descriptive Analytics i. What will happen
2. Diagnostic Analytics ii. How to make something happen
3. Predictive Analytics iii. Why did something happen
4. Prescriptive Analytics iv. What happened

a. 1-iv, 2-iii, 3-i, 4-ii


b. 1-i, 2-ii, 3-iii, 4-iv
c. 1-ii, 2-iii, 3-iv, 4-i
d. 1-iv, 2-i, 3-ii, 4-iii

Ans: 1-iv, 2-iii, 3-i, 4-ii

6. The strength of association between an independent variable X and a dependent variable
Y is measured by:
a. Standard deviation
b. Variance
c. Correlation coefficient
d. Interquartile range
Ans: Correlation coefficient

7. The correlation coefficient, r = -0.5 would indicate a scatter plot in which


a. The slope is upward and half of the points of the scatterplot sit on a straight line.
b. The slope is downward and half of the points of the scatterplot sit on a straight
line.
c. The slope is downward and all of the points sit perfectly on a straight line
d. The slope is downward and there is a moderately good fit between the straight
line and the points on the scatterplot
Ans: The slope is downward and there is a moderately good fit between the straight line
and the points on the scatterplot

8. (Use the following data for the next three questions)


The interaction of a learner with a tutoring system is as follows:
read, video, quiz, read, read, quiz, video, video, quiz, video, read, read, quiz, read, video,
video, read, quiz

Which of the following behaviors has the highest state transition values?

a. Read->Read and Read->Quiz
b. Read->Quiz and Quiz->Read
c. Video->Quiz and Quiz->Video
d. Quiz->Read and Quiz->Video
Ans: Quiz->Read and Quiz->Video

9. The state transition value (up to two decimal places) of Read->Read is:
a. 0.20
b. 0.28 to 0.29
c. 0.33
d. 0.50
Ans: 0.28 to 0.29

10. The state transition value (up to two decimal places) of Video->Video is:
a. 0.5
b. 0.33
c. 0.43
d. 0.9
Ans: 0.33
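
The answers to questions 8 to 10 follow from counting consecutive action pairs. The sketch below assumes that the "state transition value" of A->B is the number of A->B transitions divided by the number of transitions that start from A; this reading reproduces the answer key but is an interpretation, not a definition stated in the assignment.

```python
from collections import Counter

actions = ("read video quiz read read quiz video video quiz video read read "
           "quiz read video video read quiz").split()

pair_counts = Counter(zip(actions, actions[1:]))   # counts of A -> B
source_counts = Counter(actions[:-1])              # transitions leaving each state

transition = {
    (a, b): count / source_counts[a]
    for (a, b), count in pair_counts.items()
}

for (a, b), value in sorted(transition.items()):
    print(f"{a} -> {b}: {value:.2f}")
```

Under this reading, Read->Read is 2/7 (about 0.29), Video->Video is 2/6 (about 0.33), and Quiz->Read and Quiz->Video share the highest value, 0.5.
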
Assignment 6

1. Which of the following is not an algorithm for developing a process model?


a. ProM
b. Alpha miner
c. Heuristic Miner
d. Fuzzy Miner
Ans: ProM

2. ________ can be used for differentiating interaction behavior between two groups.
a. Differential Sequence Mining
b. Sequential Pattern Mining
c. Naive Bayes Classifier
d. Pruning
Ans: Differential Sequence Mining

3. What is meant by sequence support?


a. It indicates the frequency of occurrence of a pattern for each student
b. It indicates the number of students having a particular pattern
c. It indicates the number of students not having a particular pattern
d. It indicates the total number of patterns observed for a particular student
Ans: It indicates the number of students having a particular pattern

4. What is meant by instance support?


a. It indicates the frequency of occurrence of a pattern for each student
b. It indicates the number of students having a particular pattern
c. It indicates the number of students not having a particular pattern
d. It indicates the total number of patterns observed for a particular student
Ans: It indicates the frequency of occurrence of a pattern for each student

5. ______ support captures the number of individual action sequences for a group where
that sequence of actions occurs at least once.
a. Sequence
b. Instance
c. Frequency
d. Significance
Ans: Sequence

Answer the following five questions using the information below. We have three students with the
following sequences of actions in a learning environment:

S1: Video → Forum → Video → Quiz → Read → Read → Quiz → Forum → Read → Video
S2: Forum → Video → Read → Quiz → Read → Forum → Read → Video
S3: Video → Quiz → Read → Forum → Quiz → Read → Video → Video → Read → Quiz

6. What is the i-freq mean in the sequence Quiz → Read? (correct to 2 decimal places)
a. 2.3
b. 1.3
c. 0.3
d. 3.3

Ans: 1.3

7. What is the s-support of the sequence Quiz → Read? (correct to 2 decimal places)
a. 1
b. 0.5
c. 1.5
d. 2
Ans: 1
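
The answers to questions 6 and 7 can be checked by counting the pattern in each student's sequence. The sketch below makes two assumptions that are not spelled out in the assignment: a pattern occurrence means the actions appear contiguously, and s-support is reported as the fraction of students whose sequence contains the pattern at least once.

```python
sequences = {
    "S1": ["Video", "Forum", "Video", "Quiz", "Read", "Read", "Quiz", "Forum", "Read", "Video"],
    "S2": ["Forum", "Video", "Read", "Quiz", "Read", "Forum", "Read", "Video"],
    "S3": ["Video", "Quiz", "Read", "Forum", "Quiz", "Read", "Video", "Video", "Read", "Quiz"],
}


def instance_count(sequence, pattern):
    """Number of contiguous occurrences of pattern in one student's sequence."""
    width = len(pattern)
    return sum(
        1 for i in range(len(sequence) - width + 1)
        if sequence[i:i + width] == pattern
    )


pattern = ["Quiz", "Read"]
counts = {sid: instance_count(seq, pattern) for sid, seq in sequences.items()}

i_freq_mean = sum(counts.values()) / len(counts)                      # mean instance frequency
s_support = sum(1 for c in counts.values() if c > 0) / len(counts)    # share of students

print(counts, round(i_freq_mean, 2), s_support)
```

The per-student counts are 1, 1, and 2, so the mean instance frequency is 4/3 (about 1.33) and every student contributes to the sequence support.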

8. What is the significance of action “Read” for Student 2? (correct to 2 decimal places)
a. 0
b. 2
c. 0.66
d. 1
Ans: 1

9. What is the state transition probability for the sequence Video→Quiz for Student 2?
a. 0
b. 0.33
c. 0.66
d. 1
Ans: 0

10. What is the significance of action “Quiz” for student 1? (correct to 2 decimal places)
a. 0.33
b. 0.66
c. 0.99
d. 1
Ans: 0.66
Assignment 7

1. In the given Dendrogram, how many clusters will be obtained if we chose the line across
0.6 to obtain the clusters?

a. 2
b. 3
c. 4
d. 6
Ans: 4

2. What is the optimum value of K chosen from the below figure?

a. 2
b. 4
c. 6
d. 14
Ans: 6

3. Choose the hierarchical clustering method(s) from the options given below.
a. Agglomerative nesting
b. K-means clustering
c. Divisive analysis
d. Probabilistic clustering
Ans: Agglomerative nesting and divisive analysis

4. What is/are true about the AGNES method of clustering?


a. Top-Down approach
b. Each data point is considered as an individual cluster at the beginning of the first
iteration.
c. All the data points are considered as a single cluster at the beginning of the first
iteration.
d. Bottom-up approach.
Ans: Each data point is considered as an individual cluster at the beginning of the first
iteration and the bottom-up approach.

Answer the following four questions based on the information given below:

Soniya is a teacher in a private school teaching Mathematics and Science to the students
of Class 2. There are 5 students in her class. She assessed the performance of the students on a
5-point Likert scale for both courses. She wanted to group the students into two groups and adapt
different teaching methods between them. To identify the groups she used the K-Means
clustering technique for 2 iterations. The distribution of the Likert score of the students in both
the courses and the initial centroids assumed by the teacher are shown in the below figure.

Accordingly, the 5 data points are A1(2,5), A2(5,3), A3(3,1), A4(1,5), and A5(4,4), and
the two centroids are C1(4,2) and C2(3,5). Apply the procedure of K-means clustering for up to two
iterations. Calculations should be made up to two decimal places.
Use the squared Euclidean distance: $d^2 = (X_2 - X_1)^2 + (Y_2 - Y_1)^2$

5. What are the new centroid points after the first iteration?
a. (4, 4) and (3.33, 2.67)
b. (4, 2) and (2.33, 4.67)
c. (2.33, 4) and (3.33, 2.67)
d. (2.33, 3.67) and (4, 2)
Ans: (4, 2) and (2.33, 4.67)

6. What is/are the data point(s) clustered with centroid C1 after the first iteration?
a. A1
b. A2
c. A3
d. A4
e. A5
Ans: A2 and A3

7. What are the new centroid points after the second iteration?
a. (4.33, 3.33) and (3.33, 2.67)
b. (4, 2) and (2, 4.67)
c. (1.33, 3) and (2.33, 1.67)
d. (4, 2) and (2.33, 4.67)
Ans: (4, 2) and (2.33, 4.67)

8. What is/are the data point(s) clustered with centroid C1 after the second iteration?

a. A1
b. A2
c. A3
d. A4
e. A5
Ans: A2, A3
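
The two iterations asked for in questions 5 to 8 can be traced with a short script. This is a minimal sketch rather than a general K-means implementation: the function names are chosen here, centroids are rounded to two decimals as the exercise requests, and an empty cluster (which does not occur with this data) is not handled.

```python
points = {"A1": (2, 5), "A2": (5, 3), "A3": (3, 1), "A4": (1, 5), "A5": (4, 4)}
centroids = [(4, 2), (3, 5)]  # initial C1 and C2 from the problem statement


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2


def kmeans_step(points, centroids):
    """Assign each point to its nearest centroid, then recompute the centroids."""
    clusters = [[] for _ in centroids]
    for name, p in points.items():
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(name)
    new_centroids = []
    for members in clusters:
        xs = [points[m][0] for m in members]
        ys = [points[m][1] for m in members]
        new_centroids.append((round(sum(xs) / len(xs), 2), round(sum(ys) / len(ys), 2)))
    return clusters, new_centroids


history = []
for iteration in (1, 2):
    clusters, centroids = kmeans_step(points, centroids)
    history.append((clusters, centroids))
    print(f"iteration {iteration}: clusters={clusters} centroids={centroids}")
```

Both iterations give C1 = {A2, A3} with centroid (4, 2) and C2 = {A1, A4, A5} with centroid (2.33, 4.67), so the assignment has already converged after the first pass.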

9. In K-means clustering, the value of the error function is minimum when


a. the data points are grouped in two clusters.
b. the data points are grouped into three clusters.
c. the data points are grouped under one cluster.
d. the number of clusters and the number of data points are equal.
Ans: the number of clusters and the number of data points are equal.

10. Which is needed by K-means clustering?


a. defined distance metric
b. number of clusters
c. initial guess as to cluster centroids
d. A few points are fixed initially to a particular centroid throughout various iterations.
Ans: defined distance metric, number of clusters, initial guess as to cluster centroids
Assignment 8

(1) Which of the following is used for prediction?


a. K-means clustering
b. Sequence pattern mining
c. Multivariate regression
d. Spearman’s correlation
Ans: Multivariate regression

The table below shows the Gate score and GPA of students. Use this table for the next 5
questions.

S ID  Gate Score  GPA
1     34          2.4
2     36          2.52
3     60          2.54
4     87          3.28
5     72          3.28
6     73          3.28
7     76          3.4
8     80          3.41
9     81          3.41
10    92          3.76

By using the linear regression model, the best-fit line is obtained as

Y = 1.63 + 0.02x

(2) Then find the value of the Y-intercept:

a. 1.63
b. 0.02
c. 1.65
d. 0
Ans: a. 1.63

(3) What will be the sign of the correlation coefficient (Gate Score and GPA)?
a. Positive
b. Negative
c. Cannot be determined.
d. Zero
Ans: a. Positive

(4) Find the GPA obtained by a student having a Gate score of 80 (as per model).
a. 3.4
b. 3.41
c. 3.23
d. 3.33
Ans: c. 3.23

(5) Find the value of the mean squared error value


a. 0.14
b. 4.01
c. 0.4
d. 0.04
Ans: d. 0.04

(6) Find the mean of the predicted value


a. 2.9
b. 3.01
c. 3.3
d. 2.8
Ans: b. 3.01
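
The numeric answers to questions (4) to (6) can be verified by applying the given line Y = 1.63 + 0.02x to every row of the table. The sketch below uses the stated coefficients as they are and does not refit the model; the names predict, mse, and mean_predicted are illustrative.

```python
gate = [34, 36, 60, 87, 72, 73, 76, 80, 81, 92]
gpa = [2.4, 2.52, 2.54, 3.28, 3.28, 3.28, 3.4, 3.41, 3.41, 3.76]

INTERCEPT, SLOPE = 1.63, 0.02  # coefficients of the best-fit line given above


def predict(x):
    """GPA predicted by the given best-fit line for a Gate score x."""
    return INTERCEPT + SLOPE * x


predicted = [predict(x) for x in gate]
mse = sum((y - y_hat) ** 2 for y, y_hat in zip(gpa, predicted)) / len(gpa)
mean_predicted = sum(predicted) / len(predicted)

print(f"predict(80) = {predict(80):.2f}")
print(f"MSE = {mse:.4f}, mean predicted GPA = {mean_predicted:.3f}")
```

This gives a predicted GPA of 3.23 for a score of 80, a mean squared error of about 0.0414, and a mean predicted value of about 3.012.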

(7) Predictive analytics in the education domain is helpful


a. To the owners of the institutions to predict the rank of their institutions among the
group of competing institutions
b. To the learning content developers to provide adaptive content to the
learners
c. To the instructors to adapt their teaching strategies based on the student's
performance
d. To the learners to provide feedback and suggestions to avoid failure in the
learning goal

(8) Regression is generally used


a. To predict the performance of the learners
b. To evaluate the strength of the predictor model developed
c. To investigate the relationship between dependent and independent
variables.
d. To exclusively perform cause and effect analysis

(9) Choose the correct option(s) that best describe(s) the different types of regression
analysis.
a. Multiple regression analysis has multiple dependent variables and multiple
independent variables
b. The multivariate regression analysis has multiple dependent variables and
multiple independent variables
c. Multiple regression analysis has one dependent variable and multiple
independent variables
d. The multivariate regression analysis has one dependent variable and multiple
independent variables

(10) Which of the following is/are true about logistic regression?


a. Logistic regression is suitable to predict categorical values.
b. Logistic regression is suitable to predict continuous values.
c. Logistic regression uses maximum likelihood as the error-minimizing method.
d. Logistic regression uses the least mean squared error as the error-minimizing
method.
Ans: Logistic regression is suitable to predict categorical values and Logistic regression
uses maximum likelihood as the error-minimizing method.
Assignment 9

Q.1 Which of the following statements are correct about decision trees?
A. It requires normalization of data
B. It does not require normalization of data
C. Missing values are not important
D. A decision tree does not always need a root node
Ans-
B. It does not require normalization of data
C. Missing values are not important

Q.2 Consider the following statements:

Statement 1: Naive Bayes assumes independence among predictors.
Statement 2: Naive Bayes can perform multi-class prediction.
Select the correct option about the statements above:
A. Both statements 1 and 2 are correct
B. Statement 1 is correct and statement 2 is wrong
C. Statement 1 is wrong and statement 2 is correct
D. Both statements 1 and 2 are wrong

Ans- A. Both statements 1 and 2 are correct

Q.3 Consider data given in the following table.

Attendance (%)  Passed in exam: Yes  Passed in exam: No
40-60           2                    3
61-70           2                    1
71-80           1                    0
Total           5                    4

Apply the Naive Bayes classifier formula and answer Q. 3.1 and Q. 3.2.

Q. 3.1 What is the probability of a student failing the exam if the attendance is 40-60?

A. 3/5

B. 4/5

C. 2/5

D. 1/5

Ans: 3/5

Q. 3.2 What is the probability of a student passing the exam if the attendance is 71-80?

A. 1
B. 1/2
C. 2/3
D. 1/3

Ans: 1
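
Both answers follow from Bayes' rule, P(class | band) = P(band | class) * P(class) / P(band), applied to the counts in the attendance table. The sketch below uses exact fractions and no smoothing; the dictionary layout and the function name posterior are assumptions made for illustration.

```python
from fractions import Fraction

# Counts from the attendance table: attendance band -> {exam outcome: students}.
counts = {
    "40-60": {"Yes": 2, "No": 3},
    "61-70": {"Yes": 2, "No": 1},
    "71-80": {"Yes": 1, "No": 0},
}


def posterior(band, label):
    """P(label | band) via Bayes' rule with a single feature and no smoothing."""
    grand_total = sum(sum(row.values()) for row in counts.values())
    class_total = sum(row[label] for row in counts.values())
    prior = Fraction(class_total, grand_total)               # P(label)
    likelihood = Fraction(counts[band][label], class_total)  # P(band | label)
    evidence = Fraction(sum(counts[band].values()), grand_total)  # P(band)
    return likelihood * prior / evidence


print(posterior("40-60", "No"))   # probability of failing with 40-60% attendance
print(posterior("71-80", "Yes"))  # probability of passing with 71-80% attendance
```

For the first query the terms are (3/4) * (4/9) / (5/9) = 3/5, and for the second (1/5) * (5/9) / (1/9) = 1.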

Q. 4 When to stop further constructing a decision tree?


A. When there are no more conditions left
B. When all the conditions belong to one group
C. When most of the conditions are grouped leaving few ungrouped
D. When there are odd number of nodes remaining

Ans:
A. When there are no more conditions left
B. When all the conditions belong to one group

Q. 5 Why is the Naive Bayes classifier called ‘Naive’?

A. The classifier can solve only a very limited number of problems, under multiple conditions.
B. Its use is limited to the domains of Natural Language Processing and Learning Analytics.
C. It assumes that the features of input space are strongly independent.
D. It assumes that the features of input space are strongly dependent.
Ans- C. It assumes that the features of input space are strongly independent.

Q. 6 Decision tree is a non-linear classifier.


1. True
2. False
Ans: True

Q. 7 Overfitting and an increase in tree complexity can be overcome through the process called
_________________.
a. Normalization
b. Branching
c. Pruning
d. Classification

Ans: c. Pruning

Q. 8 Which of the following is an advantage of Decision Tree algorithm?

A. It is an extremely fast algorithm


B. It is easily interpretable and explainable
C. It can be used for classification, clustering as well as regression analysis.
D. It can also be used for sequence mining

Ans- B. It is easily interpretable and explainable

Q. 9 Suppose you are given the following graph, which shows the ROC curves for three different
classification algorithms: Random Forest (blue), Logistic Regression (orange), and
KNN (green). Which of the following algorithms would you consider for your final model
on the basis of performance?
a. Random Forest
b. Logistic Regression
c. KNN
d. None of the above
Ans: a. Random Forest
Assignment 10

Q1. Which of the following is/are NOT application(s) of NLP?
A. Extraction of information from body posture
B. Automatic sentence completion
C. Information extraction from a paragraph
D. Summarizing a news article in three lines
E. Summarizing the five different actions performed by a learner while learning
F. Inferring sentiment from a social media post

Ans-
A. Extraction of information from body posture
E. Summarizing the five different actions performed by a learner while learning

Q2. In the following text:

“Shyaam went to the football field today, while his younger brother went to play
badminton. Both went and came back”

what is the bigram probability P(to | went)?

A. 3/2
B. 2/3
C. 1
D. 3/4

Ans: B. 2/3
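
The bigram probability in Q2 is the count of "went to" divided by the count of "went". The sketch below assumes a simple lowercase, letters-only tokenization and the maximum-likelihood estimate without smoothing; the function name bigram_prob is illustrative.

```python
import re
from collections import Counter
from fractions import Fraction

text = ("Shyaam went to the football field today, while his younger brother "
        "went to play badminton. Both went and came back")

tokens = re.findall(r"[a-z]+", text.lower())   # lowercase word tokens
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))


def bigram_prob(previous, word):
    """Maximum-likelihood estimate of P(word | previous)."""
    return Fraction(bigrams[(previous, word)], unigrams[previous])


print(bigram_prob("went", "to"))
```

"went" occurs three times and is followed by "to" twice, which gives 2/3.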

Q3. What is the minimum edit distance between INTENTION and INTUITION?

a. 4
b. 2
c. 3
d. 1

Ans: b. 2
Q4. Select the correct set of operations which are required to calculate the minimum
edit distance between two words

A. Insert, Delete, Create


B. Substitute, Delete, Create
C. Insert, Delete, Substitute
D. Insert, Delete, Read

Ans: C. Insert, Delete, Substitute
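
The minimum edit distance built from these three operations can be computed with the standard dynamic-programming recurrence. The sketch below assumes a cost of 1 for every insert, delete, and substitute (Levenshtein distance); some textbooks charge 2 for a substitution, which would give different values.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit cost for insert, delete, and substitute."""
    previous = list(range(len(target) + 1))   # distances from the empty prefix
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,                        # delete s_char
                current[j - 1] + 1,                     # insert t_char
                previous[j - 1] + (s_char != t_char),   # substitute or match
            ))
        previous = current
    return previous[-1]


print(edit_distance("INTENTION", "INTUITION"))
```

INTENTION and INTUITION differ only in two aligned positions (E/U and N/I), so two substitutions are enough under unit costs.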

Q5. Which of the following is a tool for NLP applications developed by a team of
Stanford University?

a. NLPrun
b. CoreNLP
c. InstaText
d. Semantica

Ans: b. CoreNLP

Q6. The following pairs of words are stemmed to the same form by the Porter stemmer
(a stemming algorithm). Which of the following pairs would you argue shouldn't be
combined or compared by stemming?

a. abandon/abandonment
b. absorbency/absorbent
c. marketing/markets
d. university/universe
e. volume/volumes

Ans: d. university/universe

Q7. Which of the words below can be obtained from the word "Affect" with the minimum number of
edits (if each operation has cost 1)?

A. Effect
B. Defect
C. Perfect
D. Suspect

Ans: A. Effect

Q8. Given the following dictionary of words-

{1: Sun, 2: Moon, 3: Stars, 4: east, 5: west, 6: north, 7: south, 8: rises, 9: sets, 10: and, 11:
the, 12: in }

Which of the following sentences does the index {1, 9, 12, 11, 4} represent?

a. Sun rises in the east.


b. Sun sets in the west.
c. Sun rises in the west.
d. Sun sets in the east.

Ans: d. Sun sets in the east.

Q9. Which of the following techniques can be used for the purpose of converting a
keyword into its base form?

1. Lemmatization
2. Levenshtein
3. Stemming
4. Soundex

Ans: 1. Lemmatization and 3. Stemming

Q10. Statement1: Ram loves to watch movies.


Statement2: Dev loves to watch cricket matches.

What will be the length of the document vector?


a. 11
b. 7
c. 9
d. 8
Ans: d. 8
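
The length of the document vector in Q10 equals the number of distinct words across both statements. The sketch below assumes a bag-of-words representation with lowercase, letters-only tokens and no stop-word removal, which is the reading that matches the answer key.

```python
import re

documents = [
    "Ram loves to watch movies.",
    "Dev loves to watch cricket matches.",
]

tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
vocabulary = sorted({word for doc in tokenized for word in doc})

# One count per vocabulary word, so every vector has len(vocabulary) entries.
vectors = [[doc.count(word) for word in vocabulary] for doc in tokenized]

print(vocabulary)
print(len(vocabulary), vectors)
```

The vocabulary is cricket, dev, loves, matches, movies, ram, to, watch, so each document vector has 8 entries.
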
Assignment 11

Q1. Multimodal learning analytics is


A. A Method of collecting data from multiple channels and present it to user
B. A Way of analyzing data using different machine learning techniques and AI tools
to ensure low errors in sophisticated manner
C. A set of techniques that can be used to collect multiple sources of data in
high frequency, synchronize and code the data, and examine learning in
realistic, ecologically valid, social, mixed-media learning environments.
D. A methodology in which a user has access to his own data through multiple
modes and he/she can modify and analyze his/her own data for improvement.

Ans- C. A set of techniques that can be used to collect multiple sources of data in
high frequency, synchronize and code the data, and examine learning in realistic,
ecologically valid, social, mixed-media learning environments.

Q2. According to Paul Ekman’s cross-cultural research in 1969, which of the following
is/are NOT basic emotions?

A. Anger
B. Disgust
C. Boredom
D. Happiness
E. Confusion

Ans: C. Boredom, E. Confusion

Q3. According to the Facial Action Coding System described in the iMotions blog, identify the
emotion that best suits the given sequence of facial expressions and corresponding action units.

Facial expression image:        (images not reproduced)
Facial expression description:  Inner Brow Raiser | Brow Lowerer | Lip Corner Depressor
[ref: https://imotions.com/blog/facial-action-coding-system/]

a. Sadness
b. Surprise
c. Fear
d. Anger
e. Disgust

Ans - a. Sadness

Q4. What do the red, yellow, and green colors in the eye tracking heatmap represent?
(Refer to the figure below.)

[ref: Pretorius, Marco & Calitz, André. (2011). The Contribution of Eye Tracking to Brand Awareness Studies.]
a. Equal amount of gaze points directed towards parts of the image
b. Descending order of the amount of gaze points that were directed
towards parts of the image.
c. Ascending order of the amount of gaze points that were directed towards
parts of the image.
d. Random trace data
Ans- b. Descending order of the amount of gaze points that were directed towards
parts of the image.
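
A rough sketch of how such a heatmap could be built from raw gaze points follows. The grid cell size, the one-third and two-thirds colour cut-offs, and the function names are assumptions for illustration; real eye-tracking software typically smooths the counts before colouring them.

```python
from collections import Counter


def gaze_heatmap(points, cell=100):
    """Count gaze points per grid cell of `cell` x `cell` pixels."""
    return Counter((int(x) // cell, int(y) // cell) for x, y in points)


def colour(count, max_count):
    """Map a cell count to red > yellow > green (assumed cut-offs)."""
    if 3 * count > 2 * max_count:
        return "red"      # most gaze points
    if 3 * count > max_count:
        return "yellow"   # intermediate
    return "green"        # fewest gaze points


# Hypothetical gaze coordinates in pixels.
gaze = [(10, 10), (20, 30), (50, 60), (150, 20), (160, 40), (250, 250)]
counts = gaze_heatmap(gaze)
peak = max(counts.values())
colours = {cell: colour(n, peak) for cell, n in counts.items()}
```

Cells with the most gaze points come out red and those with the fewest come out green, matching the descending order in the answer.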

Q5. Select the correct sequence of actions which are needed to infer the affective state

A. Detect action units from database -> predict the affective state -> detect face
from video frames
B. Predict the affective state -> detect face from video frames -> Detect action units
from database
C. Detect face from video frames -> Detect action units from database ->
Predict the affective state
D. Detect face from video frames -> Predict the affective state -> Detect action units
from database

Ans: C. Detect face from video frames -> Detect action units from database ->
Predict the affective state

Q6. Select which is NOT a challenge in human observation and self-reporting for
detecting affective states.

A. It's time consuming


B. Authenticity of self-reported data is questionable
C. Accuracy will not be good always
D. Human observation and self reporting is very costly process
E. Interobserver reliability score will be low most of the time

Ans: D. Human observation and self reporting is very costly process

Q7. Match the following with correct pairs:

1. Skin Conductance a. EEG signal


2. Facial expressions b. Microphone
3. Brain waves c. GSR
4. Think aloud d. Emotion

a. 1-c, 2-a, 3-d, 4-b


b. 1-b, 2-d, 3-a, 4-c
c. 1-d, 2-a, 3-b, 4-c
d. 1-c, 2-d, 3-a, 4-b
e. 1-d, 2-c, 3-b, 4-a
Ans: d. 1-c, 2-d, 3-a, 4-b

Q8. Which of the following data can be collected using an eye tracker?
a. Eye fixations
b. Saccadic eye movements
c. Raising and lowering of the eyebrow
d. Raising and lowering of eyelids
e. Pupil dilation

Ans: a. Eye fixations,


b. Saccadic eye movements,
e. Pupil dilation

Q9. Ajith, a grade 8 boy, was attending a biology class on the topic of human cells. Since the
concepts taught were abstract in nature, he could not connect to the teaching throughout the
class, although he showed keen interest in the beginning. What could be the possible sequence
of his affective states according to the observed model of affective dynamics from D’Mello et al.
(2012)?

a. Engaged - Frustrated - Confused - Bored


b. Engaged - Confused - Frustrated - Bored
c. Engaged - Bored - Confused - Frustrated
d. Engaged - Confused - Bored - Frustrated
e. Engaged - Bored - Frustrated - Confused

Ans: b. Engaged - Confused - Frustrated - Bored


Q10. In a research study, non-verbal cues were analyzed to study the emotional engagements,
behavioral engagements and cognitive engagements of the learners in a classroom. The given
figure shows the image captured from the video feed during the study.

In the study, the bounding boxes were formed around the individuals and then analyzed to find
their emotional status. Choose the option(s) which could be considered to make bounding
boxes.
Note: Faces are blurred here. Assume that, as a researcher, you have clear, non-anonymized
video data.

a. Face
b. Body posture
c. Hand gesture
d. Eye Pupils
e. Micro Facial features

Ans: a. Face,
b. Body posture,
c. Hand gesture
Assignment 12

Q1. Which curve in the figure shown below represents the best ROC (Receiver Operating Characteristic) curve?

A. I
B. II
C. III

Ans B. II
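
In general, the ROC curve that lies closest to the top-left corner, and so has the largest area under the curve (AUC), indicates the best classifier. The sketch below computes ROC points and the AUC in plain Python; the labels and scores are made-up values and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a group of tied scores.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: 1 = positive class, higher score = more confident.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = auc(roc_points(labels, scores))
```

An AUC of 1.0 corresponds to a perfect classifier and 0.5 to random guessing, so curves can be compared by this single number when they do not cross.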

Q2. Research ethics forms an important part of research. Which of the following is/are
false in light of research ethics?

A. Researchers cannot store the data beyond 5 years.


B. Students who have given consent can withdraw it afterwards.
C. Students can prevent others from participating in the study.
D. The consent form should inform the risk involved in the study.

Ans: C. Students can prevent others from participating in the study.


Q3. The following picture suggests which type of Data Analytics?

A. Descriptive
B. Diagnostic
C. Prescriptive
D. Predictive

Ans: C. Prescriptive

Q4. Which of the following sources were mentioned for obtaining data in the course:
a. DataShop
b. DataStop
c. Kaggle
d. Statsistify
Ans: a. DataShop,
c. Kaggle

Q5. Select the research areas from the following list that do NOT belong to the LA research
domain:
a. Modeling learners’ engagement
b. Privacy and ethics in Share market
c. Affective computing
d. Increasing the sales of e-courses
e. Resource allocation in academic institutions

Ans: b. Privacy and ethics in Share market,
d. Increasing the sales of e-courses,
e. Resource allocation in academic institutions

Q6. Given below are the names of some conferences and their abbreviations. Choose the ones
where the expansions (full forms) are correct.
a. LAK - Learning Analytical Knowledge
b. LAK - Learning Analytics and Knowledge
c. AIED- Artificial Intelligence in Education
d. ICCE - International Conference on Computers in Education
e. ICCE - International Conference for Computing Education
Ans: b. LAK - Learning Analytics and Knowledge
c. AIED- Artificial Intelligence in Education
d. ICCE - International Conference on Computers in Education

Q7. Which of the following features can be extracted from a MOOC platform?
a. Time spent on Discussion Forum.
b. Number of videos watched.
c. Their planning strategies.
d. Average session time.
e. Number of times a particular video is played and paused.

Ans: a. Time spent on Discussion Forum.


b. Number of videos watched.
d. Average session time.
e. Number of times a particular video is played and paused.
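
Several of these features can be derived directly from a clickstream log, which is a rough way to see why they count as extractable while planning strategies do not. The sketch below assumes a hypothetical log schema of `(timestamp_seconds, event_type, video_id)` tuples for one learner and a 30-minute inactivity cut-off between sessions; neither comes from the course.

```python
from collections import Counter

SESSION_GAP = 30 * 60  # seconds; assumed inactivity cut-off between sessions


def extract_features(events):
    """Derive simple features from one learner's (timestamp, type, video) events."""
    events = sorted(events, key=lambda e: e[0])
    videos_watched = {vid for _, kind, vid in events if kind == "play"}
    play_pause = Counter(
        (vid, kind) for _, kind, vid in events if kind in ("play", "pause")
    )
    # Split the timeline into sessions wherever the gap exceeds SESSION_GAP.
    durations = []
    start = prev = None
    for ts, _, _ in events:
        if start is None:
            start = prev = ts
        elif ts - prev > SESSION_GAP:
            durations.append(prev - start)
            start = prev = ts
        else:
            prev = ts
    if start is not None:
        durations.append(prev - start)
    average = sum(durations) / len(durations) if durations else 0.0
    return {
        "videos_watched": len(videos_watched),
        "play_pause_counts": dict(play_pause),
        "average_session_seconds": average,
    }


# Hypothetical log for a single learner.
log = [
    (0, "play", "v1"), (120, "pause", "v1"), (180, "play", "v1"),
    (600, "forum_view", None), (10000, "play", "v2"), (10300, "pause", "v2"),
]
features = extract_features(log)
```

Time spent on the discussion forum could be computed the same way if the log also recorded forum enter and leave events.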

Q8. Which of the following feature tools used in industry have been mentioned in the
course?
a. H2O.ai
b. DataRobot
c. DataBot
d. Transform.ai

Ans: a. H2O.ai
b. DataRobot

Q9. Select the correct option about the following statements:

Statement 1: One can use the Python and R programming languages to perform learning
analytics

Statement 2: Plotly is a learning analytics tool used for cleaning raw data

A. Statement 1 is correct and statement 2 is wrong


B. Statement 1 is wrong and statement 2 is correct
C. Both statements are correct
D. Both statements are wrong

Ans: A. Statement 1 is correct and statement 2 is wrong

Q10. Select the correct option about the following statements:

Statement 1: TensorFlow is an ML model-building tool

Statement 2: Featuretools is an open-source tool for feature engineering and storage

A. Statement 1 is correct and statement 2 is wrong


B. Statement 1 is wrong and statement 2 is correct
C. Both statements are correct
D. Both statements are wrong

Ans: C. Both statements are correct
