Compre FoDS
1.a. Using the definition of conditional entropy and the product rule of probability, prove that
H[x, y] = H[y|x] + H[x]. Prove the result for both cases, i.e., when x, y are continuous and when they are
discrete random variables. [8 Marks]
1.b. Prove Bayes' rule for conditional entropy, i.e., H(y|x) = H(x|y) - H(x) + H(y). [6 Marks]
1.c. Consider two binary variables x and y with the joint distribution given in the following table.
Evaluate the following quantities: (a) H[y|x] (b) H[x|y] (c) H[x, y]. [3 Marks]
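As a sanity check for question 1, the identities in 1.a and 1.b can be verified numerically. The joint table below is a made-up stand-in, since the actual table from 1.c is not reproduced here; a minimal Python sketch:

```python
import numpy as np

# Hypothetical joint distribution p(x, y) over two binary variables.
# Rows index x, columns index y; this grid is illustrative only.
p_xy = np.array([[1/3, 1/3],
                 [0.0, 1/3]])

def H(p):
    """Entropy in bits, ignoring zero-probability cells."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def cond_H(p_joint):
    """H[y|x] = -sum_{x,y} p(x,y) log2 p(y|x), straight from the definition."""
    p_x = p_joint.sum(axis=1, keepdims=True)
    ratio = np.divide(p_joint, p_x, out=np.zeros_like(p_joint), where=p_x > 0)
    mask = p_joint > 0
    return -np.sum(p_joint[mask] * np.log2(ratio[mask]))

H_x, H_y = H(p_xy.sum(axis=1)), H(p_xy.sum(axis=0))
H_y_given_x = cond_H(p_xy)      # H[y|x]
H_x_given_y = cond_H(p_xy.T)    # H[x|y], roles of x and y swapped

assert np.isclose(H(p_xy), H_y_given_x + H_x)            # 1.a: chain rule
assert np.isclose(H_y_given_x, H_x_given_y - H_x + H_y)  # 1.b: Bayes' rule
print(H_y_given_x, H_x_given_y, H(p_xy))
```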
2.a. Prove that the linear regression problem, solved by minimizing the sum of squared errors, always
has a unique optimal solution. You may assume that there is only one independent variable
(feature). [8 Marks]
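For intuition on 2.a: with one feature, the sum-of-squares loss is a convex quadratic in the intercept and slope, so the normal equations pin down a single optimum whenever the inputs are not all identical. A minimal sketch on made-up data:

```python
import numpy as np

# Made-up one-feature regression data.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]
w = np.linalg.solve(X.T @ X, X.T @ y)      # normal equations: (X^T X) w = X^T y
print("intercept, slope:", w)

# X^T X is positive definite because the columns of X are linearly
# independent, which is the algebraic reason the minimiser is unique.
```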
2.b. Write down the loss functions that are optimized in lasso and ridge regression. Illustrate a few
advantages of lasso regression over ridge regression. [8 Marks]
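For reference on 2.b: ridge penalizes the sum-of-squares loss with λ||w||_2^2, while lasso uses λ||w||_1; the L1 penalty drives irrelevant coefficients exactly to zero, which is the usual advantage cited. A short illustration with scikit-learn on synthetic data:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two features matter; the remaining eight are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=100)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1-penalized least squares
ridge = Ridge(alpha=0.1).fit(X, y)  # L2-penalized least squares

print("lasso:", np.round(lasso.coef_, 3))  # irrelevant weights -> exactly 0
print("ridge:", np.round(ridge.coef_, 3))  # irrelevant weights -> small, nonzero
```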
2.c. The uniform distribution for a continuous variable x is defined by U(x|a, b) = 1/(b – a) for a ≤ x ≤ b.
Verify that this distribution is normalized, and find expressions for its mean and variance. [3 Marks]
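For self-checking 2.c, the standard results a correct derivation should reach are:

```latex
\int_a^b \frac{1}{b-a}\,dx = 1, \qquad
\mathbb{E}[x] = \int_a^b \frac{x}{b-a}\,dx = \frac{a+b}{2}, \qquad
\operatorname{var}[x] = \mathbb{E}[x^2]-\mathbb{E}[x]^2
 = \frac{a^2+ab+b^2}{3}-\frac{(a+b)^2}{4} = \frac{(b-a)^2}{12}.
```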
3.a. Principal component analysis, or PCA, is a technique that is widely used for applications such as
dimensionality reduction, lossy data compression, feature extraction, and data visualization. Derive the k
principal components of n features by posing the problem as a maximum-variance formulation.
Prove all steps required in this derivation. [8 Marks]
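A numerical companion to 3.a (not a substitute for the proof): the maximum-variance directions are the top eigenvectors of the sample covariance matrix. A minimal sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated features

Xc = X - X.mean(axis=0)               # centre the data
C = (Xc.T @ Xc) / (len(Xc) - 1)       # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)  # eigh returns ascending eigenvalues

k = 2
W = eigvecs[:, ::-1][:, :k]           # top-k eigenvectors, one per column
Z = Xc @ W                            # k-dimensional projection
print("fraction of variance kept:", eigvals[::-1][:k].sum() / eigvals.sum())
```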
3.b. Given some data in R^3 with the corresponding 3 × 3 covariance matrix C, whose orthonormal
eigenvectors are c1, c2, c3 with eigenvalues η1 = 3, η2 = 1, and η3 = 0.2. [8 Marks]
1. Define a matrix A ∈ R^(2×3) that maps the data into a two-dimensional space while preserving as much
variance as possible.
2. Define a matrix B ∈ R^(3×2) that maps the reduced data back into R^3 with minimal reconstruction error.
How large is the reconstruction error?
3. Prove that AB is an identity matrix. Why would one expect that intuitively?
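A numerical sketch for 3.b (an illustration, not a proof): for concreteness the orthonormal eigenvectors are taken to be the standard basis, which does not affect the shapes or the AB product:

```python
import numpy as np

c1, c2, c3 = np.eye(3)         # stand-in orthonormal eigenvectors of C
eta = np.array([3.0, 1.0, 0.2])

A = np.vstack([c1, c2])        # 2x3: project onto the two largest-variance directions
B = A.T                        # 3x2: embed the reduced data back into R^3

print(A @ B)                   # 2x2 identity, since the rows of A are orthonormal
print(B @ A)                   # rank-2 projector onto span{c1, c2}

# The expected squared reconstruction error is the discarded variance, eta3 = 0.2.
```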
4.a. What are the Markov, Chebyshev, and Chernoff bounds for a random variable X? Illustrate each
bound with a suitable example. [6 Marks]
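As an illustration for 4.a, the three bounds can be compared with the exact tail of X ~ Exp(1), whose mean, variance, and moment generating function are all known in closed form. A minimal sketch:

```python
import numpy as np

# X ~ Exp(1): E[X] = 1, var[X] = 1, E[e^{tX}] = 1/(1 - t) for t < 1.
a = 5.0
true_tail = np.exp(-a)                         # exact P(X >= a)

markov = 1.0 / a                               # P(X >= a) <= E[X]/a, valid since X >= 0
chebyshev = 1.0 / (a - 1.0) ** 2               # P(|X - 1| >= a - 1) <= var/(a - 1)^2

ts = np.linspace(0.01, 0.99, 999)              # optimise the Chernoff bound over t
chernoff = np.min(np.exp(-ts * a) / (1 - ts))  # min_t e^{-ta} E[e^{tX}]

print(f"true={true_tail:.4f}  markov={markov:.4f}  "
      f"chebyshev={chebyshev:.4f}  chernoff={chernoff:.4f}")
```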
4.b. Find the optimal solution to the following constrained optimization problem using the Lagrangian
function. [4 Marks]
maximize 1 - x^2 - y^2
subject to the constraint x + y - 1 = 0
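A symbolic check of 4.b, assuming SymPy is available: setting all partial derivatives of the Lagrangian L = (1 - x^2 - y^2) + λ(x + y - 1) to zero recovers the optimum:

```python
import sympy as sp

x, y, lam = sp.symbols("x y lam", real=True)
L = 1 - x**2 - y**2 + lam * (x + y - 1)

# Stationarity in x, y and the multiplier (which re-imposes the constraint).
sols = sp.solve([sp.diff(L, v) for v in (x, y, lam)], (x, y, lam), dict=True)
print(sols)  # x = y = 1/2, lam = 1, so the constrained maximum is 1/2
```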
4.c. Write down the Lagrangian function with appropriate constraints for the following optimization
problem: maximize f(x) subject to g_j(x) = 0 for j = 1, 2, …, J and h_k(x) ≥ 0 for k = 1, 2, …, K. [4 Marks]
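For self-checking 4.c, one standard convention (sign conventions vary across textbooks) is:

```latex
L\bigl(\mathbf{x}, \{\lambda_j\}, \{\mu_k\}\bigr)
 = f(\mathbf{x})
 + \sum_{j=1}^{J} \lambda_j\, g_j(\mathbf{x})
 + \sum_{k=1}^{K} \mu_k\, h_k(\mathbf{x}),
\qquad \mu_k \ge 0, \quad \mu_k\, h_k(\mathbf{x}) = 0.
```

Here the λ_j are unconstrained in sign, while the μ_k carry the nonnegativity and complementary-slackness conditions for the inequality constraints.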
5. Assume the following data is given: {22, 12, 61, 57, 30, 1, 32, 37, 37, 68, 42, 11, 25, 7, 8, 16}.
a) Apply data discretization by binning the data into 4 bins using equal-depth and equal-width binning,
respectively.
b) Describe the differences between the two binning methods. For each method, give an example
application for which that method is the most appropriate.
c) If you know that the data actually represent ages of persons, what kind of binning method would you
then use? [2+2+2 Marks]
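A minimal sketch of the two schemes in 5.a on the given data, using NumPy for convenience:

```python
import numpy as np

data = np.array([22, 12, 61, 57, 30, 1, 32, 37, 37, 68, 42, 11, 25, 7, 8, 16])

# Equal-depth (equal-frequency): sort, then split into 4 bins of 4 values each.
depth_bins = np.split(np.sort(data), 4)
print("equal-depth:", [b.tolist() for b in depth_bins])

# Equal-width: 4 intervals of width (68 - 1) / 4 = 16.75 covering [1, 68].
edges = np.linspace(data.min(), data.max(), 5)
idx = np.clip(np.digitize(data, edges) - 1, 0, 3)  # bin index 0..3 per value
width_bins = [np.sort(data[idx == i]).tolist() for i in range(4)]
print("equal-width:", width_bins)
```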
6.a. What is the difference between supervised and unsupervised discretization?
6.b. What is the difference between dimensionality reduction and feature selection? [3+3 Marks]
7.a. What is the data type of each of the following attributes? [3 Marks]
i. Age
ii. Salary
iii. ZIP code
iv. State of residence
v. Height
vi. Weight
7.b. Classify each of the following as OLTP, OLAP, or Big Data. Justify your answers. [3 Marks]
i. Weekly sales of ice cream in Amul BITS campus.
ii. Customer profiles for fraud detection
iii. Adding a book to a shopping cart
8. You are given as input a list of housing records, where each record contains information about a
single house: (address, city, state, zip, value). The output should be the average house value in each zip
code. Draw the pipeline showing how this problem can be solved using map-reduce.
Note: Just show how the input is mapped into (key, value) pairs by the map stage; specify what the key
is and what the associated value is in each pair. [6 Marks]
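A single-process sketch of the requested pipeline (the record values below are made up; a real deployment would run the same two functions under a framework such as Hadoop or Spark):

```python
from collections import defaultdict

# Toy input: (address, city, state, zip, value) records.
records = [
    ("12 A St", "Pilani", "RJ", "333031", 50.0),
    ("34 B St", "Pilani", "RJ", "333031", 70.0),
    ("56 C St", "Goa",    "GA", "403726", 90.0),
]

def map_phase(record):
    address, city, state, zip_code, value = record
    yield (zip_code, value)            # key = zip code, value = house value

def reduce_phase(zip_code, values):
    return (zip_code, sum(values) / len(values))  # average value per zip

# Shuffle stage: group mapped pairs by key, as the framework would.
groups = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        groups[key].append(value)

print([reduce_phase(k, vals) for k, vals in groups.items()])
# -> [('333031', 60.0), ('403726', 90.0)]
```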