Gujarat Technological University
Syllabus for Master of Computer Applications, 4th Semester
With effect from academic year 2018-19
Subject Name: Big Data Tools (BDT)
Subject Code: 4649306
1. Learning Objectives:
To understand basics of Big Data
To understand various Big Data Tools
3. Contents:
4. Text Book(s):
1) Seema Acharya, Subhashini Chellappan, “Big Data and Analytics”, Wiley India Pvt. Ltd., 2015
2) Matei Zaharia, Patrick Wendell, Andy Konwinski, Holden Karau, “Learning Spark”, O'Reilly Media, 2015
3) Zachary Radtka and Donald Miner, “Hadoop with Python”, O'Reilly Media, 2016
(A free ebook is available at the following link, as on 12-10-2018)
https://www.oreilly.com/programming/free/hadoop-with-python.csp
5. Reference Books:
Web Resources:
a) http://www.bigdatauniversity.com
b) http://www.mongodb.com
c) http://hadoop.apache.org/
Unit   Text Book   Chapters
II     1           Chapters 4, 5
III    1, 3        Chapter 6 (Book 1), Chapter 2 (Book 3)
IV     1           Chapters 9, 10
V      2           Chapters 1, 2 and 3 (for Chapters 2 and 3, Python only; no Java, no Scala)
7. Accomplishment:
Students will understand the fundamentals of Big Data and its tools and techniques.
Practical List
Part I: MongoDB
MongoDB Shell Commands / Queries: View all databases, Create new database, Drop
existing database, View current database, Switch over to a given database, db.help(), Display
statistics of a given database, Display current version of MongoDB Server, Display list of
collections in current database, Create Collection, Drop Collection, CRUD operations
(Create, Read, Update, Delete), Insert, Update else insert, Save, Update, Remove, Find,
Dealing with NULL Values, Count, Limit, Sort, Skip, Arrays and Array Operations,
Aggregate
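The semantics of Find, Count, Sort, Skip, Limit, NULL handling and "Update else insert" can be rehearsed before touching a server. The sketch below is plain Python over a list of dictionaries; it only imitates the behaviour and is neither the MongoDB shell nor a MongoDB driver. The helper names find and upsert and the sample documents are hypothetical.

```python
# Plain-Python imitation of MongoDB-style query semantics.
# Assumption: helper names and sample data are made up for illustration.

def find(collection, query=None, sort_key=None, skip=0, limit=None):
    """Return documents whose fields equal every key/value in `query`."""
    query = query or {}
    matched = [doc for doc in collection
               if all(doc.get(k) == v for k, v in query.items())]
    if sort_key is not None:
        matched.sort(key=lambda doc: doc[sort_key])
    end = None if limit is None else skip + limit
    return matched[skip:end]


def upsert(collection, query, changes):
    """Update the first matching document, else insert a new one."""
    for doc in collection:
        if all(doc.get(k) == v for k, v in query.items()):
            doc.update(changes)
            return "updated"
    collection.append({**query, **changes})
    return "inserted"


# Hypothetical sample documents.
students = [
    {"rno": 1, "name": "Asha", "city": "Surat"},
    {"rno": 2, "name": "Ravi", "city": None},   # explicit NULL value
    {"rno": 3, "name": "Meena", "city": "Rajkot"},
]

null_city = find(students, {"city": None})               # NULL handling
page = find(students, sort_key="name", skip=1, limit=1)  # sort, skip, limit
first = upsert(students, {"rno": 2}, {"city": "Vadodara"})  # match exists
second = upsert(students, {"rno": 4}, {"name": "Kiran"})    # no match
total = len(find(students))                              # count
```

Matching on None here also matches documents that lack the field, which mirrors how a null query behaves in MongoDB.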
1) Write a Pig script to load and store “Student data”. (The student file contains Roll no,
Name, Marks and GPA.)
a) Filter all the students who have GPA > 5.
b) Display the names of all students in uppercase.
c) Group tuples of students based on their GPA.
d) Remove duplicate tuples from the student list.
e) Display the first three tuples from the “student” relation.
f) Display the names of students in ascending order.
g) Join two relations, namely Student and Department (Rno, DeptNo, DeptName), based
on the values contained in the roll no column.
h) Merge the contents of the two relations Student and Department.
i) Partition a relation based on the GPAs acquired by students.
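Before writing the Pig script, it can help to see what each step of exercise 1 is expected to produce. The sketch below is plain Python, not Pig Latin; the comments name the Pig operator each step corresponds to. The sample rows and the GPA threshold used in step (i) are assumptions made only for illustration.

```python
# Plain-Python sketch of the relational operations in exercise 1.
# Assumption: all sample rows are hypothetical.
from collections import defaultdict

# (rollno, name, marks, gpa)
student = [
    (1, "ravi", 72, 7.2),
    (2, "asha", 48, 4.8),
    (3, "meena", 91, 9.1),
    (3, "meena", 91, 9.1),   # duplicate tuple
    (4, "kiran", 72, 7.2),
]
# (rno, deptno, deptname)
department = [(1, 10, "MCA"), (3, 20, "MBA")]

# a) FILTER: students with GPA > 5
high_gpa = [s for s in student if s[3] > 5]

# b) UPPER: names in uppercase
upper_names = [s[1].upper() for s in student]

# c) GROUP: tuples grouped by GPA
by_gpa = defaultdict(list)
for s in student:
    by_gpa[s[3]].append(s)

# d) DISTINCT: drop duplicate tuples (first-seen order kept here)
distinct = list(dict.fromkeys(student))

# e) LIMIT 3: first three tuples
first_three = student[:3]

# f) ORDER BY name ascending
ordered_names = sorted(s[1] for s in distinct)

# g) JOIN on roll number (inner join)
joined = [s + d for s in distinct for d in department if s[0] == d[0]]

# h) UNION: contents of both relations together
merged = student + department

# i) SPLIT on GPA (the cut-off of 7 is an arbitrary illustration)
top = [s for s in distinct if s[3] >= 7]
rest = [s for s in distinct if s[3] < 7]
```

Unlike the lists above, Pig relations are unordered bags, so the output order of steps such as DISTINCT, LIMIT and UNION is not guaranteed in Pig.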
2) Load the file menu.csv (Category, Name, Price) and write one Pig script
2. Write a Pig Latin statement to display the names of all users who have sent emails
and also a list of the people they have sent emails to.
3. Store the result in a file.
Part V: Hive
Install and configure Apache Hive
SerDe and User Defined Function Creation in Hive using Java
Create database, display list of existing databases, describe database, describe extended
database, alter database properties, make a given database the current database, drop
database, create managed table, create external table, load data into a table, work with
collection data types, query a table using select, query collection data types, create a
static partition and load data into it from the original table, create a static partition using alter,
create a dynamic partition, load data into a dynamic partition, create bucket, create view, query
view, drop view, sub-query, joins, Aggregation, Group By and Having, RC File
Implementation
2. Create a partitioned table for the Customer schema to reward customers based on their
lifetime value.
Customer Id   Customers   Lifetime value
1001          Jack        25000
1002          Smith       8000
1003          David       12000
1004          John        15000
1005          Scott       12000
1006          Lucy        28000
1007          Ajay        12000
1008          Vinay       30000
1009          Joseph      21000
1010          Joshi       25000
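One way to reason about exercise 2 is to decide on a partition column first and check which rows land in which partition. The sketch below is plain Python, not HiveQL, and uses the customer rows from the table above. The tier names (gold, silver, bronze) and the thresholds are assumptions; the exercise does not specify them.

```python
# Sketch of partitioning rows by a derived reward tier.
# Assumption: tier names and thresholds are hypothetical.
from collections import defaultdict

# (customer id, name, lifetime value), taken from the exercise table
customers = [
    (1001, "Jack", 25000), (1002, "Smith", 8000), (1003, "David", 12000),
    (1004, "John", 15000), (1005, "Scott", 12000), (1006, "Lucy", 28000),
    (1007, "Ajay", 12000), (1008, "Vinay", 30000), (1009, "Joseph", 21000),
    (1010, "Joshi", 25000),
]


def reward_tier(lifetime_value):
    """Map a lifetime value to a reward tier (the partition key)."""
    if lifetime_value >= 25000:
        return "gold"
    if lifetime_value >= 12000:
        return "silver"
    return "bronze"


# Each key plays the role of one partition of the table.
partitions = defaultdict(list)
for cust_id, name, value in customers:
    partitions[reward_tier(value)].append((cust_id, name, value))

for tier in sorted(partitions):
    print(tier, [name for _, name, _ in partitions[tier]])
```

In Hive, each such tier would correspond to one value of the column named in the table's PARTITIONED BY clause, and its rows would be stored together under that partition.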
Note: Some of the practicals from the above practical list may have seemingly similar
definitions. For better learning and good practice, it is advised that students do the maximum
number of practicals. In the practical examination, the definition asked need not have the
same wording as given in the practical list. However, the definitions asked in the exams will
be similar to the ones given in the practical list.