CTBD Sol02

The document discusses a MapReduce solution for counting word lengths in categories. It includes code for a mapper that tokenizes words and assigns a category, and a reducer to sum the counts for each category. It also explains code to generate an inverted index by mapping words to their files and reducing to concatenate the files for each word.

Uploaded by

pthuynh709

Exercise 2: Hadoop MapReduce

Concepts and Technologies for Distributed Systems and Big Data Processing – SS 2017

Solution 2 Implementation

You can download the code for the solution for this task from the course website.

Solution 3 Completion

Complete the following code for WordLength, which should count how many words belong to each of the following four length categories:
tiny: 1 letter
small: 2–4 letters
medium: 5–9 letters
big: 10 or more letters

// Imports needed at the top of the enclosing source file:
//   java.io.IOException, java.util.StringTokenizer,
//   org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text,
//   org.apache.hadoop.mapreduce.Mapper, org.apache.hadoop.mapreduce.Reducer

public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text category = new Text();

    @Override
    protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Split the input line on punctuation and whitespace.
        StringTokenizer tokenizer = new StringTokenizer(value.toString(), ",;\\. \t\n\r\f");
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken();

            // Assign the word to a length category; anything longer than 9 letters is "big".
            int length = word.length();
            String c = ((length == 1) ? "tiny" :
                        (length >= 2 && length <= 4) ? "small" :
                        (length >= 5 && length <= 9) ? "medium" : "big");
            category.set(c);
            // Emit (category, 1) for every word.
            context.write(category, one);
        }
    }
}

public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum all the ones emitted for this category.
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
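The categorization and summing logic can be checked without a Hadoop cluster. The following is a minimal sketch in plain Java, not part of the course solution: the class name WordLengthSketch, its methods, and the sample sentence are made up for illustration. It reuses the mapper's delimiter set and conditions, and a TreeMap stands in for the reducer's per-category sum.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical helper class, used only to illustrate the logic locally.
class WordLengthSketch {
    // Same delimiter set as the mapper: comma, semicolon, backslash, dot, whitespace.
    static final String DELIMITERS = ",;\\. \t\n\r\f";

    // Same conditions as the mapper's nested conditional expression.
    static String categorize(String word) {
        int length = word.length();
        if (length == 1) return "tiny";
        if (length >= 2 && length <= 4) return "small";
        if (length >= 5 && length <= 9) return "medium";
        return "big";
    }

    // Plays the role of map + reduce: one count per token, summed per category.
    static Map<String, Integer> countCategories(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer tokenizer = new StringTokenizer(text, DELIMITERS);
        while (tokenizer.hasMoreTokens()) {
            counts.merge(categorize(tokenizer.nextToken()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {big=1, medium=2, small=3, tiny=1}
        System.out.println(countCategories("A distributed system is big, really big."));
    }
}
```

Note that a 10-letter word such as "MapReduces" lands in big, because the last branch catches every length not covered by the first three.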

Solution 4 Comprehension

Understand and explain what the following code does. What is the output of the program for the following input?
file1.txt: Hello World Bye World
file2.txt: Hello Hadoop Goodbye Hadoop

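// Imports needed at the top of the enclosing source file:
//   java.io.IOException, java.util.StringTokenizer,
//   org.apache.hadoop.io.LongWritable, org.apache.hadoop.io.Text,
//   org.apache.hadoop.mapreduce.Mapper, org.apache.hadoop.mapreduce.Reducer,
//   org.apache.hadoop.mapreduce.lib.input.FileSplit

public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, Text> {

    private Text word = new Text();
    private Text file = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Determine the name of the file this input split belongs to.
        FileSplit fileSplit = (FileSplit) context.getInputSplit();
        String fileName = fileSplit.getPath().getName();
        file.set(fileName);

        // Emit (word, fileName) for every whitespace-separated token.
        StringTokenizer tokenizer = new StringTokenizer(value.toString());
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, file);
        }
    }
}

public static class InvertedReducer extends Reducer<Text, Text, Text, Text> {
    private Text result = new Text();

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Concatenate all file names received for this word, separated by commas.
        StringBuilder sb = new StringBuilder();
        for (Text val : values) {
            sb.append(val);

            // Append a separator only if another value follows. This relies on
            // Hadoop handing back the same underlying iterator from iterator().
            if (values.iterator().hasNext()) {
                sb.append(",");
            }
        }
        result.set(sb.toString());
        context.write(key, result);
    }
}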

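The code computes the inverted index for the given documents, i.e., a list of references to documents for each word. Each reference is the file name returned by fileSplit.getPath().getName(), and a file is listed once per occurrence of the word in it, so duplicates are not removed. The program produces the following output (keys are sorted; the order of the file names within a line depends on the order in which the values reach the reducer and is not guaranteed):

Bye file1.txt
Goodbye file2.txt
Hadoop file2.txt,file2.txt
Hello file2.txt,file1.txt
World file1.txt,file1.txt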
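The result can be reproduced locally with a small in-memory simulation. This is only a sketch under stated assumptions, not the Hadoop program itself: the class InvertedIndexSketch and its methods are hypothetical, the inputs are assumed to be named file1.txt and file2.txt, a TreeMap stands in for the shuffle phase (grouping values by key in sorted key order), and the values arrive in file order, which a real Hadoop run does not guarantee.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-in for the mapper, shuffle, and reducer above.
class InvertedIndexSketch {
    // Reducer logic: join the values with commas, no trailing separator.
    static String join(Iterable<String> values) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = values.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(",");
            }
        }
        return sb.toString();
    }

    // files maps a file name to its content.
    static Map<String, String> buildIndex(Map<String, String> files) {
        // Mapper + shuffle: collect one file name per token, grouped by word.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            StringTokenizer tokenizer = new StringTokenizer(file.getValue());
            while (tokenizer.hasMoreTokens()) {
                grouped.computeIfAbsent(tokenizer.nextToken(), k -> new ArrayList<>())
                       .add(file.getKey());
            }
        }
        // Reducer: one output line per word.
        Map<String, String> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            index.put(entry.getKey(), join(entry.getValue()));
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("file1.txt", "Hello World Bye World");
        files.put("file2.txt", "Hello Hadoop Goodbye Hadoop");
        buildIndex(files).forEach((word, refs) -> System.out.println(word + " " + refs));
    }
}
```

The join method keeps a single explicit Iterator instead of calling values.iterator() inside the loop. With an ordinary Java collection, each iterator() call returns a fresh iterator, so the reducer's check would always succeed on a non-empty list and leave a trailing comma.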
