Hadoop Training #4: Programming With Hadoop

Learn how to get started writing programs against Hadoop's API. Check http://www.cloudera.com/hadoop-training-basic for training videos.

Uploaded by

Dmytro Shteflyuk

Programming with Hadoop

© 2009 Cloudera, Inc.

Overview
• How to use Hadoop
– Hadoop MapReduce
– Hadoop Streaming


Some MapReduce Terminology
• Job – A “full program” - an execution of a
Mapper and Reducer across a data set
• Task – An execution of a Mapper or a
Reducer on a slice of data
– a.k.a. Task-In-Progress (TIP)
• Task Attempt – A particular instance of an
attempt to execute a task on a machine


Terminology Example

• Running “Word Count” across 20 files is one job
• 20 files to be mapped imply 20 map tasks
+ some number of reduce tasks
• At least 20 map task attempts will be
performed… more if a machine crashes,
etc.


Task Attempts
• A particular task will be attempted at least once,
possibly more times if it crashes
– If the same input causes crashes over and over, that
input will eventually be abandoned
• Multiple attempts at one task may occur in
parallel with speculative execution turned on
– Task ID from TaskInProgress is not a unique
identifier; don’t use it that way


MapReduce: High Level


Nodes, Trackers, Tasks
• Master node runs JobTracker instance,
which accepts Job requests from clients

• TaskTracker instances run on slave nodes

• TaskTracker forks separate Java process
for task instances


Job Distribution
• MapReduce programs are contained in a Java
“jar” file + an XML file containing serialized
program configuration options
• Running a MapReduce job places these files
into the HDFS and notifies TaskTrackers where
to retrieve the relevant program code

• … Where’s the data distribution?


Data Distribution
• Implicit in design of MapReduce!
– All mappers are equivalent; so map whatever
data is local to a particular node in HDFS
• If lots of data does happen to pile up on
the same node, nearby nodes will map
instead
– Data transfer is handled implicitly by HDFS


Configuring With JobConf
• MR Programs have many configurable options
• JobConf objects hold (key, value) components
mapping String → 'a (any value type)
– e.g., “mapred.map.tasks” → 20
– JobConf is serialized and distributed before running
the job
• Objects implementing JobConfigurable can
retrieve elements from a JobConf
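A minimal sketch of the "serialized and distributed" step, written in Python rather than against Hadoop's Java API. It assumes the usual Hadoop configuration layout of `<configuration>` / `<property>` / `<name>` / `<value>` elements; the function names `conf_to_xml` and `xml_to_conf` are hypothetical and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET

def conf_to_xml(conf):
    """Serialize a dict of options to Hadoop-style configuration XML."""
    root = ET.Element("configuration")
    for name, value in sorted(conf.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        # Every value is stored as text, whatever its original type.
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_conf(xml_text):
    """Parse the XML back into a dict; values come back as strings."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.findall("property")}

conf = {"mapred.map.tasks": 20, "mapred.reduce.tasks": 4}
xml_text = conf_to_xml(conf)
restored = xml_to_conf(xml_text)
print(restored["mapred.map.tasks"])
```

Note that the round trip turns the integer 20 into the string "20", so the reading side has to convert each value back to the type it expects.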


What Happens In MapReduce?
Depth First


Job Launch Process: Client
• Client program creates a JobConf
– Identify classes implementing Mapper and
Reducer interfaces
• JobConf.setMapperClass(), setReducerClass()
– Specify inputs, outputs
• FileInputFormat.addInputPath(conf, path)
• FileOutputFormat.setOutputPath(conf, path)
– Optionally, other options too:
• JobConf.setNumReduceTasks(),
JobConf.setOutputFormat()…


Job Launch Process: JobClient
• Pass JobConf to JobClient.runJob() or
submitJob()
– runJob() blocks, submitJob() does not
• JobClient:
– Determines proper division of input into
InputSplits
– Sends job data to master JobTracker server


Job Launch Process: JobTracker
• JobTracker:
– Inserts jar and JobConf (serialized to XML) in
shared location
– Posts a JobInProgress to its run queue


Job Launch Process: TaskTracker
• TaskTrackers running on slave nodes
periodically query JobTracker for work
• Retrieve job-specific jar and config
• Launch task in separate instance of Java
– main() is provided by Hadoop


Job Launch Process: Task
• TaskTracker.Child.main():
– Sets up the child TaskInProgress attempt
– Reads XML configuration
– Connects back to necessary MapReduce
components via RPC
– Uses TaskRunner to launch user process


Job Launch Process: TaskRunner
• TaskRunner launches your Mapper
– Task knows ahead of time which InputSplits it
should be mapping
– Calls Mapper once for each record retrieved
from the InputSplit
• Running the Reducer is much the same
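The per-record calling pattern can be imitated locally. The following is a rough single-process sketch in Python, not Hadoop code: `run_local_job`, `word_count_map`, and `word_count_reduce` are hypothetical names, and the CRC-based partitioning is only a deterministic stand-in for whatever partitioner a real job uses.

```python
import zlib
from collections import defaultdict

def word_count_map(offset, line):
    """Map: called once per record; emits (word, 1) for each word."""
    for word in line.split():
        yield word, 1

def word_count_reduce(word, counts):
    """Reduce: called once per key with all of that key's values."""
    yield word, sum(counts)

def run_local_job(records, mapper, reducer, num_reducers=2):
    # One bucket per reduce task: key -> list of values.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            # Toy deterministic partitioner (an assumption, not Hadoop's).
            part = zlib.crc32(out_key.encode("utf-8")) % num_reducers
            buckets[part][out_key].append(out_value)
    output = []
    for bucket in buckets:
        # Within one reduce task, keys are handled in sorted order.
        for key in sorted(bucket):
            output.extend(reducer(key, bucket[key]))
    return output

records = [(0, "the quick brown fox"), (20, "the lazy dog"), (33, "the end")]
result = dict(run_local_job(records, word_count_map, word_count_reduce))
print(result["the"])
```

Each input record triggers exactly one mapper call, and each distinct key triggers exactly one reducer call, which mirrors the calling convention described above.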


Creating the Mapper
• You provide the instance of Mapper
– Should extend MapReduceBase
• One instance of your Mapper is initialized
per task
– Exists in separate process from all other
instances of Mapper – no data sharing!


Mapper
• void map(WritableComparable key,
Writable value,
OutputCollector output,
Reporter reporter)


What is Writable?
• Hadoop defines its own “box” classes for
strings (Text), integers (IntWritable), etc.
• All values are instances of Writable
• All keys are instances of
WritableComparable


Writing For Cache Coherency
while (more input exists) {
myIntermediate = new intermediate(input);
myIntermediate.process();
export outputs;
}


Writing For Cache Coherency
myIntermediate = new intermediate (junk);
while (more input exists) {
myIntermediate.setupState(input);
myIntermediate.process();
export outputs;
}


Writing For Cache Coherency
• Running the GC takes time
• Reusing locations allows better cache
usage (up to 2x performance benefit)
• All keys and values given to you by
Hadoop use this model (shared container
objects)


Getting Data To The Mapper


Reading Data
• Data sets are specified by InputFormats
– Defines input data (e.g., a directory)
– Identifies partitions of the data that form an
InputSplit
– Factory for RecordReader objects to extract
(k, v) records from the input source


FileInputFormat and Friends

• TextInputFormat – Treats each ‘\n’-terminated line of a file as a value
• KeyValueTextInputFormat – Maps ‘\n’-terminated text lines of “k SEP v”
• SequenceFileInputFormat – Binary file of
(k, v) pairs with some add’l metadata
• SequenceFileAsTextInputFormat – Same,
but maps (k.toString(), v.toString())

Filtering File Inputs
• FileInputFormat will read all files out of a
specified directory and send them to the
mapper
• Delegates filtering this file list to a method
subclasses may override
– e.g., Create your own “xyzFileInputFormat” to
read *.xyz from directory list

Record Readers
• Each InputFormat provides its own
RecordReader implementation
– Provides (unused?) multiplexing capability
• LineRecordReader – Reads a line from a
text file
• KeyValueLineRecordReader – Used by
KeyValueTextInputFormat
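To illustrate what a line-oriented record reader has to do when its chunk starts in the middle of a file, here is a simplified Python sketch. The function name `read_split_lines` is hypothetical, and the boundary rule (skip the first partial line, finish the last line even if it runs past the chunk) is a stated assumption about the general technique, not a copy of Hadoop's LineRecordReader.

```python
import io

def read_split_lines(stream, start, length):
    """Yield (byte offset, line) records for the split [start, start + length)."""
    end = start + length
    pos = start
    if start != 0:
        # Back up one byte and discard through the next newline, so a line
        # that begins exactly at `start` is kept and a partial one is skipped.
        stream.seek(start - 1)
        pos = start - 1 + len(stream.readline())
    else:
        stream.seek(0)
    while pos < end:
        line = stream.readline()
        if not line:
            break  # end of file
        # The last line of a split may extend past `end`; read it in full.
        yield pos, line.rstrip(b"\n").decode("utf-8")
        pos += len(line)

data = io.BytesIO(b"aaa\nbbb\nccc\n")
records = []
for start in (0, 5, 10):
    length = min(5, 12 - start)
    records.extend(read_split_lines(data, start, length))
print(records)
```

With this rule every line is produced by exactly one split, even though the split boundaries at bytes 5 and 10 fall in the middle of lines.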

Input Split Size
• FileInputFormat will divide large files into
chunks
– Exact size controlled by mapred.min.split.size
• RecordReaders receive file, offset, and
length of chunk
• Custom InputFormat implementations may
override split size – e.g., “NeverChunkFile”
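As a rough illustration of the (offset, length) chunks handed to record readers, the sketch below cuts a file of a given size into fixed-size pieces. It is a deliberate simplification: `compute_splits` is a hypothetical helper, and a real FileInputFormat weighs further factors, such as the file system block size, that are ignored here.

```python
def compute_splits(file_size, split_size):
    """Return (offset, length) chunks that cover file_size bytes."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    splits = []
    offset = 0
    while offset < file_size:
        # The final chunk may be shorter than split_size.
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits

splits = compute_splits(150, 64)
print(splits)

# A "never chunk" policy corresponds to one split spanning the whole file.
whole = compute_splits(150, 150)
print(whole)
```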

Sending Data To Reducers
• Map function receives OutputCollector
object
– OutputCollector.collect() takes (k, v) elements
• Any (WritableComparable, Writable) can
be used

Sending Data To The Client
• Reporter object sent to Mapper allows
simple asynchronous feedback
– incrCounter(Enum key, long amount)
– setStatus(String msg)
• Allows self-identification of input
– InputSplit getInputSplit()

Partition And Shuffle

Partitioner
• int getPartition(key, val, numPartitions)
– Outputs the partition number for a given key
– One partition == values sent to one Reduce
task
• HashPartitioner used by default
– Uses key.hashCode() to return partition num
• JobConf sets Partitioner implementation
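A small Python sketch of hash partitioning follows. It assumes the common scheme of masking off the sign bit of a 32-bit hash and taking it modulo the number of partitions. `java_string_hash` imitates Java's String.hashCode() for ASCII input and is only a stand-in: the hash a real key type such as Text computes may differ.

```python
INT_MAX = 0x7FFFFFFF

def java_string_hash(s):
    """32-bit hash in the style of Java's String.hashCode() (ASCII assumed)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap around like a Java int
    return h

def get_partition(key, num_partitions):
    # Clear the sign bit so the result is never negative, then take the modulus.
    return (java_string_hash(key) & INT_MAX) % num_partitions

keys = ["ab", "hadoop", "mapreduce", "ab"]
parts = [get_partition(k, 4) for k in keys]
print(parts)
```

Because the partition depends only on the key, both occurrences of "ab" land in the same partition and therefore reach the same reduce task.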

Reduction
• reduce( WritableComparable key,
Iterator values,
OutputCollector output,
Reporter reporter)
• Keys & values sent to one partition all go
to the same reduce task
• Calls are sorted by key – “earlier” keys are
reduced and output before “later” keys
• Remember – values.next() always returns
the same object, different data!
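The object-reuse warning can be demonstrated with a short Python analogy. `IntBox` and `reused_values` are hypothetical stand-ins for a mutable Writable and for the iterator passed to reduce; they are not Hadoop classes.

```python
class IntBox:
    """Mutable container, standing in for a Writable such as IntWritable."""
    def __init__(self, value=0):
        self.value = value

    def set(self, value):
        self.value = value

    def get(self):
        return self.value

def reused_values(raw_values):
    """Yield the SAME IntBox every time, refilled with the next value."""
    box = IntBox()
    for v in raw_values:
        box.set(v)
        yield box

# Wrong: keeps references to the single shared object.
kept_refs = [box for box in reused_values([1, 2, 3])]
wrong = [box.get() for box in kept_refs]

# Right: copy the data out before advancing the iterator.
right = [box.get() for box in reused_values([1, 2, 3])]

print(wrong, right)
```

Holding on to the returned objects leaves three references to one box that contains the last value, while copying the data out on each step preserves all three values.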
Finally: Writing The Output

OutputFormat
• Analogous to InputFormat
• TextOutputFormat – Writes “key\tval\n”
strings to output file
• SequenceFileOutputFormat – Uses a
binary format to pack (k, v) pairs
• NullOutputFormat – Discards output

Conclusions
• That’s the Hadoop flow!
• Lots of flexibility to override components,
customize inputs and outputs
• Using custom-built binary formats allows
high-speed data movement

Hadoop Streaming
Motivation
• You want to use a scripting language
– Faster development time
– Easier to read, debug
– Use existing libraries
• You (still) have lots of data

HadoopStreaming
• Interfaces Hadoop MapReduce with
arbitrary program code
• Uses stdin and stdout for data flow
• You define a separate program for each of
mapper, reducer

Data format
• Input (key, val) pairs sent in as lines of
input
key (tab) val (newline)
• Data naturally transmitted as text
• You emit lines of the same form on stdout
for output (key, val) pairs.
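The line format described above is enough to write a reducer as a plain script. The sketch below sums integer values per key. It assumes the lines reach the reducer sorted by key, so that equal keys are adjacent; `parse_line` and `sum_reducer` are hypothetical names.

```python
from itertools import groupby

def parse_line(line):
    """Split 'key<TAB>value<NEWLINE>' at the first tab."""
    key, _, value = line.rstrip("\n").partition("\t")
    return key, value

def sum_reducer(lines):
    """Sum integer values per key; input must arrive sorted by key."""
    pairs = (parse_line(line) for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        # Emit the result in the same key<TAB>value form.
        yield "%s\t%d" % (key, total)

sample = ["apple\t1\n", "apple\t2\n", "pear\t5\n"]
output = list(sum_reducer(sample))
print("\n".join(output))
```

In an actual streaming script the sample list would be replaced by `sys.stdin`, with each result line printed to stdout.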

Example: map (k, v) → (v, k)
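#!/usr/bin/env python
import sys

# Read "key<TAB>value" lines from stdin and emit "value<TAB>key".
while True:
    line = sys.stdin.readline()
    if len(line) == 0:
        break  # end of input
    # Split at the first tab only, so tabs inside the value survive.
    (k, v) = line.rstrip("\n").split("\t", 1)
    print(v + "\t" + k)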

Launching Streaming Jobs
• Special jar contains streaming “job”
• Arguments select mapper, reducer,
format…
• Can also specify Java classes
– Note: must be in Hadoop “internal” library

Reusing programs
• Identity mapper/reducer: cat
• Summing: wc
• Field selection: cut
• Filtering: awk

Streaming Conclusions
• Fast, simple, powerful
• Low-overhead way to get started with
Hadoop
• Resources:
– http://wiki.apache.org/hadoop/HadoopStreaming
– http://hadoop.apache.org/core/docs/current/streaming.html
