BDA Lab Manual-2
file name.

We have three sample data files:

employee_info_1.csv
employee_info_2.csv
employee_info_3.csv

We will create an inverted index like the one below, so that searching for employee details by first name is faster. Each first name maps to the files in which it appears.

AARON    employee_info_1.csv employee_info_2.csv
ABADJR   employee_info_1.csv
ABARCA   employee_info_1.csv

Sample input data:

First Name,Last Name,Job Titles,Department,Full or Part-Time,Salary or Hourly,Typical Hours,Annual Salary,Hourly Rate
dubert,tomasz ,paramedic i/c,fire,f,salary,,$1080.00,
edwards,tim p,lieutenant,fire,f,salary,,114846.00,
elkins,eric ,sergeant,police,f,salary,,104628.00,
estrada,luis f,police officer,police,f,salary,,96060.00,
ewing,marie a,clerk iii,police,f,salary,,53076.00,
finn,sean p,firefighter,fire,f,salary,,87006.00,
fitch,jordan m,law clerk,law,f,hourly,35,,14.51

Mapper Code

In the mapper class, we split the input data using a comma as the delimiter and then check for invalid data in the if condition so that it is ignored. The first name of the employee is stored at index 0, so we fetch it from there. We also need the file name to store as the value against the first name, so we fetch the name of the file being processed in the mapper using ((FileSplit) context.getInputSplit()).getPath().getName() and add it to the value.

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class InvertedIndexNameMapper extends Mapper