I have the following CSV file with 5 columns and many rows. I’m showing only the first 6 rows here.
Date,Food,Vitamin,Protein,NumStudents
01/01/17, Pasta, A, Yes, 560
01/01/17, Pizza, A, Yes, 730
01/01/17, Burrito, C, Yes, 240
02/01/17, Pizza, A, Yes, 340
02/01/17, Pasta, B, Yes, 450
02/01/17, Beef, B, Yes, 450
Now I want to find, for each day, the sum of NumStudents counting only the rows where Food is Pizza or Pasta.
In essence, for 01/01/17 I only have to sum NumStudents for Pizza and Pasta, but not Burrito. The output I expect is:
01/01/17 1290
02/01/17 790
Output I’m getting:
01/01/17 1530
02/01/17 1240
In my code I’m able to sum NumStudents for all the types of food on a given day, but I don’t know how to selectively exclude some types of food from my composite key in the mapper. Any idea how I should go about it?
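To make the goal concrete, here is a minimal sketch of one possible approach: drop the unwanted rows in the map step and key the rest by date alone, so the reduce step only ever sees Pizza and Pasta. This is plain Java that imitates the map and reduce steps without any MapReduce framework. The class name FoodFilterSketch, the ALLOWED set, and the helper methods are all hypothetical names made up for illustration, and the sketch assumes every data row has exactly the 5 comma-separated fields shown above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class FoodFilterSketch {
    // Hypothetical whitelist: only these foods contribute to the per-day sum.
    static final Set<String> ALLOWED = new HashSet<>(Arrays.asList("Pizza", "Pasta"));

    // Stand-in for the mapper: returns {date, numStudents} for a wanted row,
    // or null when the row is skipped (header, malformed line, or other food).
    // In a real mapper, "return null" would correspond to emitting nothing.
    static String[] map(String line) {
        String[] fields = line.split(",");
        if (fields.length != 5 || fields[0].trim().equals("Date")) {
            return null;
        }
        if (!ALLOWED.contains(fields[1].trim())) {
            return null;
        }
        // The key is the date alone; Food is used only for filtering.
        return new String[] { fields[0].trim(), fields[4].trim() };
    }

    // Stand-in for shuffle plus reducer: sums the emitted values per date key.
    static Map<String, Integer> sumByDate(List<String> lines) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            String[] kv = map(line);
            if (kv != null) {
                totals.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "Date,Food,Vitamin,Protein,NumStudents",
            "01/01/17, Pasta, A, Yes, 560",
            "01/01/17, Pizza, A, Yes, 730",
            "01/01/17, Burrito, C, Yes, 240",
            "02/01/17, Pizza, A, Yes, 340",
            "02/01/17, Pasta, B, Yes, 450",
            "02/01/17, Beef, B, Yes, 450");
        // Prints one "date total" line per day.
        sumByDate(rows).forEach((date, total) -> System.out.println(date + " " + total));
    }
}
```

On the six sample rows this gives 1290 for 01/01/17 and 790 for 02/01/17. The design choice is to filter as early as possible: rows that are never emitted never reach the sum, so the summing logic itself does not need to change.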