Apache Pig GROUP Operator

by Online Tutorials Library July 14, 2022

The Apache Pig GROUP operator groups the data in one or more relations by collecting the tuples that share the same group key. If the group key has more than one field, the key is treated as a tuple; otherwise it has the same type as the key field. The result is a relation that contains one tuple per group.
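The difference between a single-field key and a multi-field key can be sketched as follows. This is a minimal illustration, not part of the original example: the input path, the alias emp, and the field names are all hypothetical. DESCRIBE is used only to print the schema of each grouped relation.

```pig
-- Hypothetical input: the path, alias, and field names are assumptions for illustration.
emp = LOAD '/tmp/emp.txt' USING PigStorage(',')
      AS (name:chararray, dept:chararray, city:chararray);

-- Single-field key: the "group" field has the same type as dept (chararray).
by_dept = GROUP emp BY dept;
DESCRIBE by_dept;

-- Two-field key: the "group" field is a tuple made of (dept, city).
by_dept_city = GROUP emp BY (dept, city);
DESCRIBE by_dept_city;
```

In both cases each result tuple has two fields: the first is named group and holds the key, and the second is a bag named after the input relation (emp here) that holds the matching tuples.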

Example of Group Operator

In this example, we group the given data on the basis of the last name.

Steps to execute Group Operator

Create a text file named piginput2.txt on your local machine and add comma-separated records to it, one per line, each with a first name, a last name, and an ID.

Check the text written in the piginput2.txt file.

Upload the piginput2.txt file to the /pigexample directory on HDFS.

Start the Pig Grunt shell in MapReduce run mode (for example, with pig -x mapreduce).

Load the data into a relation (a bag of tuples) named A.

  grunt> A = LOAD '/pigexample/piginput2.txt' USING PigStorage(',') AS (fname:chararray, l_name:chararray, id:int);

Now execute and verify the data.
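One way to do this, assuming the data was loaded under the alias A as in the LOAD statement above, is the DUMP operator, which runs the pending statements and prints the tuples of a relation to the console:

```pig
grunt> DUMP A;
```

Each printed tuple should correspond to one line of piginput2.txt, in the form (fname,l_name,id).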

Let us group the data on the basis of l_name.
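The original statement is not reproduced here, so the following is a sketch of what such a grouping could look like. The alias B is an assumed name chosen for illustration; A and l_name come from the LOAD statement above.

```pig
grunt> B = GROUP A BY l_name;
```

Because the key is a single chararray field, the group field of B is a chararray, and each tuple of B also carries a bag named A with the records that share that last name.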

Now, execute and verify the data.
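Assuming the grouped relation was assigned to an alias such as B (a hypothetical name, since the original statement is not shown), it can be printed with DUMP:

```pig
grunt> DUMP B;
```

Each output tuple pairs one l_name value with a bag of the full (fname,l_name,id) records that share it.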

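Here, we get the desired output: one tuple per distinct l_name value, each containing the group key and a bag of the matching tuples from A.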
