Tuple, Bag, Map Functions
Apache Pig provides built-in functions such as TOBAG, TOP, TOTUPLE, and TOMAP for constructing and working with tuples, bags, and maps.
The following table lists the tuple, bag, and map functions supported by Apache Pig.
| Sr No | Function | Description |
|---|---|---|
| 1 | TOBAG() | Converts one or more expressions into a bag. |
| 2 | TOP() | Returns the top N tuples from a bag of tuples. |
| 3 | TOTUPLE() | Converts one or more expressions into a tuple. |
| 4 | TOMAP() | Converts key-value expression pairs into a map. |
Let us see a couple of examples.
TOBAG()
The TOBAG function converts one or more expressions to individual tuples, which are then placed in a bag.
Syntax:
    
    TOBAG(expression [, expression ...])

This example uses the “studentdata.txt” dataset. We will copy “studentdata.txt” from the local file system to the HDFS location “/pigexample/”.
Content of “studentdata.txt”:
    1,Chanel,Shawnee,KS,39
    2,Ezekiel,Easton,MD,37
    3,Willow,New York,NY,40
    4,Bernardo,Conroe,TX,38
    5,Ammie,Columbus,OH,38
    6,Francine,Las Cruces,NM,38
    7,Ernie,Ridgefield Park,NJ,38
    8,Albina,Dunellen,NJ,56
    9,Alishia,New York,NY,34
    10,Solange,Metairie,LA,54
 
        
Copy “studentdata.txt” from the local file system into the HDFS directory “/pigexample/” using the command below.
Command:
    
    $ hadoop fs -copyFromLocal /home/cloudduggu/pig/tutorial/studentdata.txt /pigexample/

    
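Before loading the file in Pig, it can help to confirm that the copy succeeded. The following is a minimal sketch, assuming you are already inside the Grunt shell and that the file was copied to the path used above. Grunt accepts HDFS commands through its `fs` prefix.

```pig
-- List the target directory; studentdata.txt should appear here.
fs -ls /pigexample/

-- Print the file contents to check that the records arrived intact.
fs -cat /pigexample/studentdata.txt
```

If the file is missing from the listing, the later LOAD statement will fail only when a DUMP or STORE actually runs, so checking up front saves a round trip.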
Now we will create the relation “studentdata” and load the data from HDFS into Pig.
Command:
    
    grunt> studentdata = LOAD '/pigexample/studentdata.txt' USING PigStorage(',')
               AS (studentid:int,firstname:chararray,lastname:chararray,city:chararray,gpa:int);

Now we will convert the fields of each record (studentid,firstname,lastname,city,gpa) into individual tuples, place them in a bag, and print the output using the DUMP operator.
Command:
    
    grunt> tobagdata = FOREACH studentdata GENERATE TOBAG(studentid,firstname,lastname,city,gpa);
    grunt> DUMP tobagdata;

    Output:
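
Each input record becomes one bag that holds a single-field tuple per expression. Based on Pig's documented TOBAG behavior, the record with studentid 1 should appear as `({(1),(Chanel),(Shawnee),(KS),(39)})`. To inspect the result yourself, here is a minimal sketch, assuming the “studentdata” relation loaded above; the aliases `bagged` and `sample_bags` are hypothetical names chosen for illustration.

```pig
-- Assumes studentdata has already been loaded as shown earlier.
bagged = FOREACH studentdata GENERATE TOBAG(studentid, firstname, lastname, city, gpa);

-- Show the schema Pig inferred for the generated bag.
DESCRIBE bagged;

-- Keep the output short. LIMIT without ORDER BY does not
-- guarantee which three rows are returned.
sample_bags = LIMIT bagged 3;
DUMP sample_bags;
```

Because the expressions passed to TOBAG here have mixed types (int and chararray), it is worth checking the DESCRIBE output before relying on the bag's schema in later statements.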
     
    
    
    
TOTUPLE()
The TOTUPLE function is used to convert one or more expressions to a tuple.
Syntax:
    
    TOTUPLE(expression [, expression ...])

We will use the relation “studentdata” created in the TOBAG section, convert each record (studentid,firstname,lastname,city,gpa) into a single tuple, and print the output using the DUMP operator.
Command:
    
    grunt> totupledata = FOREACH studentdata GENERATE TOTUPLE(studentid,firstname,lastname,city,gpa);
    grunt> DUMP totupledata;

    Output:
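
Each input record becomes one nested tuple, so DUMP prints a tuple inside the row's own parentheses. Based on Pig's documented TOTUPLE behavior, the record with studentid 1 should appear as `((1,Chanel,Shawnee,KS,39))`. The sketch below, which assumes the “studentdata” relation from the TOBAG section, shows the nesting and how FLATTEN undoes it; the aliases `nested` and `flat` and the field name `t` are hypothetical names chosen for illustration.

```pig
-- Assumes studentdata has already been loaded as shown earlier.
-- Wrap the five fields of each record in one tuple named t.
nested = FOREACH studentdata GENERATE TOTUPLE(studentid, firstname, lastname, city, gpa) AS t;

-- Show the schema: a single field t of type tuple.
DESCRIBE nested;

-- FLATTEN un-nests the tuple, giving back five top-level fields,
-- e.g. (1,Chanel,Shawnee,KS,39) for the record with studentid 1.
flat = FOREACH nested GENERATE FLATTEN(t);
DUMP flat;
```

The contrast with TOBAG is that TOTUPLE keeps all expressions together in one tuple, whereas TOBAG wraps each expression in its own tuple and collects those tuples in a bag.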
     
    
    
    
 
     
