Apache Drill queries can read JSON data format easily. Drill considers every object of JSON data as a row in the Drill table. There is a sample JSON file that comes with the Drill installation package that we can access using cp and see how it is behaving.
Query JSON File in Apache Drill
We can use the sample file "employee.json" that comes with the default package of JAR file foodmart-data-json-0.4.jar. The location of this JAR file is "/home/cloudduggu/drill/jars/3rdparty/foodmart-data-json-0.4.jar". To access "employee.json" file we will use cp.
Let's start the Drill shell in embedded mode and run the below query to fetch 5 records from "employee.json".
Command:
Output:
We can run the same query in Drill Web UI and get the same result as mentioned below.
Create JSON File in Apache Drill
Let's create a new JSON file with the name "emp_detail.json". This file will contain the following JSON records and stored at the location "/home/cloudduggu/drill/sample-data/emp_detail.json".
{
"EMP_ID" : "677509",
"emp_name" : "Lois",
"emp_age" : 36,
"gender" : "female",
"Date of Joining" : "11/24/2003",
"county" : "Denver",
"pincode" : 80224
}
{
"EMP_ID" : "940761",
"emp_name" : "Brenda",
"emp_age" : 40,
"gender" : "female",
"Date of Joining" : "7/27/2008",
"county" : "De Soto",
"pincode" : 71078
}
{
"EMP_ID" : "428945",
"emp_name" : "Joe",
"emp_age" : 34,
"gender" : "male",
"Date of Joining" : "08/03/2016",
"county" : "Clinton",
"pincode" : 46057
}
{
"EMP_ID" : "193819",
"emp_name" : "Benjamin",
"emp_age" : 45,
"gender" : "male",
"Date of Joining" : "7/25/2013",
"county" : "Waupaca",
"pincode" : 54940
}
{
"EMP_ID" : "408351",
"emp_name" : "Diane",
"emp_age" : 45,
"gender" : "female",
"Date of Joining" : "4/16/1999",
"county" : "Crawford",
"pincode" : 16328
}
{
"EMP_ID" : "499687",
"emp_name" : "Patrick",
"emp_age" : 50,
"gender" : "male",
"Date of Joining" : "7/22/2005",
"county" : "Madison",
"pincode" : 50155
}
{
"EMP_ID" : "539712",
"emp_name" : "Nancy",
"emp_age" : 45,
"gender" : "female",
"Date of Joining" : "9/14/2016",
"county" : "Fulton",
"pincode" : 30334
}
{
"EMP_ID" : "380086",
"emp_name" : "Carol",
"emp_age" : 40,
"gender" : "female",
"Date of Joining" : "1/28/1983",
"county" : "Clinton",
"pincode" : 45107
}
{
"EMP_ID" : "477616",
"emp_name" : "Frances",
"emp_age" : 33,
"gender" : "female",
"Date of Joining" : "4/27/1994",
"county" : "Starr",
"pincode" : 78536
}
{
"EMP_ID" : "162402",
"emp_name" : "Diana",
"emp_age" : 43,
"gender" : "female",
"Date of Joining" : "2/17/2014",
"county" : "Carroll",
"pincode" : 72632
}
Command:
We have used the nano editor to create the JSON file and put the above records in the file.
Output:
Read JSON File in Apache Drill
Now we will query the "emp_detail.json" file using the default storage plugin dfs.
Command:
Output:
We can run the same query in Drill Web UI and as mentioned below.