Loading JSON using elephant-bird - error with simple task

I have a problem with simply loading data to test and analyze. I'm using the Reddit September comment archive:

https://www.reddit.com/r/datasets/comments/3oiv9z/reddit_september_comment_archive_is_now_available/

After taking only 10000 lines from this file, I try to load them into Pig.

Even something as simple as this returns an error:

REGISTER '/user/cloudera/json-simple-1.1.1.jar'
REGISTER '/user/cloudera/elephant-bird-pig-4.1.jar'
REGISTER '/user/cloudera/elephant-bird-hadoop-compat-4.1.jar'

a = LOAD '/user/cloudera/top' USING com.twitter.elephantbird.pig.load.JsonLoader() as (json:map[]);

Error code:

Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]

There is 1 solution below

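Try running this:

REGISTER 'elephant-bird-pig-4.1.jar';
REGISTER 'elephant-bird-hadoop-compat-4.1.jar';

-- 'input' is a reserved keyword in Pig Latin, so the alias is renamed to json_data.
-- '-nestedLoad' makes the loader parse nested JSON objects into nested maps.
json_data = LOAD '/input/file' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS (input_map:map[]);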
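Once the LOAD itself runs, a quick way to confirm the loader really parsed the file is to pull a couple of values out of the map and dump a few rows. The following is only a rough sketch, not a confirmed fix for the exit code above: it reuses the jar and input paths from the question, the aliases (raw, picked, first_rows) are made up for illustration, and the field names author and subreddit are assumed to be present in the Reddit comment records.

```pig
-- Sketch only: paths are copied from the question; aliases and field names are assumptions.
REGISTER '/user/cloudera/json-simple-1.1.1.jar';
REGISTER '/user/cloudera/elephant-bird-pig-4.1.jar';
REGISTER '/user/cloudera/elephant-bird-hadoop-compat-4.1.jar';

-- Each input line is expected to hold one JSON object, loaded as a single map.
raw = LOAD '/user/cloudera/top'
    USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
    AS (json:map[]);

-- Look up individual keys with the # operator and cast them to chararray.
picked = FOREACH raw GENERATE
    (chararray) json#'author'    AS author,
    (chararray) json#'subreddit' AS subreddit;

-- Print only a handful of rows to keep the check cheap.
first_rows = LIMIT picked 5;
DUMP first_rows;
```

Running a small script like this directly in Pig, rather than through the Oozie action, usually surfaces the underlying parse or class-loading error instead of only the generic exit code from PigMain.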