Splitting a string into new lines in Apache Pig


I want to split a string in a dataset whose parts are joined by a forward slash (/), so that each part ends up on its own line.

The example dataset is:

(David Jones / John Smith)

I want the result to be:

(David Jones)
(John Smith)

The code I have written is:

A = FOREACH data GENERATE FLATTEN(STRSPLIT(name,'/',2));
DUMP A;

However, the result I'm getting in the terminal is:

(David Jones, John Smith)

OneCricketeer On

STRSPLIT returns a tuple within the same row. Flattening a tuple promotes its fields to columns of that row rather than creating new rows, so the result is still a single record.

I suggest trying it without the FLATTEN.
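
As a rough illustration of the tuple-versus-bag difference, the sketch below flattens both kinds of result side by side. It assumes a Pig release that ships the STRSPLITTOBAG built-in (it is not available in older versions) and an input file named input.txt with one name string per line; the relation names as_columns and as_rows are made up for this example.

```pig
data = LOAD 'input.txt' AS (name:chararray);

-- STRSPLIT returns a tuple: FLATTEN turns its fields into columns of ONE row
as_columns = FOREACH data GENERATE FLATTEN(STRSPLIT(name, '/', 2));

-- STRSPLITTOBAG returns a bag: FLATTEN emits one row per bag element
-- (assumes a Pig version that includes STRSPLITTOBAG)
as_rows = FOREACH data GENERATE FLATTEN(STRSPLITTOBAG(name, '/', 2));

DUMP as_columns;
DUMP as_rows;
```

Running DESCRIBE on each relation is a quick way to check whether a function returned a tuple or a bag before deciding how FLATTEN will behave.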

Shubham Garg On

You should use TOKENIZE instead of STRSPLIT. TOKENIZE returns a bag, and flattening a bag produces one row per element.

Code:

A = LOAD 'input.txt' AS (name:chararray);
B = FOREACH A GENERATE FLATTEN(TOKENIZE(name,'/'));
DUMP B;

Contents of input.txt:

David Jones/John Smith

Output:

(David Jones)
(John Smith)
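
The sample in the question has spaces around the slash, while the input.txt in this answer does not. A minimal sketch for the spaced form, assuming the same one-column input.txt and using the built-in TRIM function; the relation names parts and trimmed and the alias part are illustrative only.

```pig
data = LOAD 'input.txt' AS (name:chararray);

-- Split on '/' into a bag, then emit one row per element
parts = FOREACH data GENERATE FLATTEN(TOKENIZE(name, '/')) AS part;

-- Strip the leading/trailing spaces left over from ' / '
trimmed = FOREACH parts GENERATE TRIM(part) AS name;

DUMP trimmed;
```

With an input line such as David Jones / John Smith, this should yield the two names on separate rows without the surrounding whitespace.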