How to store partitioned data using Pig in RCFile format?

I was wondering whether there is a UDF or something similar that can store my data in a partitioned fashion in RCFile format. I know about org.apache.pig.piggybank.storage.MultiStorage, but it only supports certain compression formats. I want to store my data in RCFile format while keeping the same partitioned storage structure that MultiStorage provides.

Thanks, imtiaz

There is 1 best solution below

There is no such solution available, either in piggybank or elsewhere. I faced a similar issue but dropped the implementation because of other requirements. The only option is to extend the MultiStorage UDF so that it writes the RCFile storage format.

Twitter has open-sourced its RCFile storage for Pig as part of Elephant Bird. You can use it as a reference: http://grepcode.com/file/repo1.maven.org/maven2/com.twitter.elephantbird/elephant-bird-rcfile/3.0.8/com/twitter/elephantbird/pig/store/RCFilePigStorage.java
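
To give an idea of what an extended MultiStorage would have to do, here is a minimal plain-Java sketch of the per-key routing only. It does not use the Pig or Hadoop APIs. The class name `PartitionRouter`, both helper methods, and the `<parent>/<key>/<key>-<task id>` naming pattern are assumptions made up for illustration; check the MultiStorage source for the layout your Pig version actually produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class PartitionRouter {

    // Hypothetical helper: builds "<parent>/<key>/<key>-<4-digit task id>".
    // The naming pattern is an assumption for illustration only.
    public static String partitionPath(String parent, String key, int taskId) {
        return String.format(Locale.ROOT, "%s/%s/%s-%04d", parent, key, key, taskId);
    }

    // Groups rows by the value at splitFieldIndex, keyed by the target file path.
    // Insertion order of the partitions is preserved.
    public static Map<String, List<String[]>> route(List<String[]> rows,
                                                    int splitFieldIndex,
                                                    String parent,
                                                    int taskId) {
        Map<String, List<String[]>> byPath = new LinkedHashMap<>();
        for (String[] row : rows) {
            String path = partitionPath(parent, row[splitFieldIndex], taskId);
            byPath.computeIfAbsent(path, p -> new ArrayList<>()).add(row);
        }
        return byPath;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"a1", "x"},
            new String[] {"a2", "y"},
            new String[] {"a1", "z"});

        // Print each target path with the number of rows routed to it.
        route(rows, 0, "/my/home/output", 0).forEach(
            (path, group) -> System.out.println(path + " -> " + group.size() + " rows"));
    }
}
```

In a real store function the in-memory lists would be replaced by one RCFile writer per path, opened the first time a key is seen and closed when the task finishes. The RCFilePigStorage source linked above is probably the best place to see how tuples get converted into RCFile columns.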