Merge two XDF files in Hadoop compute context


I have two RxXdfData data sources and I want to merge them on a column in the RxHadoopMR compute context.

Both of my XDF data sources are large and stored on HDFS. How can I merge them?

I tried the append option of rxDataStep, but Revolution R complains that it can't take composite XDF files and suggests that I use rxExec instead.

I know this can be done with the rxMerge function in a local compute context, but then I have to do the following steps:

  1. Copy the data to the edge node (local context)
  2. Create .xdf files from it
  3. Use rxMerge to merge the .xdf files
  4. Convert the output .xdf file to txt/csv format
  5. Transfer the txt/csv file to HDFS
  6. Use rxImport again to convert the text file back to a composite XDF file
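
To make the workaround concrete, here is a rough sketch of those six steps in R. It is only an illustration under assumptions: every path, the key column name "id", and the inner join type are hypothetical placeholders, the source data is assumed to be CSV text on HDFS, and the code assumes it runs on an edge node where RevoScaleR and the Hadoop client are available.

```r
library(RevoScaleR)

# Hypothetical placeholders: all paths below and the key column "id".
rxSetComputeContext("local")

# 1. Copy the source text data from HDFS to the edge node.
rxHadoopCopyToLocal("/user/me/left.csv",  "/tmp/left.csv")
rxHadoopCopyToLocal("/user/me/right.csv", "/tmp/right.csv")

# 2. Import the text files into local .xdf files.
rxImport(inData = "/tmp/left.csv",  outFile = "/tmp/left.xdf",  overwrite = TRUE)
rxImport(inData = "/tmp/right.csv", outFile = "/tmp/right.xdf", overwrite = TRUE)

# 3. Merge the two .xdf files on the key column (inner join assumed).
rxMerge(inData1 = "/tmp/left.xdf", inData2 = "/tmp/right.xdf",
        outFile = "/tmp/merged.xdf", matchVars = "id", type = "inner",
        overwrite = TRUE)

# 4. Write the merged .xdf file out as CSV.
rxDataStep(inData = "/tmp/merged.xdf",
           outFile = RxTextData("/tmp/merged.csv"), overwrite = TRUE)

# 5. Copy the CSV back to HDFS.
rxHadoopCopyFromLocal("/tmp/merged.csv", "/user/me/merged.csv")

# 6. Re-import the CSV as a composite XDF on HDFS in the Hadoop context.
hdfs <- RxHdfsFileSystem()
rxSetComputeContext(RxHadoopMR())
rxImport(inData  = RxTextData("/user/me/merged.csv", fileSystem = hdfs),
         outFile = RxXdfData("/user/me/mergedXdf", fileSystem = hdfs),
         overwrite = TRUE)
```

Even in this compact form the data crosses between HDFS and the edge node twice and is converted between text and XDF three times, which is the overhead I would like to avoid.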

Such a long process seems like overkill for a simple merge.

Is there a better way to do this merge directly on HDFS?

Edit: I have also asked the same question on the Revolution R support forum at https://revolutionanalytics.zendesk.com/entries/53777899-Merging-two-composite-xdf-files-

But I haven't received any reply so far.
