Sorting huge XML files


I have two huge XML files (4-5 GB each). The XML format is as follows:

<root>
  <item>
    <id/>
    <elements/>
    <elements/>
    <elements/>
  </item>
</root>

I need to determine whether any <item> elements have been added or modified. To do this, I am planning to sort the two files and proceed from there. To sort them, I have the following two approaches in mind:

  1. Convert the XML files to another format and perform an external sort.

  2. Sort using XSLT: I am not sure whether this can be done for such huge files.

I would like to know which of the two approaches is feasible for this problem, or whether there is a better way to tackle it.

EDIT: I cannot load the entire file into memory, so using "diff" or "bdiff" is not an option.
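For illustration, approach 1 could look roughly like the Python sketch below. It is a minimal sketch under assumptions, not something tested on 4-5 GB inputs: it assumes every <item> has a unique <id> with text content (no tabs or newlines), that the root element is the direct parent of the items, and that comparing a hash of each serialized item is an acceptable definition of "modified". The function names (item_records, external_sort, diff_sorted) are made up for this example. Each file is streamed, reduced to "id, digest" records, sorted in chunks on disk, merged, and then the two sorted streams are compared.

```python
import hashlib
import heapq
import itertools
import os
import tempfile
import xml.etree.ElementTree as ET


def item_records(xml_path):
    """Stream (id, digest) pairs without building the whole tree in memory."""
    context = ET.iterparse(xml_path, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "item":
            item_id = (elem.findtext("id") or "").strip()
            elem.tail = None  # ignore whitespace that follows the item
            digest = hashlib.sha1(ET.tostring(elem)).hexdigest()
            yield item_id, digest
            root.clear()  # drop processed items so memory use stays flat


def sorted_runs(records, chunk_size):
    """Write sorted chunks of records to temporary files; return their paths."""
    paths = []
    records = iter(records)
    while True:
        chunk = sorted(itertools.islice(records, chunk_size))
        if not chunk:
            return paths
        with tempfile.NamedTemporaryFile(
            "w", delete=False, suffix=".run", encoding="utf-8"
        ) as f:
            f.writelines(f"{item_id}\t{digest}\n" for item_id, digest in chunk)
            paths.append(f.name)


def read_run(path):
    """Read one sorted run back as (id, digest) tuples."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            item_id, digest = line.rstrip("\n").split("\t")
            yield item_id, digest


def external_sort(xml_path, chunk_size=100_000):
    """Yield (id, digest) for every item in xml_path, sorted by id."""
    paths = sorted_runs(item_records(xml_path), chunk_size)
    try:
        # k-way merge of the sorted runs; only one record per run is in memory
        yield from heapq.merge(*(read_run(p) for p in paths))
    finally:
        for p in paths:
            os.remove(p)


def diff_sorted(old, new):
    """Merge-join two id-sorted streams; yield (kind, id) for each difference."""
    old, new = iter(old), iter(new)
    a, b = next(old, None), next(new, None)
    while a is not None or b is not None:
        if b is None or (a is not None and a[0] < b[0]):
            yield "removed", a[0]
            a = next(old, None)
        elif a is None or b[0] < a[0]:
            yield "added", b[0]
            b = next(new, None)
        else:
            if a[1] != b[1]:
                yield "modified", a[0]
            a, b = next(old, None), next(new, None)
```

Usage would be along the lines of diff_sorted(external_sort("old.xml"), external_sort("new.xml")), where the file names are placeholders. Ids are compared as strings, so the order is lexicographic rather than numeric; that is fine as long as both files are sorted the same way. Because the digest is taken over the serialized item, formatting-only changes inside an item (such as indentation) would also be reported as "modified".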
