Is there any way to use GNU Coreutils sort with 64-bit numbers stored in a binary file?
If the file weren't binary, sort -n would be the solution, but I couldn't find any option to use it with binary data.
The file is quite large (~100GB), and if possible I don't want to make a text (non-binary) copy of it.
Sample of data:
$ xxd file
00292e0: 4036 1eb7 6888 d319 de6b 7402 9ca9 f116 @6..h....kt.....
00292f0: db68 7f05 199f 9d36 cf01 cb28 e49f 1116 .h.....6...(....
0029300: 0c7c 8b55 2963 ef0c 277a f2b0 38d7 2b19 .|.U)c..'z..8.+.
0029310: c83b 2614 4327 d838 820c 1bb8 444f 1731 .;&.C'.8....DO.1
0029320: 1695 cab3 cd12 092a 0691 d7e4 5fcc b01d .......*...._...
0029330: b12b 7c1b a209 7c1c 568a 125c 541c d334 .+|...|.V..\T..4
0029340: 09a3 ecbc 8370 e205 9265 7759 a378 4e2f .....p...ewY.xN/
sort(1) will not help you here. For a small file it could be possible to split your file into lines and feed them to sort(1), but of course not for a 100G file.
The answer to this question on Serverfault links to a tool written to solve exactly this task. You can check the GitHub project there (it seems to be written in Go, so you will need to install a compiler if you decide to use it).
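For the small-file case, the conversion to lines and back could look roughly like the sketch below. It assumes the file is a plain sequence of unsigned 64-bit integers in native byte order, which the question does not state; the function names and the to-text / to-binary arguments are made up for illustration.

```python
import struct
import sys

# Assumption: each record is an unsigned 64-bit integer in native byte order.
RECORD = struct.Struct("=Q")

def binary_to_lines(src, dst):
    """Read 8-byte records from src and write one decimal number per line to dst."""
    while True:
        chunk = src.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break  # end of input; a trailing partial record is ignored
        (value,) = RECORD.unpack(chunk)
        dst.write(b"%d\n" % value)

def lines_to_binary(src, dst):
    """Read decimal lines from src and write them back as 8-byte records to dst."""
    for line in src:
        dst.write(RECORD.pack(int(line)))

if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical CLI: "to-text" or "to-binary", streaming stdin to stdout.
    convert = binary_to_lines if sys.argv[1] == "to-text" else lines_to_binary
    convert(sys.stdin.buffer, sys.stdout.buffer)
```

Saved under a hypothetical name such as conv.py, it could be used as: python3 conv.py to-text < file | sort -n | python3 conv.py to-binary > file.sorted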
A quick search does not turn up any other popular tool for this task written in a more popular language (which surprises me a bit, as the task itself is just a merge sort that thousands of students implement each year in their CS courses, but that's off-topic).
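To illustrate what such a merge sort involves, here is a minimal sketch of an external sort: it sorts chunks that fit in memory into temporary run files, then merges the runs. It is not the tool linked above. It assumes unsigned 64-bit values in native byte order, and the names external_sort, read_values and chunk_items are invented for this example. For a 100GB input the chunk size would have to be large enough (or the merge done in several passes) to stay under the open-file limit, and pure Python would be slow at that scale.

```python
import heapq
import os
import tempfile
from array import array

TYPECODE = "Q"  # assumption: unsigned 64-bit values in native byte order
ITEM_SIZE = array(TYPECODE).itemsize  # 8 on common platforms

def read_values(path, block_items=1 << 16):
    """Lazily yield the integers stored in a binary file of fixed-size records."""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_items * ITEM_SIZE)
            block = block[: len(block) - len(block) % ITEM_SIZE]  # drop a partial tail
            if not block:
                return
            values = array(TYPECODE)
            values.frombytes(block)
            yield from values

def external_sort(src, dst, chunk_items=1 << 22, tmpdir=None):
    """Sort src into dst using sorted runs on disk followed by a k-way merge."""
    runs = []
    try:
        # Phase 1: sort memory-sized chunks and write each one as a run file.
        with open(src, "rb") as f:
            while True:
                data = f.read(chunk_items * ITEM_SIZE)
                data = data[: len(data) - len(data) % ITEM_SIZE]
                if not data:
                    break
                chunk = array(TYPECODE)
                chunk.frombytes(data)
                fd, path = tempfile.mkstemp(dir=tmpdir)
                runs.append(path)
                with os.fdopen(fd, "wb") as run:
                    array(TYPECODE, sorted(chunk)).tofile(run)
        # Phase 2: merge all runs, buffering output to limit write calls.
        out = array(TYPECODE)
        with open(dst, "wb") as f:
            for value in heapq.merge(*(read_values(p) for p in runs)):
                out.append(value)
                if len(out) >= 1 << 16:
                    out.tofile(f)
                    del out[:]
            out.tofile(f)
    finally:
        for path in runs:
            os.remove(path)
```

A call such as external_sort("file", "file.sorted") needs free disk space for the run files plus the output, that is, about twice the input size.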