Change the header in a huge file without rewriting the whole file


I have a problem with huge data files. Each file starts with a 79-line text header, followed by binary data. I want to change the header from a bash script, and so far I have used sed. The problem is that when I only want to change the header, either the whole data file is read and rewritten (1), or processing stops after those 79 lines (2). I need to change 8 lines of the header. The commands in question are:

(1)

sed -i '1,79 s/ConverterPositionM2C1='"$M2C1"'/ConverterPositionM2C1='"$M2C1N"'/' "$FileName"

(2)

sed -i -e '79q' -e 's/ConverterPositionM1C2='"$M1C2"'/ConverterPositionM1C2='"$M1C2N"'/' "$FileName"

The first command works in all cases, but it takes hours for huge files (20 GB). Because I want to change parameters in 8 different lines, I need 8 of these commands in my script, so the script takes even longer to run. The second command copies only the header; the data is lost.


There are 2 best solutions below

Best answer

If you do not want to rewrite the binary data, then the length of the header in bytes must not change. You can do this by padding with spaces or zeroes or whatever works for your format.
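As a minimal sketch of the padding idea, the snippet below pads a shorter replacement value with trailing spaces so that it occupies as many bytes as the old one. The variable names old_value, new_value and padded are hypothetical, and it assumes single-byte characters and a header format that tolerates trailing spaces after a value.

```shell
#!/usr/bin/env bash
# Hypothetical values: the new value is shorter than the old one.
old_value="123456"
new_value="42"

# A longer value cannot be padded; it would change the header size.
if [ "${#new_value}" -gt "${#old_value}" ]; then
    echo "new value is longer than the old one; header size would change" >&2
    exit 1
fi

# Left-justify the new value in a field as wide as the old value.
# Assumes single-byte characters, so character count equals byte count.
padded=$(printf '%-*s' "${#old_value}" "$new_value")

printf '[%s]\n' "$padded"
```

The padded value can then be used as the replacement text in the substitution, so the line keeps its original length.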

The first step is to create the header that you want. You may use something like:

sed 's/old/new/; 79q' "$FileName" >newhdr

Replace s/old/new/ with whatever substitution commands you need but remember that, when all is done, the length of the header in bytes must not change. 79q tells sed to stop after it reads the 79th line. The new header is written to a temporary file called newhdr.
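Since everything depends on the header keeping its byte length, it may be worth checking that before touching the data file. The following is a sketch under assumptions: the demo file, the temporary directory and the values 100 and 250 are made up for illustration, and only the parameter name comes from the question. Several -e expressions can be combined in one sed call, one per parameter to change.

```shell
#!/usr/bin/env bash
set -eu

# Demo setup (hypothetical file): 79 text header lines followed by binary data.
workdir=$(mktemp -d)
FileName="$workdir/data.bin"
{
    i=1
    while [ "$i" -le 78 ]; do
        printf 'Key%d=%d\n' "$i" "$i"
        i=$((i + 1))
    done
    printf 'ConverterPositionM2C1=100\n'
    head -c 1024 /dev/urandom
} > "$FileName"

# Build the new header. Add one -e 's/.../.../' per parameter to change.
# Here 100 becomes 250, so the byte length stays the same.
sed -e 's/^ConverterPositionM2C1=100$/ConverterPositionM2C1=250/' \
    -e '79q' "$FileName" > "$workdir/newhdr"

# Compare the byte length of the old and new headers before writing anything.
old_len=$(head -n 79 "$FileName" | wc -c)
new_len=$(wc -c < "$workdir/newhdr")
if [ "$old_len" -ne "$new_len" ]; then
    echo "header length changed: $old_len -> $new_len bytes" >&2
    exit 1
fi
echo "header length unchanged"
```

If the check fails, pad or shorten the replacement values until both lengths match.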

If newhdr has the form that you want, the next step is to write it over the old header in $FileName, in place. This can be done with dd as follows:

dd conv=notrunc obs=1 if=newhdr of="$FileName"

conv=notrunc tells dd not to truncate the output file. obs=1 tells it to write the output in single-byte blocks. if specifies the input file and of specifies the output file. After this command is executed, $FileName will have been updated in place: only the header bytes are rewritten, and the binary data after them is left untouched.
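Putting the steps together, an end-to-end run might look like the sketch below. It is not a definitive implementation: the demo file, its 3-line header and the values are hypothetical stand-ins for the 79-line header in the question, and the final comparison is only there to show that the bytes after the header are not modified.

```shell
#!/usr/bin/env bash
set -eu

# Demo setup (hypothetical file): a 3-line text header followed by binary data.
workdir=$(mktemp -d)
FileName="$workdir/data.bin"
{
    printf 'Name=demo\nConverterPositionM2C1=100\nEnd=1\n'
    head -c 4096 /dev/urandom
} > "$FileName"
cp "$FileName" "$workdir/original.bin"

hdr_lines=3   # would be 79 for the files in the question

# Build the replacement header (100 -> 250 keeps the byte length the same).
sed -e 's/^ConverterPositionM2C1=100$/ConverterPositionM2C1=250/' \
    -e "${hdr_lines}q" "$FileName" > "$workdir/newhdr"

# Refuse to write if the header size changed; that would overwrite payload bytes.
old_len=$(head -n "$hdr_lines" "$FileName" | wc -c)
new_len=$(wc -c < "$workdir/newhdr")
[ "$old_len" -eq "$new_len" ] || { echo "header length changed" >&2; exit 1; }

# Overwrite only the header bytes; conv=notrunc keeps the rest of the file.
dd conv=notrunc obs=1 if="$workdir/newhdr" of="$FileName" 2>/dev/null

# Sanity check: the bytes after the header are identical to the original.
tail -c +"$((new_len + 1))" "$FileName" > "$workdir/payload.new"
tail -c +"$((new_len + 1))" "$workdir/original.bin" > "$workdir/payload.old"
cmp "$workdir/payload.old" "$workdir/payload.new" && echo "payload untouched"
```

On a real 20 GB file the payload comparison would be skipped, since it reads all the data; the dd step itself only writes as many bytes as the header contains.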

Second answer

I was told about this solution. In the Unix world, you can do:

echo "$NEW_HEADER" | cat - <(dd if="$YOUR_FILE" bs=1 skip="$HEADER_SIZE_TO_STRIP")

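This reads your file, skips the first $HEADER_SIZE_TO_STRIP bytes, and writes the new header (with its trailing newline) followed by the remaining data to standard output. Redirect the output to a new file to keep the result. Note that this copies all of the data, so it is the approach to use when the header length has to change, not a way to avoid rewriting the file.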
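Because bs=1 makes dd copy one byte at a time, the same result can probably be reached faster with tail -c, which is swapped in here for dd. This is a sketch, not the answerer's command: NEW_FILE is a hypothetical output name, and the tiny demo file stands in for the real data.

```shell
#!/usr/bin/env bash
set -eu

# Demo setup (hypothetical file): a one-line header followed by the payload.
workdir=$(mktemp -d)
YOUR_FILE="$workdir/old.bin"
NEW_FILE="$workdir/new.bin"   # hypothetical output name
printf 'Version=1\nBINARYPAYLOAD' > "$YOUR_FILE"

HEADER_SIZE_TO_STRIP=10       # bytes in "Version=1" plus its newline
NEW_HEADER='Version=22'       # one byte longer than the old header line

# printf writes the header plus a newline; tail -c +N starts at byte N (1-based),
# so N = header size + 1 skips exactly the old header.
{
    printf '%s\n' "$NEW_HEADER"
    tail -c +"$((HEADER_SIZE_TO_STRIP + 1))" "$YOUR_FILE"
} > "$NEW_FILE"
```

As with the original command, the whole payload is copied into the new file, so this still takes time proportional to the file size.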