grep with redirect leaves a NUL byte at the end of the output

My grep command is generating a NUL byte at the end of its output.

I have file.xml which contains only:

<Game>
    <Player p1="Bob"/>
    <Player p2="Fred"/>
</Game>

Running grep -Pzo '<Game>(\n|.)*?(</Game>)' file.xml gives the expected output:

<Game>
        <Player p1="Bob"/>
        <Player p2="Fred"/>
</Game>

But redirecting the output with grep -Pzo '<Game>(\n|.)*?(</Game>)' file.xml > out.md leaves a NUL byte at the end of the file. Notepad++ shows it, and Sublime opens the file as binary:

3c47 616d 653e 0a09 3c50 6c61 7965 7220
7031 3d22 426f 6222 2f3e 0a09 3c50 6c61
7965 7220 7032 3d22 4672 6564 222f 3e0a
3c2f 4761 6d65 3e00 

This doesn't happen with other grep commands such as grep -rlF "Game" > out.md.

There is 1 solution below

I don't know which platform and grep version you are using, but I would just omit the -z option:

From the GNU grep 3.0 documentation:

-z, --null-data
Treat input and output data as sequences of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. Like the -Z or --null option, this option can be used with commands like ‘sort -z’ to process arbitrary file names. 
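
To see the terminator the documentation describes, here is a minimal sketch, assuming GNU grep plus the standard od, tr and wc utilities. The two-line sample input is made up for illustration and is not the file from the question:

```shell
# Feed two ordinary lines to grep -z. The input contains no NUL byte,
# so grep treats the whole input as a single record.
printf 'one\ntwo\n' | grep -z 'one' | od -c

# Count the NUL bytes grep wrote: delete every byte except NUL, then count.
nul_count=$(printf 'one\ntwo\n' | grep -z 'one' | tr -cd '\0' | wc -c)
echo "NUL bytes written: $nul_count"
```

The od -c dump should end with a \0 after the matched record. That terminator is the byte Notepad++ and Sublime are reacting to.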

HEX of file.xml:

0000000: 3c47 616d 653e 0a20 2020 203c 506c 6179  <Game>.    <Play
0000010: 6572 2070 313d 2242 6f62 222f 3e0a 2020  er p1="Bob"/>.
0000020: 2020 3c50 6c61 7965 7220 7032 3d22 4672    <Player p2="Fr
0000030: 6564 222f 3e0a 3c2f 4761 6d65 3e0a       ed"/>.</Game>.

So running:

grep -Po '<Game>(\n|.)*?(</Game>)' file.xml > out.md

HEX of out.md:

0000000: 3c47 616d 653e 0a20 2020 203c 506c 6179  <Game>.    <Play
0000010: 6572 2070 313d 2242 6f62 222f 3e0a 2020  er p1="Bob"/>.
0000020: 2020 3c50 6c61 7965 7220 7032 3d22 4672    <Player p2="Fr
0000030: 6564 222f 3e0a 3c2f 4761 6d65 3e0a       ed"/>.</Game>.
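
If dropping -z is not an option, another possibility is to keep it and convert the terminator afterwards. As far as I know, grep without -z matches one line at a time, so a pattern meant to span several lines generally needs -z to match at all. The sketch below assumes GNU grep built with -P support; the scratch directory and the nul_count variable are only there for illustration:

```shell
# Work in a scratch directory so no existing file.xml or out.md is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample file from the question (four-space indentation).
printf '<Game>\n    <Player p1="Bob"/>\n    <Player p2="Fred"/>\n</Game>\n' > file.xml

# Keep -z so the match can span lines, then turn each NUL terminator
# into a newline before writing the file.
grep -Pzo '<Game>(\n|.)*?(</Game>)' file.xml | tr '\0' '\n' > out.md

# Count the NUL bytes left in the result.
nul_count=$(tr -cd '\0' < out.md | wc -c)
echo "NUL bytes in out.md: $nul_count"
```

The count should come out as zero. Using tr -d '\0' instead would remove the terminator without adding a newline, which may be preferable when only a single match is expected.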