See the following two commands with output:
$ wc < myfile.txt
 4  4 34
$ cat myfile.txt | wc
      4       4      34
My understanding is that these two both connect the stdin of the wc process with the content stream of myfile.txt. But why is the output padded in one case, and not in the other? How does wc tell the difference between the two? Is it not just reading from stdin?
Short answer: because with wc < myfile.txt, the wc program has direct access to the file, and can do things besides reading from it. Specifically, it can get the file's size (and it bases the output column width on that). With cat myfile.txt | wc, it can't do that, so it uses wide columns to make sure there's enough room.
Long answer: wc tries to provide nicely columnated output. In order to estimate how wide its columns need to be, the GNU version of wc runs stat() (or fstat()) on all of its input files (before actually reading them to get the detailed counts), and uses their sizes to determine how large the word/line/character counts might get, and hence how wide it might need to make the columns to have room for all those digits.
If it can't get any of the input files' sizes (e.g. because they're not plain files, but pipes or something similar), it "assumes the worst", and forces a minimum width of 7 digits. So anytime any of the inputs are pipes or anything like that, you're going to get at-least-7-character-wide columns.
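To make the width estimate concrete, here is a rough Python sketch of the logic described above. It is not the GNU source: the names column_width, demo and MIN_PIPE_WIDTH are made up for illustration, and summing the sizes of all regular files is my simplified reading of "uses their sizes". The key point is that fstat() works on any open descriptor, including stdin, so the program can tell a redirected file from a pipe without reading a byte.

```python
import os
import stat
import tempfile

MIN_PIPE_WIDTH = 7  # hypothetical name for the 7-digit floor described above


def column_width(fds):
    """Estimate a wc-style column width for a list of open file descriptors."""
    total_size = 0
    min_width = 1
    for fd in fds:
        st = os.fstat(fd)  # metadata only; nothing is read from the input
        if stat.S_ISREG(st.st_mode):
            # Regular file: no count (lines, words, bytes) can exceed its size.
            total_size += st.st_size
        else:
            # Pipe, terminal, etc.: size unknown, so assume the worst.
            min_width = MIN_PIPE_WIDTH
    # Number of decimal digits needed for the largest possible count.
    return max(len(str(total_size)), min_width)


def demo():
    """Return (width for a 34-byte regular file, width for a pipe)."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * 34)
        f.flush()
        file_width = column_width([f.fileno()])
    read_end, write_end = os.pipe()
    try:
        pipe_width = column_width([read_end])
    finally:
        os.close(read_end)
        os.close(write_end)
    return file_width, pipe_width


if __name__ == "__main__":
    print(demo())  # (2, 7)
```

A 34-byte regular file needs only 2 digits per column, while a pipe falls back to the 7-digit minimum, which matches the difference between the two commands in the question.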
Some examples: