def 24  word
abc 2   word
I write the above data to a file called tmp
and sort by the first column. As expected, the second line becomes the first:
/tmp/sort: cat tmp
def 24  word
abc 2   word
/tmp/sort: sort -k1,1 tmp
abc 2   word
def 24  word
But when I sort using the third column as the primary key and the first column as the secondary key, the lines are not reordered:
/tmp/sort: sort -k3,3 -k1,1 tmp
def 24  word
abc 2   word
# ^^ WRONG
I suspect this is because there are two spaces between the second and third columns of the first line, but three spaces on the second line. Indeed, once I replace one of the three spaces on the second line with a non-space character, the sort works correctly:
/tmp/sort: cat tmp
def 24  word
abc 2x  word
/tmp/sort: sort -k3,3 -k1,1 tmp
abc 2x  word
def 24  word
# ^^ CORRECT
Does anyone have an explanation for this behavior?
EDIT: Even more bizarrely, the sort works correctly when I specify -n (numeric) for the first column, even though that column is not actually numeric:
/tmp/sort: cat tmp
def 24  word
abc 2   word
/tmp/sort: sort -k3,3 -k1,1 tmp
def 24  word
abc 2   word
# ^^ WRONG
/tmp/sort: sort -k3,3 -nk1,1 tmp
abc 2   word
def 24  word
# ^^ CORRECT
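One plausible reading of the -n result: in -nk1,1 the -n is bundled ahead of -k, so it acts as a global option rather than as a modifier on the first key, and keys with no modifiers of their own inherit it. Both keys would then be compared numerically, the non-numeric keys all compare as equal, and sort falls back to comparing whole lines. The sketch below recreates the file with the spacing described in the question and pins LC_ALL=C, an assumption made only so the result is reproducible; it checks that the bundled form and the explicit global form give the same ordering.

```shell
# Recreate the sample file: two spaces before "word" on line 1, three on line 2.
tmp=$(mktemp)
printf 'def 24  word\nabc 2   word\n' > "$tmp"

# Bundled form from the EDIT: parsed as "-n" followed by "-k1,1".
bundled=$(LC_ALL=C sort -k3,3 -nk1,1 "$tmp")

# Explicit global form: -n applies to every key that has no modifier of its own.
global=$(LC_ALL=C sort -n -k3,3 -k1,1 "$tmp")

printf '%s\n' "$bundled"
[ "$bundled" = "$global" ] && echo "same ordering"

rm -f "$tmp"
```

To make only the first key numeric, the modifier goes after the key position, as in -k1,1n.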
I've figured out the solution: use -b, which the man page describes as ignoring leading blanks.
I suspect that without -b, sort was treating the run of spaces before the third column as part of the third field. Hence, the third fields for the two lines were "  word" and "   word", which are not the same. With -b, the third field is "word" on both lines, and so the first field is the determinant, as desired.
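A minimal sketch of the fix, assuming the file has the spacing described in the question (two spaces before word on the first line, three on the second). LC_ALL=C is pinned here only to make the run reproducible; the unexpected ordering without -b depends on the locale's collation rules, so it may not show up in every environment. The per-key form -k3b,3 is shown as an alternative to the global -b.

```shell
# Recreate the sample file with the uneven runs of spaces.
tmp=$(mktemp)
printf 'def 24  word\nabc 2   word\n' > "$tmp"

# Global -b: leading blanks are skipped when locating each key,
# so key 3 is "word" on both lines and key 1 breaks the tie.
with_b=$(LC_ALL=C sort -b -k3,3 -k1,1 "$tmp")
printf '%s\n' "$with_b"

# Per-key form: the b modifier on the start position of key 3 only.
per_key=$(LC_ALL=C sort -k3b,3 -k1,1 "$tmp")
printf '%s\n' "$per_key"

rm -f "$tmp"
```

With GNU sort, adding --debug to either command annotates which characters of each line were used as the key, which is a quick way to confirm whether the leading spaces are being included.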