How to combine tr with xargs and cut to squeeze repeats

The top answer to this question demonstrates that cut can be used with tr to cut based on repeated spaces with

< file tr -s ' ' | cut -d ' ' -f 8

I want to get the remotes of several Git repos in a directory and am attempting to extract the remote URL fields from each with the following:

ls | xargs -I{} git -C {} remote -vv | sed -n 'p;n' | tr -s " " | cut -d ' ' -f1

However, this results in (for example) the following output, where I can see that runs of consecutive spaces (Unicode code point 32) are retained:

origin    https://github.com/jik876/hifi-gan.git
upstream  https://github.com/NVIDIA/NeMo.git
origin    https://github.com/NVIDIA/tacotron2.git

(I have also tried using xargs with tr.)

What am I missing here?


Answer by deribaucourt (best answer):

The output of git remote contains tabs instead of spaces. Use expand to replace them with spaces in your script:

ls | xargs -I{} git -C {} remote -vv | sed -n 'p;n' | expand | tr -s " " | cut -d ' ' -f2
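The difference can be reproduced without any repositories. The following is a minimal sketch, assuming a sample line shaped like git remote -v output (name, tab, URL, space, "(fetch)"); the example.com URL and the variable names are placeholders, not taken from the question.

```shell
# A sample line shaped like `git remote -v` output: name, TAB, URL, space, "(fetch)".
# The URL is a placeholder.
line="$(printf 'origin\thttps://example.com/repo.git (fetch)')"

# tr -s ' ' squeezes only spaces, so the tab survives and the first
# space-delimited field still holds both the name and the URL.
without_expand="$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f1)"

# expand turns the tab into spaces first, so squeezing and cutting work.
with_expand="$(printf '%s\n' "$line" | expand | tr -s ' ' | cut -d ' ' -f2)"

# od -c makes the surviving tab visible as \t.
printf '%s\n' "$without_expand" | od -c | head -n 2
printf '%s\n' "$with_expand"
```

Piping suspicious output through od -c is a quick way to check whether a gap is made of spaces or of a tab.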

Or simply let awk split the fields directly: its default field separator is any run of blanks (spaces or tabs), so the tab needs no special handling, as suggested by William Pursell:

ls | xargs -I. git -C . remote -vv | awk '{ print $2; }'
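Note that without the sed -n 'p;n' stage, each remote appears twice, once for fetch and once for push. As a hedged sketch, awk can also do that filtering itself by testing the third field; the sample text and the example.com URL below are made up for illustration.

```shell
# Two lines shaped like `git remote -v` output for a single remote.
# The URL is a placeholder.
sample="$(printf 'origin\thttps://example.com/a.git (fetch)\norigin\thttps://example.com/a.git (push)\n')"

# Default awk splitting treats the tab and the space alike:
# $1 is the name, $2 the URL, $3 the "(fetch)" or "(push)" marker.
# Keeping only the fetch lines prints each URL once.
urls="$(printf '%s\n' "$sample" | awk '$3 == "(fetch)" { print $2 }')"

printf '%s\n' "$urls"
```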

Answer by Léa Gris:

Rather than parsing the output of ls and git remote -vv, use a proper shell loop and Git's own remote subcommands:

for d in ./*/.git/..; do git -C "$d" remote get-url origin; done
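The glob ./*/.git/.. expands only for subdirectories that contain a .git entry, which is what lets the loop skip non-repositories. A small sketch of that behavior, using a throwaway directory tree with hypothetical names (repo-a, repo-b, not-a-repo) and plain mkdir instead of real repositories:

```shell
# Build a scratch tree: two directories with a .git subdirectory, one without.
demo=$(mktemp -d)
mkdir -p "$demo/repo-a/.git" "$demo/repo-b/.git" "$demo/not-a-repo"

# The pattern matches only paths where <dir>/.git exists; the trailing ..
# points back at <dir> itself, so each result is usable with git -C.
matched=$(cd "$demo" && for d in ./*/.git/..; do printf '%s\n' "$d"; done)

printf '%s\n' "$matched"
rm -rf "$demo"
```

Here not-a-repo never shows up in the loop, so no filtering step is needed afterwards.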

As the remote name can be anything, one may opt to use the value from the remote.pushdefault config key:

for d in ./*/.git/..; do
  git -C "$d" remote get-url "$(git -C "$d" config remote.pushdefault)"
done

If that config entry is not set, the first remote can be used instead.

Either with:

for d in ./*/.git/..; do
  git -C "$d" remote get-url "$(git -C "$d" remote | head -n 1)"
done

or with:

for d in ./*/.git/..; do
  git -C "$d" remote -v | {
    read -r _ u _;
    printf %s\\n "$u"
  }
done
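
The variants above can be combined into one loop that prefers remote.pushdefault and falls back to the first remote. This is only a sketch: the helper name repo_remote_url is hypothetical, and it assumes a Git version that provides remote get-url.

```shell
# Hypothetical helper: print one remote URL for the repository in directory $1.
# Prefers the remote named by remote.pushDefault, else the first remote listed.
repo_remote_url() {
  dir=$1
  # `git config` exits non-zero and prints nothing when the key is unset,
  # in which case the first name printed by `git remote` is used instead.
  name=$(git -C "$dir" config remote.pushDefault) ||
    name=$(git -C "$dir" remote | head -n 1)
  # No remotes at all: report failure rather than calling get-url with "".
  [ -n "$name" ] || return 1
  git -C "$dir" remote get-url "$name"
}

for d in ./*/.git/..; do
  # Skip the unexpanded pattern when no subdirectory is a repository.
  [ -d "$d" ] || continue
  repo_remote_url "$d" || printf 'no remote configured in %s\n' "$d" >&2
done
```

Checking for an empty name matters because the plain pushdefault loop passes an empty string to get-url when the key is unset, which makes Git report an error for that repository.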