Use find to get all folders that don't have a .git subfolder


How can I use find to list all folders that do not contain a .git folder?

Given this structure::

$ tree -a -d -L 2
.
├── a
│   └── .git
├── b
│   ├── b1
│   └── b2
├── c
└── d
    └── .git
        ├── lkdj
        └── qsdqdf

This::

$ find . -name ".git"  -prune -o -type d -print
.
./a
./b
./b/b1
./b/b2
./c
./d
$

lists every folder except the .git folders themselves.

I would like to get this::

$ find . ...
.
./b
./b/b1
./b/b2
./c
$

There are 3 best solutions below

0
On BEST ANSWER

It's inefficient (it runs one subprocess per directory), but the following does the job with GNU or modern BSD find:

find . -type d -exec test -d '{}/.git' ';' -prune -o -type d -print

If you can only rely on what the POSIX standard guarantees, note that {} is only required to be substituted when it is an argument on its own, not when it is a substring of a larger argument. In that case, have a shell run the test, at the cost of even more overhead:

find . -type d -exec sh -c 'test -d "$1/.git"' _ '{}' ';' -prune -o -type d -print

This works by using -exec as a predicate: -exec [...] \; is true when the command exits with status 0, so it can run a test that find has no built-in support for.
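As a minimal sketch, the POSIX-style command above can be wrapped in a shell function. The name list_non_git_dirs is hypothetical and used only for illustration; the find expression itself is the one from this answer.

```shell
#!/bin/sh
# list_non_git_dirs is a hypothetical helper name, not part of find.
# Prints every directory under "$1" (default: .) that does not contain
# a .git subdirectory, without descending into those that do.
list_non_git_dirs() {
    # -exec ... ';' is true when the command exits with status 0.
    # When "$1/.git" is a directory, -prune stops the descent and the
    # left-hand side of -o succeeds, so that path is never printed.
    find "${1:-.}" -type d \
        -exec sh -c 'test -d "$1/.git"' _ '{}' ';' -prune \
        -o -type d -print
}
```

Running list_non_git_dirs . from the top of the tree shown in the question should list ., ./b, ./b/b1, ./b/b2 and ./c.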

Note the use of the less efficient -exec [...] {} [...] \; rather than the more efficient -exec [...] {} +. Because the latter passes multiple filenames to each invocation, it has no way to report individual per-filename results, so it always evaluates as true.
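The difference between the two -exec forms can be observed with a command that always fails. This is only a sketch; the function names are hypothetical, and false stands in for any failing test.

```shell
#!/bin/sh
# Hypothetical helper names, used only for this demonstration.
# Both functions look at "." alone (-prune stops any descent).

# With ';' the exit status of the command decides whether -exec is true.
# `false` always fails, so -print is never reached: no output.
exec_semicolon_demo() {
    find . -prune -exec false ';' -print
}

# With '+' the primary always evaluates as true, so -print still runs
# and "." is printed even though `false` failed.
# Note: find itself may still exit with a non-zero status here.
exec_plus_demo() {
    find . -prune -exec false {} + -print
}
```

This is why the + form cannot be used as a per-directory predicate in front of -prune.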

0
On

If you only want the top-level directories, add the -maxdepth 1 option. Put it before the tests, because GNU find treats it as a global option and warns when it appears after them:

$ find . -maxdepth 1 -type d -exec test -d '{}/.git' ';' -prune -o -type d -print
.
./b
./c
$
1
On

If you don't mind using a temporary file (and a shell with process substitution, such as bash), then:

find . -type d -print > all_dirs
grep -Fvxf <(grep '/\.git$' all_dirs | sed 's#/\.git$##') all_dirs | grep -vE '/\.git$|/\.git/'
rm all_dirs
  • The first step writes all directory paths into the all_dirs file.
  • The second step filters out the directories that have a .git subdirectory, as well as the .git subdirectories themselves and anything below them. The -x option is necessary because only lines that match in their entirety should be eliminated.

This is a little more efficient than Charles' answer because it does not run a subprocess per directory. However, it gives wrong output if any directory name contains a newline character.
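The same two steps can be sketched as a plain sh function that does not need process substitution. The name list_non_git_dirs_grep is hypothetical, and mktemp is assumed to be available; the newline limitation described above still applies.

```shell
#!/bin/sh
# list_non_git_dirs_grep is a hypothetical name; mktemp is assumed to exist.
list_non_git_dirs_grep() {
    all_dirs=$(mktemp) || return 1
    git_parents=$(mktemp) || { rm -f "$all_dirs"; return 1; }

    # Step 1: record every directory path, one per line.
    find "${1:-.}" -type d -print > "$all_dirs"

    # Collect the parents of .git directories by stripping the
    # trailing "/.git" from the lines that end with it.
    sed -n 's#/\.git$##p' "$all_dirs" > "$git_parents"

    # Step 2: drop lines that exactly match one of those parents, then
    # drop the .git directories themselves and anything beneath them.
    grep -Fvxf "$git_parents" "$all_dirs" | grep -vE '/\.git$|/\.git/'

    rm -f "$all_dirs" "$git_parents"
}
```

Unlike the -prune approach, this filter only removes the directory that directly contains .git; other subdirectories of that directory would still be listed.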