Scan Git Repository for Statistics


How can I get statistics about my Git repository?

I am currently hosting the Git repository on Bitbucket and want to find the following details:

  • Total number of commits
  • Programming languages used
  • Total lines of code for each programming language

Do you think this is achievable, or am I asking for too much? There may be a clever tool that I am not aware of.

I am also using SourceTree to push and pull code, if that helps.

Thank you in advance.


There are 2 best solutions below


Number of commits

I would recommend one of these two:

  • git rev-list --count origin/master for just the master branch
  • git rev-list --all --count for all refs (branches, tags and remote-tracking branches), with each commit counted once

As somebody mentioned, git log --oneline | wc -l also gives you the number of commits, but only for the current branch. You can't run it once per branch and add up the results, because commits shared between branches would be counted multiple times. To get a total you have to pass all heads (or perhaps all refs) to a single log invocation so that every commit is listed once.
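To make the double-counting point concrete, here is a minimal sketch in POSIX shell. The function names are made up for illustration; the Git commands are the ones mentioned above, plus git for-each-ref to list the local branches.

```shell
#!/bin/sh

# Count the commits reachable from one ref (defaults to HEAD).
count_commits_on() {
    git rev-list --count "${1:-HEAD}"
}

# Count every commit reachable from any ref, each commit once.
count_commits_all() {
    git rev-list --all --count
}

# Naive approach, for comparison only: add up the per-branch counts.
# A commit that is reachable from several branches is counted
# once per branch, so this usually overstates the total.
count_commits_naive_sum() {
    total=0
    for ref in $(git for-each-ref --format='%(refname)' refs/heads); do
        total=$((total + $(git rev-list --count "$ref")))
    done
    echo "$total"
}
```

In a repository with two commits on the main branch and one extra commit on a feature branch, count_commits_all reports 3 while count_commits_naive_sum reports 5, because the two shared commits are counted on both branches.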

Languages and lines of code

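Use the cloc tool for both: it detects the language of each file and reports the number of files, blank lines, comment lines and code lines per language.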
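For reference, a typical invocation could look like the sketch below. It assumes cloc is installed and on your PATH; the wrapper function name is hypothetical. The --vcs=git option asks cloc to take its file list from git ls-files, so only tracked files are counted.

```shell
#!/bin/sh

# Print a per-language summary for the repository in the given
# directory (defaults to the current directory).
count_lines_by_language() {
    if ! command -v cloc >/dev/null 2>&1; then
        echo "cloc is not installed" >&2
        return 127
    fi
    # Run from inside the repository so that git ls-files can see it.
    (cd "${1:-.}" && cloc --vcs=git)
}
```

Calling count_lines_by_language with the path to your clone should print one row per detected language.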


Total number of commits

This one is easy: git rev-list --count master. You can count the commits on any other branch the same way.

Number of Programming Languages

You can't determine the number exactly, but you can estimate it by grouping and counting files by extension. Keep in mind that extensions are ambiguous: *.h files, for example, are used by C, C++ and Objective-C. A quick search turned up this one-liner:

find . -type f -printf "%f\n" | grep -io '\.[^.]*$' | sort | uniq -c | sort -rn
24 .kt
20 .java
12 .gradle
 9 .sample
 8 .properties
 7 .xml
 7 .jar
 6 .bat
 4 .yml
 3 .sql
 3 .md
 3 .gitignore
 1 .yaml
 1 .xz
 1 .scala
 1 .PKGINFO
 1 .pack
 1 .MTREE
 1 .idx
 1 .go

As you can see, this repository definitely contains Kotlin, Java and Scala, plus a single Go file. The remaining files are not source code.
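Note that find walks the whole directory, so the listing above probably includes files from the .git directory itself (the .sample, .pack and .idx entries look like Git internals). A minimal sketch that only looks at tracked files, assuming it is run inside the repository, could be the following; the function name is hypothetical.

```shell
#!/bin/sh

# Count tracked files per extension, most common first.
# Files without a dot in their name (e.g. Makefile) are skipped.
count_extensions() {
    git ls-files |
        sed -n 's/.*\(\.[^./]*\)$/\1/p' |
        sort | uniq -c | sort -rn
}
```

The output has the same shape as the find version: a count followed by the extension.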

Lines of code per Programming Language

Extending the previous one-liner:

find . -type f -printf "%f\n" | grep -io '\.[^.]*$' | sort | uniq | xargs printf "*%s\n" | xargs -I{} sh -c 'echo "{}: $(find . -name "{}" -print0 | xargs -0 cat | wc -l)"'
*.yml: 64
*.yaml: 44
*.xz: 1568
*.xml: 121
*.sql: 38
*.scala: 36
*.sample: 496
*.properties: 43
*.PKGINFO: 23
*.pack: 14416
*.MTREE: 3
*.md: 12
*.kt: 388
*.java: 489
*.jar: 16064
*.idx: 34
*.gradle: 126
*.go: 9
*.gitignore: 11
*.bat: 540

That said, I don't recommend relying on Bash one-liners like these, as they are very hard to read.
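If you still want to avoid extra tools, a more readable take on the same idea might look like the sketch below. It is only an approximation under a few assumptions: it is run from the repository root, it counts tracked files only (via git ls-files), and it counts raw newline characters, so binary files such as *.jar are included and comments or blank lines are not filtered out. The function name is made up for illustration.

```shell
#!/bin/sh

# Print "*.ext: N" for every extension among the tracked files,
# where N is the total number of newline characters in those files.
lines_per_extension() {
    git ls-files | while IFS= read -r file; do
        [ -f "$file" ] || continue          # tracked but deleted locally
        case "$file" in
            *.*) ext=".${file##*.}" ;;      # text after the last dot
            *)   continue ;;                # no extension at all
        esac
        case "$ext" in
            */*) continue ;;                # the dot was in a directory name
        esac
        printf '%s\t%s\n' "$ext" "$(wc -l < "$file")"
    done |
        awk -F '\t' '{ total[$1] += $2 }
                     END { for (e in total) printf "*%s: %d\n", e, total[e] }' |
        sort
}
```

For real per-language figures, cloc remains the better choice, since it knows about comment syntax and skips binary files.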