Counting lines of code per author in a git repository


So I'm in a team with a few other programmers and need to get a lines-of-code count per author in our git repository. That doesn't just mean lines modified by each author, because that would include blank and comment lines. Ideally, I would be able to make a new branch containing only the commits of a specific author (--author="BtheDestroyer" for myself) and then use cloc to get the comment line and code line counts separately. I've tried using

git log --author="BtheDestroyer" --format=%H > mycommits
git checkout --orphan mycommits
tac mycommits| while read sha; do git cherry-pick --no-commit ${sha}; done

During the last line, however, I get a ton of the following errors:

filepath: unmerged (commit-id-1)
filepath: unmerged (commit-id-2)
error: your index file is unmerged.
fatal: cherry-pick failed

I'm also not sure if that will end up fast-forwarding through other commits in the process. Any ideas?


Best answer

Answering my own question:

I ended up using git blame and a Bash script that loops through the files in the source folder, filters the blame output down to each author's lines with grep, strips the blame prefix with cut to get plain code back, collects the result in temporary files, and then runs cloc on those files.
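To make the filter-and-strip idea concrete, here is a minimal sketch that runs on a made-up sample of default git blame output rather than a real repository. The hashes, dates, the second author name, and the helper names sample_blame and lines_blamed_on are all invented for illustration, and it strips the prefix with sed instead of a fixed column, which is a swapped-in variation on the cut step:

```shell
#!/bin/bash
# Made-up sample in the default `git blame` layout:
#   <hash> (<author> <date> <time> <tz> <line>) <code>
sample_blame() {
    printf '%s\n' \
        'a1b2c3d4 (BtheDestroyer 2017-03-01 12:00:00 -0800  1) int main() {' \
        'e5f6a7b8 (SomeoneElse   2017-03-02 09:30:00 -0800  2)     return 0;' \
        'a1b2c3d4 (BtheDestroyer 2017-03-01 12:00:00 -0800  3) }'
}

# Keep only the lines that mention the given name, then delete everything
# up to and including the first ") ", which ends the blame prefix.
lines_blamed_on() {
    grep -F -- "$1" | sed 's/^[^)]*) //'
}

# Prints the two lines blamed on BtheDestroyer: "int main() {" and "}"
sample_blame | lines_blamed_on 'BtheDestroyer'
```

What comes out is plain source text again, which is what cloc needs in order to tell code, comment, and blank lines apart.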

Here's my shell script for anyone wanting to do something similar (I have it in ./Blame/ so change SOURCE appropriately!):

#!/bin/bash
# Names of the users to check
#   If you have multiple usernames, separate them with a space
#   The full name is not required, just enough to not be ambiguous
USERS="YOUR USERNAMES HERE"
# Directory containing the source files
SOURCE=../source

# The loop variable is "name" rather than USER so that the USER
# environment variable of the shell is not overwritten.
for name in $USERS
do
    # Clear blame files (truncate without writing a blank line)
    : > "$name-Blame.h"
    : > "$name-Blame.cpp"
    : > "$name-Blame.sh"
    echo "Finding blame for $name..."
    # NOTE: "cut -c 70-" below assumes the blame prefix (hash, author, date,
    # line number) is 69 characters wide. Check it against your own
    # git blame output and adjust the column if needed.
    # C++ files
    echo "  Finding blame for C++ files..."
    for f in "$SOURCE"/*.cpp
    do
        [ -e "$f" ] || continue   # glob matched nothing
        git blame "$f" | grep -F -- "$name" | cut -c 70- >> "$name-Blame.cpp"
    done
    # Header files
    echo "  Finding blame for header files..."
    for f in "$SOURCE"/*.h
    do
        [ -e "$f" ] || continue
        git blame "$f" | grep -F -- "$name" | cut -c 70- >> "$name-Blame.h"
    done
    # Shell script files
    echo "  Finding blame for shell script files..."
    for f in ./GetUSERBlame.sh
    do
        [ -e "$f" ] || continue
        git blame "$f" | grep -F -- "$name" | cut -c 70- >> "$name-Blame.sh"
    done
done

for name in $USERS
do
    # cloc
    echo "Blame for all users found! Cloc-ing $name..."
    cloc --quiet "$name"-Blame.*
    # This line is for cleaning up the temporary files.
    # If you want to save them for future reference, comment this out.
    rm -f -- "$name"-Blame.*
done
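
One caveat with the script above: grep matches the name anywhere on the blame line, including inside the code itself, and cut relies on a fixed prefix width. A possible alternative, sketched here under the assumption that your git supports git blame --line-porcelain, is to match on the "author" field only. The helper name blamed_lines and the file main.cpp in the usage comment are hypothetical:

```shell
#!/bin/bash
# Read `git blame --line-porcelain` output on stdin and print only the source
# lines whose "author" header contains the given name. In that format every
# source line is preceded by header lines ("author ...", "author-mail ...",
# and so on) and is itself prefixed with a single TAB.
blamed_lines() {
    awk -v who="$1" '
        /^author / { keep = (index(substr($0, 8), who) > 0); next }
        /^\t/      { if (keep) print substr($0, 2) }
    '
}

# Hypothetical usage inside the loops of the script above:
#   git blame --line-porcelain ../source/main.cpp | blamed_lines "BtheDestroyer" >> BtheDestroyer-Blame.cpp
```

Because the name is compared only against the author header, a line written by someone else that merely mentions the name in a comment is not counted.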

Here's my one-liner:

function gitfilecontributors() { local perfile="false" ; if [[ $1 = "-f" ]]; then perfile="true" ; shift ; fi ; if [[ $# -eq 0 ]]; then echo "no files given!" >&2 ; return 1 ; else local f ; { for f in "$@"; do echo "$f" ; git blame --show-email "$f" | sed -nE 's/^[^ ]* *.<([^>]*)>.*$/: \1/p' | sort | uniq -c | sort -r -nk1 ; done } | if [[ "$perfile" = "true" ]]; then tee /tmp/gitblamestats.txt ; else tee /tmp/gitblamestats.txt >/dev/null ; fi ; echo ; echo "total:" ; awk -v FS=' *: *' '/^ *[0-9]/{sums[$2] += $1} END { for (i in sums) printf("%7s : %s\n", sums[i], i)}' /tmp/gitblamestats.txt | sort -r -nk1 ; fi ; }

or with line breaks:

gitfilecontributors ()
{
    local perfile="false";
    if [[ $1 = "-f" ]]; then
        perfile="true";
        shift;
    fi;
    if [[ $# -eq 0 ]]; then
        echo "no files given!" 1>&2;
        return 1;
    else
        local f;
        {
            for f in "$@";
            do
                echo "$f";
                git blame --show-email "$f" | sed -nE 's/^[^ ]* *.<([^>]*)>.*$/: \1/p' | sort | uniq -c | sort -r -nk1;
            done
        } | if [[ "$perfile" = "true" ]]; then
            tee /tmp/gitblamestats.txt;
        else
            tee /tmp/gitblamestats.txt > /dev/null;
        fi;
        echo;
        echo "total:";
        awk -v FS=' *: *' '/^ *[0-9]/{sums[$2] += $1} END { for (i in sums) printf("%7s : %s\n", sums[i], i)}' /tmp/gitblamestats.txt | sort -r -nk1;
    fi
}
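
The totalling step at the end of the function is the awk part. As a rough sketch of what it does in isolation, the snippet below feeds it made-up per-file counts on stdin instead of the temporary file; the function name sum_per_author and the example.com addresses are invented for illustration:

```shell
#!/bin/bash
# Sum "<count> : <email>" lines per email and print the totals, largest first.
# Lines that do not start with a number (such as file names) are ignored.
sum_per_author() {
    awk -F ' *: *' '
        /^ *[0-9]/ { sums[$2] += $1 }
        END { for (a in sums) printf "%7d : %s\n", sums[a], a }
    ' | sort -rn
}

# Made-up per-file counts in the same layout the function produces with -f
printf '%s\n' \
    'a.md' \
    '     80 : alice@example.com' \
    '     29 : bob@example.com' \
    'b.md' \
    '     59 : alice@example.com' | sum_per_author
# prints 139 for alice@example.com, then 29 for bob@example.com
```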

Usage: pass it the files from the folder(s) of your choice.

Use the -f option to show counts per file; otherwise only the totals are printed:

$ gitfilecontributors    $(fd --type f '.*' source)
total:
    139 : [email protected]
     29 : [email protected]
      9 : [email protected]
$ gitfilecontributors -f $(fd --type f '.*' source)
source/040_InitialSetup.md
     80 : [email protected]
     29 : [email protected]
      6 : [email protected]
README.md
     59 : [email protected]
      5 : [email protected]
      3 : [email protected]

total:
    139 : [email protected]
     29 : [email protected]
      9 : [email protected]
      5 : [email protected]