Remove Multiline Comments only from Top of Every Java File


We once used the Borland StarTeam tool (a revision/source control system, like Mercurial) for our code management. Whenever we committed code, the tool itself put a description of the commit at the top of the file. So now we have many classes where such a commit log sits at the top of each file. For example:

/*This is some developer comment at the top of the file*/

/*
 * $Log:
 *  1   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid did something
 *  2   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid again did 
 *                                             something
 * $
 */

public class ABC
  /*This is just a variable*/
  int a = 0;
  public int method1()

Now I am planning to remove all of this StarTeam-generated text from the top of each file. But I don't want to remove any other comment from any file, or any copyright comment at the top. I only want to remove the chunk that starts with $Log and ends with $. I have looked at other questions related to this problem, but this is a multiline comment. Would a regular expression be a good option for this?

Is there any utility I can use rather than writing my own code to remove this?

If a regular expression is the only quick solution, then I am stuck there.

Any help would be appreciated.


There is 1 best solution below


If the format is exactly as you show, you could build a fragile little state machine that looks like this.

Start with an enum to track the state:

enum ParseState
{
    Normal,
    MayBeInMultiLineComment,    //occurs after initial /*
    InMultilineComment,         //inside a $Log comment
}

and then add this code:

//needs: using System; using System.Diagnostics; using System.Text;
public static void CommentStripper()
{
    var text = @"/*This is some developer comment at the top of the file*/
/*
 * $Log:
 *  1   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid did something
 *  2   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid again did 
 *                                             something
 * $
 */

/*
    This is not a log entry
*/

public class ABC
  /*This is just a variable*/
  int a = 0;
  public int method1()
";

    //this next line could be File.ReadAllLines to get the text from a file
    //or you could read from a stream, line by line.
    //split on both Windows and Unix line endings
    var lines = text.Split(new[] {"\r\n", "\n"}, StringSplitOptions.None);

    var buffer = new StringBuilder();
    ParseState parseState = ParseState.Normal;
    string lastLine = string.Empty;

    foreach (var line in lines)
    {
        if (parseState == ParseState.Normal)
        {
            if (line == "/*")
            {
                //hold this line back until we know whether a log entry follows
                lastLine = line;
                parseState = ParseState.MayBeInMultiLineComment;
            }
            else
            {
                buffer.AppendLine(line);
            }
        }
        else if (parseState == ParseState.MayBeInMultiLineComment)
        {
            if (line == " * $Log:")
            {
                parseState = ParseState.InMultilineComment;
            }
            else
            {
                //ordinary comment: emit the held-back /* and the current line
                buffer.AppendLine(lastLine);
                buffer.AppendLine(line);
                parseState = ParseState.Normal;
            }
            lastLine = string.Empty;
        }
        else if (parseState == ParseState.InMultilineComment)
        {
            //skip everything up to and including the closing " */"
            if (line == " */")
            {
                parseState = ParseState.Normal;
            }
        }
    }

    //you could do what you want with the string, I'm just going to write it out to the debugger console.
    Debug.WriteLine(buffer.ToString());
}

Note that lastLine is used because you need to read ahead one line to tell whether a comment is a log entry or not (which is what the MayBeInMultiLineComment state tracks).

The output from that looks like:

/*This is some developer comment at the top of the file*/

/*
    This is not a log entry
*/

public class ABC
  /*This is just a variable*/
  int a = 0;
  public int method1()
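
Since the question asks whether a regular expression would do, here is a minimal Java sketch of that route. It is only an illustration under some assumptions: the log block is a comment whose "/*" sits alone on a line, whose next line is " * $Log:", and which ends with " * $" followed by " */". The class name LogCommentStripper and the method stripLogComment are made up for this example, and the files are assumed to be UTF-8. Only the first such block in each file is removed, so other comments and copyright headers are left alone.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any existing tool.
final class LogCommentStripper {

    private static final Pattern LOG_BLOCK = Pattern.compile(
            "/\\*[ \\t]*\\R"                   // "/*" alone on its line
          + "[ \\t]*\\*[ \\t]*\\$Log:"         // " * $Log:"
          + "(?:(?!\\*/).)*?"                  // log entries; never crosses a "*/"
          + "\\R[ \\t]*\\*[ \\t]*\\$[ \\t]*"   // " * $"
          + "\\R[ \\t]*\\*/[ \\t]*\\R?",       // " */" and the line break after it
            Pattern.DOTALL);

    // Removes the first $Log block comment, if any; everything else is kept.
    static String stripLogComment(String source) {
        return LOG_BLOCK.matcher(source).replaceFirst("");
    }

    // Usage: java LogCommentStripper <source-root>
    // Rewrites .java files in place, so try it on a copy of the tree first.
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> javaFiles;
        try (Stream<Path> paths = Files.walk(root)) {
            javaFiles = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
        }
        for (Path file : javaFiles) {
            // Assumption: files are UTF-8; adjust the charset if yours differ.
            String original = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            String stripped = stripLogComment(original);
            if (!stripped.equals(original)) {
                Files.write(file, stripped.getBytes(StandardCharsets.UTF_8));
                System.out.println("Stripped $Log block from " + file);
            }
        }
    }
}
```

Like the state machine, this is fragile: if StarTeam wrote the block in a slightly different shape in some files, the pattern simply will not match and those files are left untouched, so it is worth checking which files were not rewritten.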