Regular expression: Combining two regex inside replaceAll() function in groovy


I have some test data which looks like below:

test_ids = [test_data_123, test_123, test_data_456]

I'm trying to run them through the replaceAll() method in Groovy so that every element in the array above is converted to a string:

${__groovy(vars.get('test_ids').replaceAll(/\b((test_\d+)||(test_data_\d+))\b/) { match\, number -> "\"$number\"" },)}

This is the regex that I'm using:

(test_\d+)||(test_data\d+)

This doesn't seem to work. Can anyone provide a regex so that it chooses all of the elements?

Answer by chubbsondubs:

So the issue is that a List (i.e. [ "something here" ]) doesn't have a regex-based replaceAll. That replaceAll is a method on String (i.e. "something here"). So you'll need to loop over your List and invoke replaceAll on each element.

But here is what I don't understand about your question: what is the point of this code? It's unclear why you want to do this. If we understood that, we could probably get you to what you are trying to accomplish in a more straightforward, easier-to-understand way. For now, I'm just going to try to accomplish what I think you were after.

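test_ids = ['test_data_123', 'test_123', 'test_data_456', 'fjkfkd']

// Replace each matching element with the (quoted) number of the group it matched.
println( test_ids.collect {
    // (?: ... ) makes both \b anchors apply to both alternatives
    it.replaceAll(/\b(?:(test_\d+)|(test_data_\d+))\b/) { match ->
       // match = [whole match, group 1, group 2]; groups that did not match are null
       int number = match[1..-1].findIndexOf { it } + 1
       "\"$number\""
    }
})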

This prints the following:

["2", "1", "2", fjkfkd]

So each element is replaced by the number of the capture group it matched. For the input you provided, the 1st item (test_data_123) matches the 2nd group, so it becomes "2"; the 2nd item (test_123) matches the 1st group, so it becomes "1"; the 3rd (test_data_456) is just like the first, so it is "2" again. The last item doesn't match anything, so nothing is replaced and replaceAll just returns the String that it was given. That case wasn't in your sample input, but what should happen if nothing is matched? Something to consider in your algorithm.

The 1st change was to remove number as a parameter of the closure. With a single-parameter closure, the only thing passed in is the list of match groups (i.e. match): the whole match at index 0, followed by one entry per capture group. That would be something like this for the 1st item: [ "test_data_123", null, "test_data_123" ], and ["test_123", "test_123", null] for the 2nd. match[1..-1] skips the 0th element (the whole match), and findIndexOf { it } returns the index of the first non-null entry in what remains; adding 1 turns that into the group number.
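To see what the closure actually receives, here is a minimal sketch that records the match list for two of the sample inputs instead of replacing anything. The variable names (pattern, received, groupNumbers) are made up for illustration, and the pattern wraps the alternation in a non-capturing group, which does not change the group numbering.

```groovy
def pattern = /\b(?:(test_\d+)|(test_data_\d+))\b/

// Record what the closure is given for each input; the text itself is left unchanged.
def received = ['test_data_123', 'test_123'].collect { id ->
    def groups = null
    id.replaceAll(pattern) { match ->
        groups = match   // [whole match, group 1, group 2]
        match[0]         // return the matched text so nothing is altered
    }
    groups
}

// Group number = position of the first non-null capture group, counted from 1.
def groupNumbers = received.collect { it[1..-1].findIndexOf { g -> g } + 1 }

println received      // [[test_data_123, null, test_data_123], [test_123, test_123, null]]
println groupNumbers  // [2, 1]
```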

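The other change was to remove || from the regular expression and replace it with just |. || is not an "or" operator in regular expressions: it is read as two | with an empty alternative between them, and that empty alternative can match the empty string. If you want pattern A or pattern B it's A|B.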
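The difference between the two spellings can be checked directly with the find operator (=~). This is only a sketch: broken is the pattern from the question, fixed is the same pattern with a single |, and both variable names are made up here.

```groovy
def broken = /\b((test_\d+)||(test_data_\d+))\b/   // pattern from the question
def fixed  = /\b((test_\d+)|(test_data_\d+))\b/    // same pattern with a single |

// [0] is the first match, [0][0] is the text of that whole match.
def brokenFirst = ('test_data_123' =~ broken)[0][0]
def fixedFirst  = ('test_data_123' =~ fixed)[0][0]

println "broken matched: '${brokenFirst}'"   // broken matched: ''
println "fixed matched:  '${fixedFirst}'"    // fixed matched:  'test_data_123'
```

With ||, the empty alternative is tried before test_data_\d+ and succeeds immediately at the start of the string, so the first match is empty and the third alternative is never reached at that position.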

A couple of things to ask. Is replaceAll the right tool for this? I assume you want to know which regular expression each string matches. But because you're using replaceAll, you have to return a String, when all you really want to know is whether it was pattern 1 or pattern 2. So I'd probably do something like this:

println(test_ids.collect { s ->
   if( s.matches(~/\btest_\d+\b/) ) return 1
   if( s.matches(~/\btest_data_\d+\b/) ) return 2
   return -1
})

Then it's easy to tell which pattern each string matched, because you're working with plain integers (-1 meaning no match) instead of quoted strings.
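As one possible follow-up, assuming you want the ids bucketed by which pattern they matched, the integer result can be used as a grouping key. The classify closure and the byPattern name are hypothetical, and the exact-match operator (==~) stands in for the matches calls above.

```groovy
def test_ids = ['test_data_123', 'test_123', 'test_data_456', 'fjkfkd']

// Hypothetical helper: 1 for test_<digits>, 2 for test_data_<digits>, -1 otherwise.
def classify = { String s ->
    if (s ==~ /test_\d+/) return 1
    if (s ==~ /test_data_\d+/) return 2
    return -1
}

// groupBy collects the ids under the integer each one was classified as.
def byPattern = test_ids.groupBy(classify)

println byPattern[1]    // [test_123]
println byPattern[2]    // [test_data_123, test_data_456]
println byPattern[-1]   // [fjkfkd]
```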