How can I capture certain data using regex if it is dependent on another field?

I need help writing a regex for the following log line:

URLReputation: Risk unknown, URL: http://facebook.com

I wrote the following regex:

URLReputation\:\s*(.*?),\s*URL\:\s*(.*)

This works when the URL is present. But when the URL part is missing, the whole pattern fails to match, so URLReputation is not captured either.
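To make the problem concrete, here is a minimal Python sketch that reproduces it. The second input line is an assumption about what a log line without a URL looks like (the original post does not show one), and the variable names are only illustrative.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"URLReputation\:\s*(.*?),\s*URL\:\s*(.*)")

with_url = "URLReputation: Risk unknown, URL: http://facebook.com"
# Assumed shape of a log line that has no URL part.
without_url = "URLReputation: Risk unknown"

# Both groups are captured when the URL is present.
print(original.search(with_url).groups())

# The ", URL:" part is mandatory, so the whole match fails here
# and nothing is captured at all.
print(original.search(without_url))
```

The second call prints `None`: because `,\s*URL\:` is a required part of the pattern, a line without it cannot match at all.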

Please help.

Regards,

Mitesh Agrawal

There are 2 best solutions below

The fourth bird (best answer)

You could turn the non-greedy .*? into a negated character class [^,]+, which matches one or more characters other than a comma. Then make the URL part optional by wrapping it in an optional non-capturing group (?:...)?

You capture the URL value with .*, but that can also match an empty string.

You could make the pattern more specific by requiring at least one non-whitespace character with \S+, or by specifying how the URL starts, for example https?://\S+

URLReputation:\s*([^,]+)(?:,\s*URL:\s*(\S+))?
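As a rough illustration, the pattern above can be tried in Python like this. The helper name `parse_reputation` and the URL-less sample line are assumptions made for the sketch, not part of the original answer.

```python
import re

# Pattern from the answer: the reputation is "anything but a comma",
# and the URL part sits in an optional non-capturing group.
pattern = re.compile(r"URLReputation:\s*([^,]+)(?:,\s*URL:\s*(\S+))?")


def parse_reputation(line):
    """Return (reputation, url); url is None when the URL part is absent."""
    m = pattern.search(line)
    if m is None:
        return None
    return m.group(1), m.group(2)


print(parse_reputation("URLReputation: Risk unknown, URL: http://facebook.com"))
# Assumed shape of a log line that has no URL part.
print(parse_reputation("URLReputation: Risk unknown"))
```

When the optional group does not take part in the match, Python reports its capture as `None`, so the caller can tell a missing URL apart from an empty one. Note that `[^,]+` also matches newlines, so it is safest to feed the pattern one line at a time.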

Cary Swoveland

Assuming that, when the URL is missing, the string ends right where the comma would otherwise be, you can simply put the comma and what follows in an optional non-capturing group and add an end-of-line anchor:

/URLReputation: +(.*?)(?:, +URL:\ +(.*))?$/

Mainly to improve readability, I changed each \s to a space as it appears that spaces are the only whitespace characters you wish to match.
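Here is a small Python sketch of this approach, with the pattern copied from above minus the surrounding slashes. It also shows why the end-of-line anchor matters when the first group is lazy. The variable names and the URL-less sample line are assumptions for illustration.

```python
import re

# Pattern from the answer, with the end-of-line anchor.
anchored = re.compile(r"URLReputation: +(.*?)(?:, +URL:\ +(.*))?$")
# Same pattern without the anchor, for comparison only.
unanchored = re.compile(r"URLReputation: +(.*?)(?:, +URL:\ +(.*))?")

with_url = "URLReputation: Risk unknown, URL: http://facebook.com"
# Assumed shape of a log line that has no URL part.
without_url = "URLReputation: Risk unknown"

print(anchored.search(with_url).groups())
print(anchored.search(without_url).groups())

# Without "$" the lazy (.*?) is allowed to stop immediately, so the
# first group comes back empty and the URL group does not participate.
print(unanchored.search(with_url).groups())
```

The anchor forces the lazy `(.*?)` to keep expanding until either the optional URL part or the end of the line follows, which is what makes the first capture useful.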