C# RegEx Replace While Ignoring Matches Found In String Quotes [RESOLVED]


I work on an application where system admins can create custom fields and configure calculations ("calcs") that resolve the values of those fields. Each calc is saved as a string property of the CustomField object. I've been tasked with cleaning up these calcs in bulk, mainly by swapping functions and conjunction operators for their short-circuiting variants. I'd like to use the Regex.Replace() method to do this in a .NET C# solution.

Example calculation string (users type calc in VB.NET syntax):

IIF([FIELD1] = "This and that" AND [FIELD2] = "This or that", "Value 1",
IIF ([FIELD3] = "This and that" OR [FIELD4] = "This or that", "Value 2", 
""))

Requirements of the task:

  1. Replacements are: IIF -> IF, And -> AndAlso, Or -> OrElse
  2. Lookups/matches need to be case-insensitive for all replacements being done, as different admins use different casing when writing syntax.
  3. For IIF function, need to find both "IIF(" (syntax with left parenthesis immediately following the "F"), and then also matches with any variable amount of whitespace between the "F" and left parenthesis. This is based on observations of how I'm seeing calcs written in my client's environment.
  4. For both AND/OR phrases, need to locate instances with a leading/trailing space, and ensure the phrase is not part of a quoted string. So in the above example calc string, the valid conjunction operators would get replaced, but the and/or strings in "This and that" & "This or that" would be ignored and skipped over.

Below is a sample of what I have as a starting point. I believe the IIF pattern is correct: it uses the alternation character (|) to match either form. I'm stuck on how to set up the AND/OR patterns so that matches inside quoted strings are not replaced.

foreach (CustomFieldInfo customField in Session.ConfigurationManager.GetCustomFields(false))
{
    string fieldCalcStr = customField.Calculation.Trim();
    if (string.IsNullOrEmpty(fieldCalcStr)) continue;
    fieldCalcStr = Regex.Replace(fieldCalcStr, @"IIF\(|IIF *\(", "IF(", RegexOptions.IgnoreCase);
    fieldCalcStr = Regex.Replace(fieldCalcStr, " AND ", " AndAlso ", RegexOptions.IgnoreCase);
    fieldCalcStr = Regex.Replace(fieldCalcStr, " OR ", " OrElse ", RegexOptions.IgnoreCase);
    customField.Calculation = fieldCalcStr;
}
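
To make the problem concrete, here is a minimal, self-contained sketch of the same three replacements applied to a whole calc string. The class name NaiveReplaceDemo and the sample input are hypothetical, made up for illustration. It also uses IIF\s*\( in place of the alternation, on the assumption that "zero or more whitespace characters" covers both forms.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NaiveReplaceDemo
{
    // Applies the three replacements to the entire string, quoted text included.
    public static string Rewrite(string calc)
    {
        calc = Regex.Replace(calc, @"IIF\s*\(", "IF(", RegexOptions.IgnoreCase);
        calc = Regex.Replace(calc, " AND ", " AndAlso ", RegexOptions.IgnoreCase);
        calc = Regex.Replace(calc, " OR ", " OrElse ", RegexOptions.IgnoreCase);
        return calc;
    }

    public static void Main()
    {
        // Hypothetical sample calc, not taken from a real system.
        string input = "IIF([FIELD1] = \"This and that\" AND [FIELD2] = \"x\", \"a\", \"b\")";
        Console.WriteLine(Rewrite(input));
        // Prints: IF([FIELD1] = "This AndAlso that" AndAlso [FIELD2] = "x", "a", "b")
    }
}
```

The literal "This and that" is rewritten along with the real operator, which is exactly the behaviour requirement 4 rules out.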

NOTE: I've seen a variety of posts on here from people trying to accomplish something similar, but in every case the answers were hyper-specific to the input string the OP provided and made assumptions that don't apply to my use case and requirements.

Any help is much appreciated.

*** UPDATE ***

I figured this out by using Regex.Split() with a capturing group, which returns an array of quoted and non-quoted segments in the same order as they appear in the original string. The replacements for each pattern are then applied only to array elements that don't start with a quote character:

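using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Pattern -> replacement pairs, applied case-insensitively to unquoted text only.
Dictionary<string, string> patternReplacementDict = new Dictionary<string, string>();
patternReplacementDict.Add("IIF\\s*\\(", "IF(");
patternReplacementDict.Add(" AND ", " AndAlso ");
patternReplacementDict.Add(" OR ", " OrElse ");
foreach (CustomFieldInfo customField in Session.ConfigurationManager.GetCustomFields(false))
{
    // ?. avoids a NullReferenceException when Calculation is null.
    string calcStr = customField.Calculation?.Trim();
    if (string.IsNullOrEmpty(calcStr)) continue;
    // The capturing group makes Regex.Split keep the quoted literals in the array.
    // (?:[^"]|"")* accepts empty literals ("") and doubled quotes inside a literal;
    // with [^"]+ an empty literal is skipped and its closing quote can pair with
    // the opening quote of the next literal.
    string[] calcStrArray = Regex.Split(calcStr, "(\"(?:[^\"]|\"\")*\")");
    StringBuilder newCalcStr = new StringBuilder();
    foreach (string str in calcStrArray)
    {
        if (str.StartsWith("\""))
        {
            // Quoted literal: copy through unchanged.
            newCalcStr.Append(str);
            continue;
        }
        string newStr = str;
        foreach (KeyValuePair<string, string> kvp in patternReplacementDict)
        {
            newStr = Regex.Replace(newStr, kvp.Key, kvp.Value, RegexOptions.IgnoreCase);
        }
        newCalcStr.Append(newStr);
    }
    customField.Calculation = newCalcStr.ToString();
}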
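
The approach relies on a documented Regex.Split behaviour: when the pattern contains a capturing group, the captured text is included in the returned array. The sketch below isolates that step. The class and method names (SplitDemo, SplitOnLiterals) and the sample input are hypothetical, and the literal pattern treats "" both as an empty literal and as an escaped quote inside a literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // The outer parentheses form a capturing group, so each matched string
    // literal is kept in the result, between the unquoted segments.
    public static string[] SplitOnLiterals(string calc)
    {
        return Regex.Split(calc, "(\"(?:[^\"]|\"\")*\")");
    }

    public static void Main()
    {
        // Hypothetical sample calc fragment.
        string[] pieces = SplitOnLiterals("[A] = \"x and y\" AND [B] = \"\"");
        foreach (string piece in pieces)
        {
            // Angle brackets make leading/trailing spaces and empty pieces visible.
            Console.WriteLine("<" + piece + ">");
        }
    }
}
```

Pieces that start with a quote character are the literals. Everything else is unquoted calc text and is safe to run the replacements on.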

Original Input

IIF([FIELD1] = "This ""quoted text inside string quotes"" and that" AnD [FIELD2] = "This or that", "Value 1",
iiF ([FIELD3] = "This and that" aNd [FIELD4] = "This or that", "Value 2", 
IIF([FIELD4] = "This and that" Or [FIELD5] = "This or that", "Value 3",
Iif   ([FIELD6] = "This and that" oR [FIELD7] = "This or that", "Value 4", 
""))))

Output

IF([FIELD1] = "This ""quoted text inside string quotes"" and that" AndAlso [FIELD2] = "This or that", "Value 1",
IF([FIELD3] = "This and that" AndAlso [FIELD4] = "This or that", "Value 2", 
IF([FIELD4] = "This and that" OrElse [FIELD5] = "This or that", "Value 3",
IF([FIELD6] = "This and that" OrElse [FIELD7] = "This or that", "Value 4", 
""))))

Answer by Kirill Polishchuk

You could use a single regex that matches either a quoted string or a search term (and, or, iif). The example below is case-insensitive and performs a replacement only when the match is one of the search terms; quoted strings are returned unchanged.

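// "\\b" is a regex word boundary; a bare "\b" in a regular C# string is a backspace character.
// (?:[^"]|"")* also matches empty literals ("") and doubled quotes inside a literal.
var result = Regex.Replace(s, "(?i)\"(?:[^\"]|\"\")*\"|\\b(and|or|iif)\\b",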
    m =>
    {
        if (m.Groups[1].Success)
        {
            var match = m.Groups[1].Value;

            if (StringComparer.OrdinalIgnoreCase.Equals(match, "iif"))
            {
                return "IF";
            }

            if (StringComparer.OrdinalIgnoreCase.Equals(match, "and"))
            {
                return "AndAlso";
            }

            if (StringComparer.OrdinalIgnoreCase.Equals(match, "or"))
            {
                return "OrElse";
            }
        }

        // string within quotes -- do nothing
        return m.Value;
    });
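
For anyone who wants to try the single-pass idea in isolation, here is a minimal self-contained sketch. The class name CalcRewriter, the method name Rewrite, and the sample input are hypothetical. The pattern assumes VB-style literals where "" inside a literal is an escaped quote, and it escapes the word boundary as \\b because the pattern is a regular (non-verbatim) C# string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CalcRewriter
{
    // First alternative: consumes a whole string literal ("" is an escaped quote).
    // Second alternative: captures a keyword in group 1.
    private static readonly Regex Pattern = new Regex(
        "\"(?:[^\"]|\"\")*\"|\\b(and|or|iif)\\b",
        RegexOptions.IgnoreCase);

    public static string Rewrite(string calc)
    {
        return Pattern.Replace(calc, m =>
        {
            // Group 1 is unset when the match is a string literal: keep it as is.
            if (!m.Groups[1].Success) return m.Value;

            switch (m.Groups[1].Value.ToUpperInvariant())
            {
                case "IIF": return "IF";
                case "AND": return "AndAlso";
                case "OR": return "OrElse";
                default: return m.Value;
            }
        });
    }

    public static void Main()
    {
        // Hypothetical sample calc.
        string input = "iiF ([F1] = \"This and that\" aNd [F2] = \"\" oR [F3] = \"a or b\", \"x\", \"\")";
        Console.WriteLine(Rewrite(input));
        // Prints: IF ([F1] = "This and that" AndAlso [F2] = "" OrElse [F3] = "a or b", "x", "")
    }
}
```

Two things to be aware of with this sketch: whitespace between IIF and the parenthesis is left in place (the result is "IF (" rather than "IF("), and a keyword inside a bracketed field name such as [Terms and Conditions] would also be replaced, since only quoted text is protected.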