I have to validate a string that is separated by dashes into 3 parts like so:
'substring1-substring2-substring3'
Substrings 2 and 3 need to be validated to be within a list of accepted values for substring 2 and another list of accepted values for substring 3.
The issue I'm having is that substring 2 and substring 3 have values that contain dashes, so breaking apart the main string to access and validate substring 2 and 3 has become pretty difficult, and I'm not sure how else to tackle this.
I'm doing this in Databricks SQL.
So far, I've used the LIKE operator to select '% - % - %' which just validates that there are two dashes.
I've tried splitting the string by the "-" but because there are dashes in both substrings 2 and 3, I can't figure out a way to select for either of them.
This is how you can solve it and not a fully working solution, but I have no doubt it will work. I am assuming you can loop through the list of acceptable values for substring 2 and through the list of acceptable values for string 3 and concatenate them by
|
Alternatively, you can have twoacceptable_two
andacceptable_three
tables that you can aggregate in a string separated by|
By the end of this concatenation, you will have two strings looking like below:Then you can put together a regex like this:
Once you generated the regex validating will consist in only calling (or whatever the syntax might be):
Note I did not give you a working solution as I don't have a databricks-sql running environment, but it should be no problem for you to implement.