HANA SQL Regex finder

303 Views Asked by At

Is it possible to get a regex expression back just providing the column name so that HANA database has a look for the most frequent regex itself and returns this?

For example the column values are 1, 2, 3, 4, 55, nAn I would expect [1-9] as a result. Thanks

1

There are 1 best solutions below

0
Mathias Kemeter On

So your input is 1,2,3,4,5 and you are expecting [1-9]. Wouldn't [1-5] be a better fit? Or does [1|2|3|4|5] cover your data space best? What happens if you add a 8? Would it rather be [1-8] or [1|2|3|4|5|8]. The latter is more optimal in terms of coverage of your observed data.

If you are looking for a regular expression that minimally covers your data records a expression of type [record_1|...|record_n] is always a good choice. However, this is probably not what you want.

Finding a suitable (=minimal in terms of scope and readability) regex based on input data is very difficult or even impossible since it depends on a subjective quantification of data coverage vs. readability. As a consequence there is no such functionality in SAP HANA (and probably also not in other dbms).

What you can do is to define a set of regexes, that you would expect (e.g. one for numbers, one for text, one for dates, etc.) and programmatically determine the one that best fits your input data.