I am trying to extract JPA named parameters in JavaScript. This is the algorithm I came up with:
const notStrRegex = /(?<![\S"'])([^"'\s]+)(?![\S"'])/gm
const namedParamCharsRegex = /[a-zA-Z0-9_]/;
/**
* @returns array of named parameters which,
* 1. always begins with :
* 2. the remaining characters are guaranteed to match {@link namedParamCharsRegex}
*
* @example
* 1. "select * from a where id = :myId3;" -> [':myId3']
* 2. "to_timestamp_tz(:FROM_DATE, 'YYYY-MM-DD\"T\"HH24:MI:SS')" -> [':FROM_DATE']
* 3. "TO_CHAR(ep.CHANGEDT,'yyyy=mm-dd hh24:mi:ss')" -> []
*/
export function extractNamedParam(query: string): string[] {
  return (query.match(notStrRegex) ?? [])
    .filter((word) => word.includes(':'))
    .map((splittedWord) => splittedWord.substring(splittedWord.indexOf(':')))
    .filter((splittedWord) => splittedWord.length > 1) // ignore ":"
    .map((word) => {
      // i starts from 1 because word[0] is :
      for (let i = 1; i < word.length; i++) {
        const isAlphaNum = namedParamCharsRegex.test(word[i]);
        if (!isAlphaNum) return word.substring(0, i);
      }
      return word;
    });
}
I got inspired by the solution in https://stackoverflow.com/a/11324894/12924700 to filter out all characters that are enclosed in single/double quotes.
The code above handles the 3 use cases listed in the doc comment, but it fails when a user inputs a string like this:
const testStr = '"user input invalid string \' :shouldIgnoreThisNamedParam \' in a string"'
extractNamedParam(testStr) // should return [] but it returns [":shouldIgnoreThisNamedParam"] instead
I looked at the Hibernate source code to see how named parameters are extracted there, but I couldn't find the algorithm that does the work. Please help.
You can use
"[^\\"]*(?:\\[\w\W][^\\"]*)*"|'[^\\']*(?:\\[\w\W][^\\']*)*'|(:\w+)
Get the Group 1 values only. The regex matches strings between single/double quotes and captures : + one or more word chars in all other contexts.
Details:
- "[^\\"]*(?:\\[\w\W][^\\"]*)*" - a ", then zero or more chars other than " and \ ([^"\\]*), and then zero or more repetitions of any escaped char (\\[\w\W]) followed with zero or more chars other than " and \, and then a "
- | - or
- '[^\\']*(?:\\[\w\W][^\\']*)*' - a ', then zero or more chars other than ' and \ ([^'\\]*), and then zero or more repetitions of any escaped char (\\[\w\W]) followed with zero or more chars other than ' and \, and then a '
- | - or
- (:\w+) - Group 1 (this is the value we need to get, the rest is just used to consume some text where matches must be ignored): a colon and one or more word chars.
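As a minimal sketch of how the three alternatives described above could be combined and used, the function below joins them into one global regex and collects only the matches where Group 1 participated. The function name mirrors the one in the question. The regex literal is assembled from the alternatives listed under Details, so treat it as an illustration rather than a verified copy of the original demo.

```typescript
// Sketch: consume quoted strings first, capture :name everywhere else.
// Alternatives 1 and 2 match double- and single-quoted strings (with
// backslash escapes) so that nothing inside them can match alternative 3.
function extractNamedParam(query: string): string[] {
  // A fresh regex per call avoids sharing lastIndex state between calls.
  const regex =
    /"[^\\"]*(?:\\[\w\W][^\\"]*)*"|'[^\\']*(?:\\[\w\W][^\\']*)*'|(:\w+)/g;
  const params: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = regex.exec(query)) !== null) {
    // match[1] is undefined when a quoted string was consumed instead.
    if (match[1] !== undefined) {
      params.push(match[1]);
    }
  }
  return params;
}

console.log(extractNamedParam("select * from a where id = :myId3;"));
console.log(
  extractNamedParam(
    '"user input invalid string \' :shouldIgnoreThisNamedParam \' in a string"'
  )
);
```

With the inputs from the question, the first call logs [':myId3'] and the second logs [], because the whole double-quoted string is consumed by the first alternative before the colon can be reached.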