I use PostgreSQL 10 and I run CREATE EXTENSION unaccent; succesfully. I have a plgsql function that contains the following
whereText := 'lower(unaccent(place.name)) LIKE lower(unaccent($1))';
later, according to what user chose, more clauses may be added to the whereText.
The whereText is finally used in the query:
placewithkeys := '%'||placename||'%';
RETURN QUERY EXECUTE format('SELECT id, name FROM '||fromText||' WHERE '||whereText)
USING placewithkeys , event, date;
The whereText := 'LOWER(unaccent(place.name)) LIKE LOWER(unaccent($1))'; does not work, even if I remove the LOWER part.
I do select __my_function('Τζι'); and I get nothing back, even though I should get back results, because in the database there is the name Τζίμα
If I remove the unaccent and leave the LOWER it works, but not for accents : τζ brings Τζίμα back as it should. It seems like the unaccent is causing a problem.
What am I missing? How can I fix this?
Since there were comments about the syntax and possible SQLi , I provide the whole function definition, now changed to work accent-insensitive and case-insensitive in Greek:
CREATE FUNCTION __a_search_place
(placename text, eventtype integer, eventdate integer, eventcentury integer, constructiondate integer, constructioncentury integer, arstyle integer, artype integer)
RETURNS TABLE
(place_id bigint, place_name text, place_geom geometry)
AS $$
DECLARE
selectText text;
fromText text;
whereText text;
usingText text;
placewithkeys text;
BEGIN
fromText := '
place
JOIN cep ON place.id = cep.place_id
JOIN event ON cep.event_id = event.id
';
whereText := 'unaccent(place.name) iLIKE unaccent($1)';
placewithkeys := '%'||placename||'%';
IF constructiondate IS NOT NULL OR constructioncentury IS NOT NULL OR arstyle IS NOT NULL OR artype IS NOT NULL THEN
fromText := fromText || '
JOIN construction ON cep.construction_id = construction.id
JOIN construction_atype ON construction.id = construction_atype.construction_id
JOIN construction_astyle ON construction.id = construction_astyle.construction_id
JOIN atype ON atype.id = construction_atype.atype_id
JOIN astyle ON astyle.id = construction_astyle.astyle_id
';
END IF;
IF eventtype IS NOT NULL THEN
whereText := whereText || 'AND event.type = $2 ';
END IF;
IF eventdate IS NOT NULL THEN
whereText := whereText || 'AND event.date = $3 ';
END IF;
IF eventcentury IS NOT NULL THEN
whereText := whereText || 'AND event.century = $4 ';
END IF;
IF constructiondate IS NOT NULL THEN
whereText := whereText || 'AND construction.date = $5 ';
END IF;
IF constructioncentury IS NOT NULL THEN
whereText := whereText || 'AND construction.century = $6 ';
END IF;
IF arstyle IS NOT NULL THEN
whereText := whereText || 'AND astyle.id = $7 ';
END IF;
IF artype IS NOT NULL THEN
whereText := whereText || 'AND atype.id = $8 ';
END IF;
whereText := whereText || '
GROUP BY place.id, place.geom, place.name
';
RETURN QUERY EXECUTE format('SELECT place.id, place.name, place.geom FROM '||fromText||' WHERE '||whereText)
USING placewithkeys, eventtype, eventdate, eventcentury, constructiondate, constructioncentury, arstyle, artype ;
END;
$$
LANGUAGE plpgsql;
Postgres 12
unaccent()now works for Greek letters, too. Diacritic signs are removed:db<>fiddle here
Quoting the release notes:
Postgres 11 or older
unaccent()does not yet work for Greek letters. The call:... returns all letters unchanged, no diacritic signs removed as we would expect.
(I extracted this list from the Wikipedia page on Greek diacritics.)
db<>fiddle here
Looks like a shortcoming of the unaccent module. You can extend the default
unaccentdictionary or create your own. There are instructions in the manual. I created several dictionaries in the past and it's simple. And you are not to first to need this:Postgres unaccent rules for greek characters:
Unaccent rules plus greek characters for Postgres 9.6:
You need write access to the file system of the server, though - the directory containing the unaccent files. So, not possible on most cloud services ...
Or you might report a bug and ask to include Greek diacritic signs.
Aside: Dyamic SQL and SQLi
The code fragments you presented are not vulnerable to SQL injection.
$1is concatenated as literal string and only resolved in theEXECUTEcommand later, where the value is safely passed with theUSINGclause. So, no unsafe concatenation there. I would do it like this, though:Notes:
Less confusing - your original even confused Pavel in the comments, a professional in the field.
Assignments in plpgsql are slightly expensive (more so than in other PL), so adopt a coding style with few assignments.
Concatenate the two
%symbols forLIKEinto the main query directly, giving the query planner the information that the pattern is not anchored to start or end, which may help a more efficient plan. Only the user input is (safely) passed as variable.Since your
WHEREclause references tableplace, TheFROMclause needs to include this table anyway. So you cannot concatenate the FROM clause independently to begin with. Probably better to keep it all in a singleformat().Use dollar-quoting so you don't have to escape single quotes additionally.
Maybe just use
ILIKEinstead oflower(...) LIKE lower(...). If you work with trigram indexes (like would seem best for this query): those work withILIKEas well:I assume you are aware that you may need to escape characters with special meanings in
LIKEpattern?Audited function
After you provided your complete function ...
This is a complete rewrite with several improvements. Should make the function considerably. Also SQLi-safe (like your original). Should be functionally identical except the cases where I join fewer tables, which might not filter rows that are filtered by joining to the tables alone.
Major features:
Use
EXISTS()instead of lots of joins in the outer level plusGROUP BY. This contributes the lion share to the better performance. Related:format()is typically a good choice to concatenate SQL from user input. But since you encapsulated all code elements and only pass flags, you don't need it in this case. Instead,concat_ws()is of help. Related:Only concatenate JOINs you actually need.
Fewer assignments, shorter code.
Default values for parameters. Allows simplified call with missing parameters. Like:
Related:
About the short
ROW()syntax for testing whether any value isNOT NULL: