How to compute queryId from pg_stat_statements on PostgreSQL?

1.1k Views Asked by At

PostgreSQL has these monitoring tables, in particular pg_stat_statements, that contians this column called queryId and I was wondering how it is computed. If anybody knows where can I find the source code, that would be greatly appreciated.

So the pg_stat_statements tables would show something like:

userid | dbid | queryid |      query       | other statistics related columns 
1      |  2   | 123     | SELECT * FROM a; | ...

I am interested on how this 123 is being calculated.

1

There are 1 best solutions below

2
On

This query ID is computed by the pg_stat_statements extension, see contrib/pg_stat_statements/pg_stat_statements.c

This comment explains how it works:

As of Postgres 9.2, this module normalizes query entries.  Normalization
is a process whereby similar queries, typically differing only in their
constants (though the exact rules are somewhat more subtle than that) are
recognized as equivalent, and are tracked as a single entry.  This is
particularly useful for non-prepared queries.

Normalization is implemented by fingerprinting queries, selectively
serializing those fields of each query tree's nodes that are judged to be
essential to the query.  This is referred to as a query jumble.  This is
distinct from a regular serialization in that various extraneous
information is ignored as irrelevant or not essential to the query, such
as the collations of Vars and, most notably, the values of constants.

This jumble is acquired at the end of parse analysis of each query, and
a 32-bit hash of it is stored into the query's Query.queryId field.
The server then copies this value around, making it available in plan
tree(s) generated from the query.  The executor can then use this value
to blame query costs on the proper queryId.