Why doesn't my DISTINCT ON expression work?

Query:

SELECT DISTINCT ON (geom_line),gid 
FROM edge_table;

I have an edge table that contains duplicates, and I want to remove the duplicate edges while keeping one of each. Is the syntax itself wrong?

Best answer:

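The comma after the closing parenthesis is the problem. DISTINCT ON (...) is not part of the select list, so the select list must follow it directly, with no comma in between.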

If you want geom_line included in the result, use

SELECT DISTINCT ON (geom_line) geom_line, gid FROM edge_table;

Otherwise, use

SELECT DISTINCT ON (geom_line) gid FROM edge_table;

But if your objective is just to remove rows that are duplicated in full (same geom_line and same gid), I'd say that you should use

SELECT DISTINCT geom_line, gid FROM edge_table;

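DISTINCT guarantees uniqueness over the whole result row, while DISTINCT ON guarantees uniqueness only over the expression in parentheses. If there are several rows where the expression in parentheses is identical, one of these rows is picked. Without an ORDER BY clause, which row is picked is unpredictable. With an ORDER BY clause, the first row of each group in that sort order is picked; the DISTINCT ON expressions must match the leftmost ORDER BY expressions.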
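To make the difference concrete, here is a small sketch you can paste into psql. The table edge_demo and its sample rows are made up for illustration, and geom_line is stored as text rather than as a PostGIS geometry so that the example runs without any extension. With a real geometry column, which rows count as equal depends on how your PostGIS version defines = for geometries.

```sql
-- Hypothetical demo table; text stands in for the geometry column.
CREATE TEMP TABLE edge_demo (
    gid       integer PRIMARY KEY,
    geom_line text NOT NULL
);

INSERT INTO edge_demo (gid, geom_line) VALUES
    (1, 'A-B'),
    (2, 'A-B'),   -- same edge as gid 1, different id
    (3, 'B-C');

-- Plain DISTINCT compares whole result rows. Every gid is different,
-- so all three rows come back.
SELECT DISTINCT geom_line, gid
FROM edge_demo;

-- DISTINCT ON keeps one row per geom_line. The ORDER BY makes the choice
-- deterministic: within each geom_line, the row with the lowest gid wins.
-- Result: ('A-B', 1) and ('B-C', 3).
SELECT DISTINCT ON (geom_line) geom_line, gid
FROM edge_demo
ORDER BY geom_line, gid;
```

If you would rather keep the highest gid for each edge, change the ordering to ORDER BY geom_line, gid DESC.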

DISTINCT a, b is the same as DISTINCT ON (a, b) a, b.