How to generate unique timestamps in PostgreSQL?

My idea is to implement a basic «vector clock», where timestamps are clock-based, always move forward, and are guaranteed to be unique.

For example, in a simple table:

CREATE TABLE IF NOT EXISTS timestamps (
    last_modified TIMESTAMP UNIQUE
);

I use a trigger to set the timestamp value before insertion. It basically just pushes the value into the future when two inserts arrive at the same time:

CREATE OR REPLACE FUNCTION bump_timestamp()
RETURNS trigger AS $$
DECLARE
    previous TIMESTAMP;
    current TIMESTAMP;
BEGIN
    -- Fetch the most recent timestamp already stored.
    SELECT last_modified INTO previous
      FROM timestamps
     ORDER BY last_modified DESC
     LIMIT 1;

    -- If the wall clock has not moved past it, step 1 ms into the future.
    current := clock_timestamp();
    IF previous IS NOT NULL AND previous >= current THEN
        current := previous + INTERVAL '1 millisecond';
    END IF;

    NEW.last_modified := current;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS tgr_timestamps_last_modified ON timestamps;

CREATE TRIGGER tgr_timestamps_last_modified
BEFORE INSERT OR UPDATE ON timestamps
FOR EACH ROW EXECUTE PROCEDURE bump_timestamp();

I then run a massive number of insertions from two separate clients:

DO
$$
BEGIN
    FOR i IN 1..100000 LOOP
       INSERT INTO timestamps DEFAULT VALUES;
    END LOOP;
END;
$$;

As expected, I get collisions:

ERROR: duplicate key value violates unique constraint "timestamps_last_modified_key"
SQL state: 23505
Detail: Key (last_modified)=(2016-01-15 18:35:22.550367) already exists.
Context: SQL statement "INSERT INTO timestamps DEFAULT VALUES"
PL/pgSQL function inline_code_block line 4 at SQL statement

@rach suggested mixing clock_timestamp() with a SEQUENCE object, but that would probably mean giving up the TIMESTAMP type. Besides, I can't really figure out how it would solve the isolation problem...
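
For context, one way to read that suggestion (just a sketch, not @rach's actual code) is to keep the TIMESTAMP and let a sequence break ties, so that uniqueness covers the pair rather than the timestamp alone:

CREATE SEQUENCE timestamps_tiebreak;

CREATE TABLE IF NOT EXISTS timestamps (
    -- the clock value stays a TIMESTAMP; the sequence only breaks ties
    last_modified TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
    tiebreak      BIGINT    NOT NULL DEFAULT nextval('timestamps_tiebreak'),
    UNIQUE (last_modified, tiebreak)
);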

Is there a common pattern to avoid this?

Thank you for your insights :)

There are 2 solutions below.

Solution 1

If you have only one Postgres server, as you said, I think that using a timestamp + sequence can solve the problem, because sequences are non-transactional and respect insert order. If you have database shards it becomes much more complex; maybe the distributed sequences of 2ndQuadrant's BDR could help, but I don't think ordering would be respected. I added some code below if you have a setup to test it.

CREATE SEQUENCE "timestamps_seq";

-- Let's first test how to generate the id.
SELECT extract(epoch from now())::bigint::text || LPAD(nextval('timestamps_seq')::text, 20, '0') as unique_id ;

           unique_id
--------------------------------
 145288519200000000000000000010
(1 row)


CREATE TABLE IF NOT EXISTS timestamps (
    unique_id TEXT UNIQUE NOT NULL DEFAULT extract(epoch from now())::bigint::text || LPAD(nextval('timestamps_seq')::text, 20, '0')
);


INSERT INTO timestamps DEFAULT VALUES;
INSERT INTO timestamps DEFAULT VALUES;
INSERT INTO timestamps DEFAULT VALUES;

select * from timestamps;
           unique_id
--------------------------------
 145288556900000000000000000001
 145288557000000000000000000002
 145288557100000000000000000003
(3 rows)
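
For reference, here is a sketch of how the two parts could be split back out, assuming the layout above (epoch seconds as text followed by a 20-digit zero-padded sequence value):

-- Split unique_id back into its parts (assumes the layout described above).
SELECT to_timestamp(left(unique_id, length(unique_id) - 20)::bigint) AS approx_time,
       right(unique_id, 20)::bigint                                  AS seq_value
  FROM timestamps
 ORDER BY unique_id;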

Let me know if that works. I'm not a DBA, so it might be good to ask on dba.stackexchange.com too about potential side effects.

Solution 2

My two cents (inspired by http://tapoueh.org/blog/2013/03/15-batch-update).

Try adding the following before the massive batch of insertions:

LOCK TABLE timestamps IN SHARE MODE;

Official documentation is here: http://www.postgresql.org/docs/current/static/sql-lock.html
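
For what it's worth, LOCK TABLE is only allowed inside a transaction block and the lock is held until that transaction ends, so each client's batch would be wrapped roughly like this (a sketch; note that SHARE mode does not conflict with itself, so with two sessions both inserting, a self-conflicting mode such as SHARE ROW EXCLUSIVE is probably what is needed so they queue instead of deadlocking):

BEGIN;

-- The lock is held until COMMIT. SHARE ROW EXCLUSIVE conflicts with itself,
-- so a second client running the same block waits here rather than
-- deadlocking later on its INSERTs.
LOCK TABLE timestamps IN SHARE ROW EXCLUSIVE MODE;

DO
$$
BEGIN
    FOR i IN 1..100000 LOOP
       INSERT INTO timestamps DEFAULT VALUES;
    END LOOP;
END;
$$;

COMMIT;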