Global sequential number generator without using a relational database

I have an application. Suppose it's an invoice service. Each time a user creates an invoice I need to assign the next sequential number (i.e. ISequentialNumberGeneratorRepository.Next();). The invoice number must be unique even when several instances of my application are running (horizontal scaling is likely in the future).

In other words, I need a global sequential number generator.

Traditionally this problem is solved with a relational database such as SQL Server, PostgreSQL, MySQL, etc., because these systems can generate a unique sequential ID when a record is inserted and return the generated ID as part of the same atomic operation, so they're a perfect fit for a centralised sequential number generator.

But I don't have a relational database and I don't need one, so it's a bit brutal having to use one just for this tiny functionality.

I have, however, an EventStore available (EventStore.org) but I couldn't find out whether it has sequential number generation capability.

So my question is: is there any product out there that I could use to generate unique sequential numbers, so that I can implement my repository's Next(); method with it, and that would work regardless of how many instances of my client invoice application I have?

Note: Alternatively, if someone can think of a way to use EventStore for this purpose, or can explain how they achieved this in a DDD/CQRS/ES environment, that would also be great.

There are 3 best solutions below

0
On BEST ANSWER

You have not stated the reasons (or presented any code) for wanting this capability. I will assume the term sequential means monotonically increasing (sorting, not looping).

I tend to agree with A.Chiesa; I would add timestamps to the list, although they are not applicable here.

Since your post does not indicate how the data is to be consumed, I propose two solutions, the second preferred over the first if possible. For all later visitors: use a database solution instead.

The only way to guarantee numerical order across a horizontally scaled application without aggregation is to use a central server to assign the numbers (using REST, RPCs, or custom network code; an SQL server would also do, as a side note). Because of concurrency, each application instance must wait its turn for the next number. Together with network usage and latency, this wait limits the scalability of the application, and the central server is a single point of failure. These risks can be reduced by creating multiple instances of the central server and multiple application pools (but you will lose the global sorting ability).
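To make the trade-off concrete, here is a minimal in-process sketch of the central-allocator idea. It is only a model: threads stand in for application instances, and the names `CentralSequence` and `simulate` are hypothetical. A real deployment would put the counter behind a network call and would have to persist it.

```python
import threading


class CentralSequence:
    """Hypothetical central allocator: one counter guarded by one lock."""

    def __init__(self, start=1):
        self._next = start
        self._lock = threading.Lock()

    def next(self):
        # Every caller is serialized here; this is the contention point
        # (and the single point of failure) described above.
        with self._lock:
            value = self._next
            self._next += 1
            return value


def simulate(instances=4, per_instance=250):
    """Run several 'application instances' as threads against one allocator."""
    sequence = CentralSequence()
    issued = []
    issued_lock = threading.Lock()

    def worker():
        for _ in range(per_instance):
            number = sequence.next()
            with issued_lock:
                issued.append(number)

    threads = [threading.Thread(target=worker) for _ in range(instances)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return issued
```

Every number is handed out exactly once with no gaps, but only because all callers queue on the same lock.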

As an alternative, I would recommend the Hi/Lo assignment method, combined with batch aggregation. Each instance has an identifier (four digits, say) prefixed to a number that increments per instance. Schedule an aggregation task on one central server (or more than one, for redundancy) to pick up the data and assign a unique sequential ID during aggregation. This keeps the data local until pickup, which could be scheduled at intervals of 100, 500, or 1000 milliseconds if needed for coherence, or minutes or more if not. It provides almost perfect horizontal scaling, with the drawback of increased vertical scaling requirements on the aggregation server(s).
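A rough sketch of how the prefix-plus-aggregation scheme might look. The class names, the `0001-00000001` id format, and the in-memory dictionary are all assumptions made for illustration; a real aggregator would persist its mapping and its counter.

```python
import itertools


class InstanceGenerator:
    """Issues provisional ids locally: a 4-digit instance prefix plus a local counter."""

    def __init__(self, instance_id):
        if not 0 <= instance_id <= 9999:
            raise ValueError("instance_id must fit in four digits")
        self.instance_id = instance_id
        self._counter = itertools.count(1)

    def next_local(self):
        # No network round trip: the id is built from local state only.
        return f"{self.instance_id:04d}-{next(self._counter):08d}"


class Aggregator:
    """Central batch step: maps provisional ids to global sequential numbers."""

    def __init__(self):
        self._next_global = 1
        self.assigned = {}  # provisional id -> global sequential number

    def pick_up(self, batch):
        for local_id in batch:
            # Skip ids already seen, so re-delivering a batch is harmless.
            if local_id not in self.assigned:
                self.assigned[local_id] = self._next_global
                self._next_global += 1
```

Under these assumptions, the global number reflects pickup order rather than creation order, and an invoice has no global number until the next aggregation run.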

Distributed computing is a balancing act between processing, memory, and communication overhead. Where your computing/memory/network capacity boundaries lie cannot be determined from your post.

There is no single correct answer. I have provided you with two possibilities, but without specific requirements of the task at hand, I can go no further.

14
On

IMHO, your requirement is kinda flawed, because you have conflicting needs.

You want a unique id. The usual solutions use:

  • GUID. Can be generated centrally or locally. Really easy to implement. Kinda hard for a human reader, but YMMV. But you want incremental keys.
  • centrally assigned key: you need a transactional system. But you want to do CQRS, and use Event Store. It seems to me that having a separate transactional system just to have an IDENTITY_COLUMN or a SEQUENCE largely misses the point of doing CQRS.
  • use a HiLo generation approach. That is: every single client gets a unique seed (like 1 billion for the first client, 2 billion for the second, etc.), so each client can generate a sequence locally. This sequence is distributed and uses sequential numbers, so there are no concurrency problems, but there is no global ordering of requests, and you must ensure that no two clients get the same Hi value (a relatively easy task).
  • use the id assigned by Event Store. I don't know the product, but every event sent to the queue gets a unique id. But (as I understand it) you require the id to be available BEFORE sending the event.
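The HiLo bullet above can be sketched in a few lines. This is a simplified illustration, assuming each client is handed a distinct Hi value out of band and that one billion numbers per client is enough; `HiLoGenerator` and `BLOCK_SIZE` are hypothetical names.

```python
BLOCK_SIZE = 1_000_000_000  # assumed width of each client's private range


class HiLoGenerator:
    """Generates ids locally from a per-client Hi value and a local Lo counter."""

    def __init__(self, hi, block_size=BLOCK_SIZE):
        if hi < 1:
            raise ValueError("hi must be a positive, client-unique value")
        self._base = hi * block_size
        self._block_size = block_size
        self._lo = 0

    def next(self):
        if self._lo >= self._block_size:
            # The client would have to obtain a fresh Hi value here.
            raise OverflowError("range exhausted; a new hi value is needed")
        value = self._base + self._lo
        self._lo += 1
        return value
```

Ids are sequential within one client and never collide across clients, but comparing ids from two different clients says nothing about which request came first.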

You can generally mix and match any of these solutions (especially the HiLo algorithm) with timestamps (like seconds since the Unix epoch, or something similar) to produce a weak, non-guaranteed sortability. But generally I would avoid this, because if you generate ids at multiple sites you introduce the risk of unsynchronized clocks, along with other unsolved (or unsolvable) problems.
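For illustration only, a timestamp-prefixed id could look like the sketch below. The tuple layout `(unix_second, client_id, counter)` and the class name are assumptions, and the `clock` parameter exists only so the behaviour can be exercised without waiting for real time to pass.

```python
import time


class TimestampedIdGenerator:
    """Builds ids as (unix_second, client_id, counter) tuples."""

    def __init__(self, client_id, clock=time.time):
        self.client_id = client_id
        self._clock = clock
        self._last_second = -1
        self._counter = 0

    def next(self):
        # Never move backwards locally, even if the wall clock does;
        # this keeps ids unique per client but cannot fix skew between clients.
        second = max(int(self._clock()), self._last_second)
        if second > self._last_second:
            self._last_second = second
            self._counter = 0
        else:
            self._counter += 1
        return (second, self.client_id, self._counter)
```

Tuples sort by timestamp first, so the ordering across clients is only as good as their clock synchronization, which is exactly the weakness noted above.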

Probably I'm missing something, but these are the ones off the top of my head.

So, as far as I can tell, you are at an impasse. I would try really hard to put myself in one of the previous situations.

1
On

This is a strange opinion:

so it's a bit brutal having to use one just for this tiny functionality.

Today SQLite is used as a relational database even on mobile phones. It is simple, has a small memory footprint, and has bindings for all popular programming languages. Twenty years ago databases consumed many resources; today you can find a database engine for any task. Also, if you need a tiny key-value store, you can use BerkeleyDB.
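As a minimal sketch of how small such a generator can be, here is one built on Python's standard `sqlite3` module. The table name `invoice_seq` and the class name are made up for the example, and it only covers a single process; several application instances would still need shared access to the same database.

```python
import sqlite3


class SqliteSequence:
    """Sequential number generator backed by an AUTOINCREMENT column."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the example self-contained; use a file path to persist.
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS invoice_seq ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT)"
        )
        self._conn.commit()

    def next(self):
        # The connection context manager commits on success
        # and rolls back if the insert raises.
        with self._conn:
            cursor = self._conn.execute("INSERT INTO invoice_seq DEFAULT VALUES")
            return cursor.lastrowid
```

Each call to `next()` inserts one row and returns its generated id, which is the same insert-and-return pattern the question attributes to larger relational databases.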