What is the algorithm behind Cassandra's token function?

631 Views Asked by At

The Token function in my driver doesn't support a composite partition key, but it works very well with a single partition key, it takes a binary in 8 bits form as an input and pass it to murmur3 hash function and extract the 64-signed-little-integer (Token) from the result of murmur3 and ignore any extra binary buffer.

So my hope is to generate the binary for a composite partition key and then pass it to murmur3 as usual, an algorithm or bitwise operations will be really helpful or at least a source in any programming language.

I don't mean murmur3 part, only the token side which converts/mixes the composite partition key and outputs raw bytes in binary form.

2

There are 2 best solutions below

0
On BEST ANSWER

Finally I found a solution to my question :The Algorithm to compute the token for a composite partition key : Primary_key((text, int)) -> therefore the partition key is a composite_partition_key (text, int).

Example : a row with composite_partition_key ('hello', 1)

Applying the algorithm :

1- lay-out the components of the composite partition key in big-endian (16 bits) presentation :

first_component = 'hello' -> 68 65 6c 6c 6f

sec_component = 1 -> 00 00 00 01

68 65 6c 6c 6f 00 00 00 01

2- add two-bytes length of component before each component

first_component = 'hello', length= 5-> 00 05 68 65 6c 6c 6f

sec_component = 1, therefore length= 4 -> 00 04 00 00 00 01

00 05 68 65 6c 6c 6f 00 04 00 00 00 01

3- add zero-value after each component

first_component = 'hello' -> 00 05 68 65 6c 6c 6f 00

sec_component = 1 -> 00 04 00 00 00 01 00

4- result

00 05 68 65 6c 6c 6f 00 00 04 00 00 00 01 00

now pass the result as whatever binary base your murmur3 function understand (make sure it's cassandra variant).

4
On

Take a look at the drivers since they have generate the token to find the correct coordinator. https://github.com/datastax/java-driver/blob/8be7570a3c7fbba773ae2581bbf26e8196e7d6fb/driver-core/src/main/java/com/datastax/driver/core/Token.java#L112

Its slightly different than the typical murmur3 due to a bug when it was made and inability to change it without breaking existing clusters. So I would recommend copying it from them or better yet, use the existing drivers to find the token.