calculate hamming distance and weight in sqlite

1.7k Views Asked by At

Is there a good way to calculate hamming distance and weight in sqlite? It supports bit-wise operators but I want to order results based on hamming weight, and there is no support for bitcount in sqlite.

To be more elaborate, let's say I have those rows: 1011 1000 1100 0011 and given the 1st row (1011) I would like to get as a result last row (0011) which has the most 1s if you AND them.

In my case the numbers will be about 650 digits long, and I have about 3500 rows.

I've found this solution which works with blobs of text, but I would like something more optimal:

create table ts (x blob); 
insert into ts(x) values ('00010');
...
select x & '10011', length(replace( x & '10011','0','')) as weight from ts;
1

There are 1 best solutions below

1
On

SQLite has no built-in functions that could directly help with this.

In SQLite 3.8.3 or later, you could use a recursive common table expression to compute the matches by hand:

CREATE TABLE t(x);
INSERT INTO t VALUES ('1011'), ('1000'), ('1100'), ('0011');

WITH compare(matches, rest, pattern, original) AS (
  SELECT 0, x, '1011', x FROM t
  UNION ALL
  SELECT matches + (substr(rest, 1, 1) = '1' AND substr(pattern, 1, 1) = '1'),
         substr(rest, 2),
         substr(pattern, 2),
         original
  FROM compare
  WHERE rest != '')
SELECT matches, original
FROM compare
WHERE rest = ''
ORDER BY matches DESC;

3|1011
2|0011
1|1000
1|1100