Delete multiple couchbase entities having common key pattern

2k Views Asked by At

I have a use case where I have to remove a subset of entities stored in couchbase, e.g. removing all entities with keys starting with "pii_". I am using NodeJS SDK but there is only one remove method which takes one key at a time: http://docs.couchbase.com/sdk-api/couchbase-node-client-2.0.0/Bucket.html#remove

In some cases thousands of entities need to be deleted and it takes very long time if I delete them one by one especially because I don't keep list of keys in my application.

2

There are 2 best solutions below

6
On BEST ANSWER

The best way to accomplish this is to create a Couchbase view by key and then range query over that view via your NodeJS code, making deletes on the results.

For example, your Couchbase view could look like the following:

function(doc, meta) {
    emit(meta.id, null);
}

Then in your NodeJS code, you could have something that looks like this:

var couchbase = require('couchbase');
var ViewQuery = couchbase.ViewQuery;

var query = ViewQuery.from('designdoc', 'by_id');

query.range("pii_", "pii_" + "\u0000", false);

var myBucket = myCluster.openBucket();
myBucket.query(query, function(err, results) {
    for(i in results) {
        // Delete code in here
    }
});

Of course your Couchbase design document and view will be named differently than the example that I gave, but the important part is the ViewQuery.range function that was used.

All document ids prefixed with pii_ would be returned, in which case you can loop over them and start deleting.

Best,

0
On

I agree with the @ThinkFloyd when he saying: Delete on server should be delete on server, rather than requiring three steps like get data from server, iterate over it on client side and finally for each record fire delete on the server again.

In this regards, I think old fashioned RDBMS were better all you need to do is 'DELETE * from database where something=something'.

Fortunately, there is something similar to SQL is available in CouchBase called N1QL (pronounced nickle). I am not aware about JavaScript (and other language syntax) but this is how I did it in python.

Query to be used: DELETE from <bucketname> b where META(b).id LIKE "%"

    layer_name_prefix = cb_layer_key + "|" + "%"
    query = ""
    try:
        query = N1QLQuery('DELETE from `test-feature` b where META(b).id LIKE $1', layer_name_prefix)
        cb.n1ql_query(query).execute()
    except CouchbaseError, e:
        logger.exception(e)

To achieve the same thing: alternate query could be as below if you are storing 'type' and/or other meta data like 'parent_id'.

DELETE from <bucket_name> where type='Feature' and parent_id=8;

But I prefer to use first version of the query as it operates on key, and I believe Couchbase must have some internal indexes to operate/query faster on key (and other metadata).