PouchDB compaction has no effect on db size on disk

I use CouchDB/PouchDB to replicate data from a server down to a mobile device (data is only replicated downstream to the device).
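For reference, the replication is set up roughly like this (a minimal sketch; the database name and server URL are placeholders):

    // One-way (downstream-only) replication from the server to the device.
    var local = new PouchDB('mydb');
    PouchDB.replicate('http://myserver:5984/mydb', local)
      .on('complete', function () {
        console.log('replication finished');
      })
      .on('error', function (err) {
        console.error('replication failed', err);
      });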

As my dataset is large (approx. 100,000 documents), I'm trying to use the compact feature in PouchDB to ensure that the size of the database on disk remains small and does not grow to an unmanageable size.
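Compaction is then just a call to the built-in API, roughly:

    // Manual compaction: asks PouchDB to drop the bodies of old,
    // non-leaf revisions, keeping only the current revision of each doc.
    local.compact().then(function () {
      console.log('compaction complete');
    });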

My tests, however, show that manually compacting the database has no effect on the disk space used. In my tests I am replicating 100,000 documents from my CouchDB to a PouchDB using Chrome.

Looking at the "C:\Users\[USERNAME]\AppData\Local\Google\Chrome\User Data\Default\databases\[SERVER_URL]" directory, which I believe is where Chrome stores the database, the file generated after replicating the database is approx. 71MB.

I then update 20,000 documents in the CouchDB by simply incrementing a value on each, and subsequently replicate these changes down to my PouchDB in Chrome. This results in the database growing to 81MB. Manually compacting the database after this does not affect the size of the PouchDB on disk. I have performed this sequence of operations a few times and have never seen the PouchDB file created by Chrome decrease in size.
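The update step looks roughly like this (a sketch; 'counter' stands in for the real field and the remote URL is a placeholder):

    // Increment a value on 20,000 docs and write them back in one batch.
    var remote = new PouchDB('http://myserver:5984/mydb');
    remote.allDocs({include_docs: true, limit: 20000}).then(function (res) {
      var updated = res.rows.map(function (row) {
        row.doc.counter = (row.doc.counter || 0) + 1;
        return row.doc;
      });
      return remote.bulkDocs(updated);
    });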

Summary of test:

  1. Populate CouchDB with 100,000 docs.
  2. Replicate CouchDB to PouchDB using Chrome. Results in a 71MB database.
  3. Update 20,000 docs in CouchDB. Replicate to PouchDB. Results in an 81MB database.
  4. Manually compact PouchDB. No change in file size.
  5. Update 20,000 docs in CouchDB. Replicate to PouchDB. Results in an 83MB database.
  6. Manually compact PouchDB. No change in file size.
  7. Update 20,000 docs in CouchDB. Replicate to PouchDB. Results in an 84MB database.
  8. Manually compact PouchDB. No change in file size.

I have created a small example application to illustrate the problem. You can find the files here (please read readme.txt!) and use them to reproduce my test.

Am I misunderstanding what compact does in PouchDB? I assumed that, by deleting old revisions of documents (and keeping only leaf nodes), the disk size of my PouchDB would stay roughly the same in the above example.
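In other words, my mental model of compaction is something like this sketch (placeholder ids; I expected the old revision's body to be gone afterwards and its space reclaimed):

    var firstRev;
    local.put({_id: 'doc1', value: 1}).then(function (res) {
      firstRev = res.rev;
      return local.put({_id: 'doc1', _rev: firstRev, value: 2});
    }).then(function () {
      return local.compact();
    }).then(function () {
      // After compaction, fetching the old revision should fail.
      return local.get('doc1', {rev: firstRev});
    }).catch(function (err) {
      console.log('old revision no longer available:', err.message);
    });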

Or have I made a silly coding error? (Not unusual for me!)

Thanks for your help in advance,

Andrew.

UPDATE - After performing the above test, I used PouchDB to retrieve one of my documents. I found that _revisions contained an array of 4 revision ids. Is my database continually growing because PouchDB keeps track of all the revision ids for a document? Should that still be the case after I have compacted my database?
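For reference, this is roughly how I fetched the document; with {revs: true} the response includes _revisions, whose ids array lists the revision ids the database still tracks:

    local.get('doc1', {revs: true}).then(function (doc) {
      // In my case doc._revisions.ids held 4 revision ids.
      console.log(doc._revisions.ids);
    });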

2 Answers

Accepted answer:

It appears you are using the WebSQL adapter. SQLite has an odd characteristic: it doesn't necessarily release disk space unless you run an explicit VACUUM command. Off the top of my head, I don't know whether VACUUM is even available in WebSQL, but you may want to try it after compaction in order to truly shrink the database.
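You can confirm which adapter was chosen at runtime with a one-liner (db being your PouchDB instance):

    // Logs the storage adapter PouchDB selected, e.g. 'websql' or 'idb'.
    console.log(db.adapter);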

If that fix works, we may also be interested in adding the VACUUM command to PouchDB itself when you compact, so that you don't have to do it manually. :)

Second answer:

@nolan, thanks for this pointer. I added the following line of code to the setup() function in pouchdb.cordova-sqlite.js, the PouchDB adapter for SQLite.

tx.executeSql("PRAGMA auto_vacuum = FULL");

It works! It does free up disk space (I checked it on our iOS app). Thanks.
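For anyone trying the same thing without patching the adapter, here is a rough, untested sketch of setting the pragma from application code (the Cordova SQLite plugin API varies by version, and the database name is a placeholder):

    var raw = window.sqlitePlugin.openDatabase({name: 'mydb.db'});
    raw.transaction(function (tx) {
      // Note: switching auto_vacuum from NONE to FULL on an existing,
      // non-empty file only takes effect after a VACUUM is run.
      tx.executeSql('PRAGMA auto_vacuum = FULL');
    });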