Elasticsearch: reindexing documents reduces free disk space


Steps:
- Elasticsearch 2.3
- create documents in ES => ~1 GB of disk is used
- update the same documents in ES => ~2 GB of disk is used

Why does this happen? Is it due to versioning? Is it possible to avoid doubling the disk usage?

Currently we use force merge (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-forcemerge.html), but it takes several hours.


There is 1 solution below


When you index a document that already exists in ES, ES marks the previous version as deleted (but does not immediately remove it from the index) and indexes the new version.
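To make the mark-as-deleted behaviour concrete, here is a minimal toy model in Python. It is only an illustration of the idea, not how Elasticsearch or Lucene is actually implemented; the `ToyIndex` class and its methods are hypothetical names made up for this sketch.

```python
class ToyIndex:
    """Toy append-only store: an update never overwrites, it only appends."""

    def __init__(self):
        self.entries = []    # each entry is [doc_id, size_bytes, is_live]
        self.live_pos = {}   # doc_id -> position of its live entry

    def index(self, doc_id, size_bytes):
        old = self.live_pos.get(doc_id)
        if old is not None:
            # The previous version is only flagged as deleted; its bytes stay on disk.
            self.entries[old][2] = False
        self.live_pos[doc_id] = len(self.entries)
        self.entries.append([doc_id, size_bytes, True])

    def disk_used(self):
        # Deleted entries still count until a merge rewrites the data.
        return sum(size for _, size, _ in self.entries)

    def merge(self):
        # A merge rewrites the data keeping only live entries.
        self.entries = [e for e in self.entries if e[2]]
        self.live_pos = {e[0]: i for i, e in enumerate(self.entries)}


idx = ToyIndex()
for doc_id in range(1000):
    idx.index(doc_id, 1024)
after_create = idx.disk_used()

for doc_id in range(1000):
    idx.index(doc_id, 1024)
after_update = idx.disk_used()

idx.merge()
after_merge = idx.disk_used()
```

In this model `after_update` is twice `after_create`, and `after_merge` drops back to the original size, which mirrors the 1 GB to 2 GB jump described in the question.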

Concretely, if your document weighs 1 KB, then once you have reindexed a new version of it, the space taken by the first version is not reclaimed immediately. The first version still takes 1 KB and the second version takes another 1 KB, which is why disk usage roughly doubles. Deleted documents are only removed when segments are merged: either you call the Force Merge API, as you have discovered, or you wait until segments are merged automatically under the hood. You should not normally have to worry about this process.
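If you want to see how much of an index is currently made up of deleted documents before deciding whether a force merge is worth it, the index stats API (`GET /<index>/_stats/docs`) reports `docs.count` and `docs.deleted`. The sketch below is a rough illustration under assumptions: the index name `my_index` and the address `http://localhost:9200` are placeholders, the helper names are hypothetical, and the sample response is hand-written rather than taken from a real cluster. It also builds a Force Merge URL with the `only_expunge_deletes` parameter, which asks ES to merge only segments that contain deletes and may be cheaper than a full merge.

```python
from urllib.parse import quote, urlencode


def deleted_ratio(stats, index):
    """Fraction of documents in the primaries of `index` that are marked deleted."""
    docs = stats["indices"][index]["primaries"]["docs"]
    total = docs["count"] + docs["deleted"]
    return docs["deleted"] / total if total else 0.0


def forcemerge_url(base_url, index, only_expunge_deletes=True):
    """Build the URL for POST /<index>/_forcemerge (no request is sent here)."""
    query = urlencode({"only_expunge_deletes": str(only_expunge_deletes).lower()})
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_forcemerge?{query}"


# Hand-written sample shaped like a GET /my_index/_stats/docs response (assumed values).
sample_stats = {
    "indices": {
        "my_index": {"primaries": {"docs": {"count": 1000, "deleted": 1000}}}
    }
}

ratio = deleted_ratio(sample_stats, "my_index")
url = forcemerge_url("http://localhost:9200", "my_index")
```

The resulting URL would be used with an HTTP POST from whatever client you prefer. A high ratio right after a bulk update is expected and usually shrinks on its own as background merges run.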