Our production MarkLogic database holds 1.2 TB of data spread across 6 forests. We plan to add 2 new forests to reduce the number of stands per forest.
Adding new forests triggers rebalancing of the data. That is expected, and we accept that it takes time. But the rebalancing time keeps shooting up whenever merges start alongside it; sometimes the estimate jumps suddenly from 8 hours to 16 hours. On average, the whole process takes approximately 24 hours.
My question is: if we disable merging before adding the new forests, and start a manual merge soon after rebalancing completes, would the combined process be faster? And would it be safe to do this?
Anything that affects disk I/O will affect the speed of rebalancing, including merging and standard database activity. However, care should be taken if you disable merging.
The risk of disabling merging is that you prevent the system from pruning stands, so if too many stands accumulate in a forest you may hit the hard limit on stands, which will impact server operation.
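As a rough illustration of that risk, here is a minimal Python sketch that flags forests whose stand count is creeping toward the limit while merges are off. It assumes a hard limit of 64 stands per forest (please confirm that figure in the documentation for your version). The function name, the 50% warning threshold, and the sample counts are all hypothetical; the real counts would have to come from your own monitoring.

```python
# Assumed hard limit on stands per forest; verify this against the
# documentation for your MarkLogic version before relying on it.
ASSUMED_STAND_HARD_LIMIT = 64


def forests_at_risk(stand_counts, limit=ASSUMED_STAND_HARD_LIMIT, warn_fraction=0.5):
    """Return the sorted names of forests whose stand count has reached
    warn_fraction of the limit."""
    threshold = limit * warn_fraction
    return sorted(name for name, count in stand_counts.items() if count >= threshold)


# Hypothetical stand counts, e.g. noted by hand from the database status page.
sample = {"forest-1": 12, "forest-2": 35, "forest-3": 8}
print(forests_at_risk(sample))
```

If a check like this starts reporting forests during the rebalance, that would be the point to turn merging back on rather than wait for rebalancing to finish.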
If merging has such a heavy impact, you can look at tuning the merge configuration. More information can be found in the documentation:
Dangers of Disabling Merges
Understanding and Controlling Database Merges
Setting Merge Policy
Configuring Merge Policy Rules