I have to change the structure of all existing documents in one of my CouchDB databases that contain a certain field. Right now, the field is just a simple String, for example:
{
// some other fields
"parameters": {
"typeId": "something",
"otherField": "dont_care"
}
}
In this example, the field I'm interested in is "typeId". I want to make it an array of Strings because the requirements for this were modified :( But I obviously need to keep the current value of the field in all documents! So, from the example above, the result would be:
{
// some other fields
"parameters": {
"typeId": [ "something" ], // now we can have more items here
"otherField": "dont_care"
}
}
Any ideas how this can be achieved??
Just in case this helps: my Java web-application communicates with CouchDB through the Ektorp library.
I would say first write a function (or method, or class) that converts old-style documents into new-style documents and also correctly handles irrelevant documents (such as a design document) if necessary. Write some unit tests until you are confident about this code.
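As a rough sketch of such a conversion routine, assuming each document is held as a plain java.util.Map (for example, as a JSON library might deserialize it). The class and method names here are made up for illustration and are not part of Ektorp or CouchDB:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the names are illustrative only.
public class TypeIdConverter {

    /** True if parameters.typeId is still stored as a plain String. */
    public static boolean isOldStyle(Map<String, Object> doc) {
        Object params = doc.get("parameters");
        return params instanceof Map
                && ((Map<?, ?>) params).get("typeId") instanceof String;
    }

    /**
     * Returns a converted copy of an old-style document. Anything else
     * (design documents, already converted documents, documents without
     * the field) is returned unchanged.
     */
    public static Map<String, Object> convert(Map<String, Object> doc) {
        if (!isOldStyle(doc)) {
            return doc;
        }
        // Copy the top level so _id, _rev and all other fields are preserved.
        Map<String, Object> result = new LinkedHashMap<>(doc);

        @SuppressWarnings("unchecked")
        Map<String, Object> params =
                new LinkedHashMap<>((Map<String, Object>) doc.get("parameters"));

        // Wrap the existing value in a one-element list.
        List<String> typeIds = new ArrayList<>();
        typeIds.add((String) params.get("typeId"));
        params.put("typeId", typeIds);

        result.put("parameters", params);
        return result;
    }
}
```

Because convert leaves anything that is not old-style untouched, running it twice on the same document is harmless, which makes it easy to unit test and safe to re-run.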
The next step is basically a loop of finding old-style documents and updating them to become new-style documents, using your modification routine.
If you have a small data set, you can simply query
/_all_docs?include_docs=true
and work on your entire data set in one batch. If you have a larger data set, perhaps write a view which will identify old-style documents. This view will show you all old-style documents to do. To grab 50 more documents to convert, GET
/my_db/_design/converter/_view/to_do?limit=50
Each row's "value" field will be a complete copy of the document, so you can run it through your converter function immediately.
Once you convert a document, you can either POST it back to the database, or build up a batch and use _bulk_docs to do the same. (Bulk docs is the same thing, just a little faster.) As each document is stored, it will disappear from the to_do view. (If you get a 409 Conflict error, just ignore it.) Re-run this procedure until there are 0 rows in to_do and you're done!
You can judge from your situation how careful you need to be. If this is production data, you had better write good unit tests! If it is a development environment, just go for it!
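The fetch, convert, store loop described above could be sketched roughly like this. The DocStore interface is a hypothetical stand-in for whatever actually talks to CouchDB (Ektorp or raw HTTP); it is not an existing API, and the map function shown in the comment is only a guess at what a to_do view might look like:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch; none of these names come from Ektorp or CouchDB.
public class MigrationLoop {

    /** Stand-in for the real database access code. */
    public interface DocStore {
        /**
         * Next batch of old-style documents, e.g. the row values of
         * /my_db/_design/converter/_view/to_do?limit=batchSize.
         * One possible map function for such a view (an assumption):
         *
         *   function (doc) {
         *     if (doc.parameters && typeof doc.parameters.typeId === "string") {
         *       emit(doc._id, doc);
         *     }
         *   }
         */
        List<Map<String, Object>> fetchToDo(int batchSize);

        /**
         * Stores the documents, e.g. through _bulk_docs. Documents that hit
         * a 409 Conflict are skipped; they stay in to_do and are picked up
         * again by a later batch.
         */
        void bulkSave(List<Map<String, Object>> docs);
    }

    /**
     * Repeats fetch, convert, store until the to-do list is empty.
     * Returns the number of documents submitted for saving.
     * Note: if the converter never turns a document into a new-style one,
     * that document never leaves to_do and this loop will not terminate.
     */
    public static int migrate(DocStore store,
                              UnaryOperator<Map<String, Object>> converter,
                              int batchSize) {
        int submitted = 0;
        while (true) {
            List<Map<String, Object>> batch = store.fetchToDo(batchSize);
            if (batch.isEmpty()) {
                return submitted; // 0 rows left: done
            }
            List<Map<String, Object>> updated = new ArrayList<>();
            for (Map<String, Object> doc : batch) {
                updated.add(converter.apply(doc));
            }
            store.bulkSave(updated);
            submitted += updated.size();
        }
    }
}
```

Keeping the database access behind a small interface like this also makes the loop testable against an in-memory fake before pointing it at real data.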
A final trick is to create a new, empty database and replicate your main database to it. Now you have a duplicate sandbox to try your ideas. You can delete and re-replicate your sandbox until you are happy with your results.