Let's say I have 50 documents I want to ingest into an index. After ingesting them, I can retrieve all 50 documents by querying Elasticsearch.
At a later time, perhaps through an automated process, the same 50 documents get ingested again. Now when I query, I see pairs of documents that are identical except for their _id values. Within each pair, the 20-character _id values differ in characters 0-1 and 16-19, but characters 2-15 are exactly the same. I assume these _id values are auto-generated, maybe with the first two characters being some sort of sequence number?
But how would I go about having each document map to the same _id every time it is ingested?
I expect each unique document to map to the same _id value so that my index does not fill up with multiple copies of the exact same information.
You can use the fingerprint ingest processor to calculate an _id field from the fields in your document. You'll need to decide which fields to use, that is, what makes a document "unique": is it all fields, or are there specific identifier fields you want to use?
Then you define an ingest pipeline, such as:
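Here is a minimal sketch of such a pipeline, assuming the documents are uniquely identified by three hypothetical fields, first_name, last_name, and date_of_birth (the pipeline name dedup-pipeline is also just an example). The fingerprint processor hashes the listed fields, and writing the hash to the _id metadata field makes Elasticsearch use it as the document ID:

```console
PUT _ingest/pipeline/dedup-pipeline
{
  "description": "Derive a stable _id from a hash of the identifying fields",
  "processors": [
    {
      "fingerprint": {
        "fields": ["first_name", "last_name", "date_of_birth"],
        "target_field": "_id"
      }
    }
  ]
}
```

If you want every field to count toward uniqueness, list all of them in "fields". The processor also accepts a "method" parameter (e.g. "SHA-256") if you want a different hash than the default.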
When you ingest documents, you specify that you want to use this pipeline:
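For example, with the hypothetical index name my-index and the pipeline sketched above:

```console
POST my-index/_doc?pipeline=dedup-pipeline
{
  "first_name": "Jane",
  "last_name": "Doe",
  "date_of_birth": "1990-01-01"
}
```

Because the same field values always hash to the same _id, re-ingesting a document overwrites the existing copy (its _version increments) instead of creating a duplicate. If you don't want to pass the pipeline parameter on every request, you can set index.default_pipeline in the index settings so the pipeline is applied automatically.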