I have an Elasticsearch index that already contains multiple documents. I now want to add some new documents to it, some of which might be duplicates of documents already in the index. What's the best way to do this? I'm using elasticsearch-py for all CRUD operations.
How to upsert documents in an existing elasticsearch index?
8.2k Views Asked by David At
Every update in Elasticsearch marks the old document as deleted and indexes a new one. This is because documents are stored in segments, which are immutable. Whenever you index a new document or update an existing one, it is written to a new segment, and smaller segments are later combined into bigger ones during the merge process.
So even if the incoming data duplicates an existing document, as long as it has the same id it simply replaces that document. That is fine, and it is more performant than having the application first fetch the existing document, compare the two, and discard the update/upsert request when they match. Just index whatever comes in and let ES overwrite any document whose id already exists.
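A minimal sketch of that approach with elasticsearch-py's bulk helper is shown below. It assumes the documents have no natural key, so the id is derived from a hash of the document body. If your data already has a unique field, use that as `_id` instead. The names `content_id`, `build_actions`, `index_all` and the index name are hypothetical, chosen only for illustration.

```python
import hashlib
import json


def content_id(doc):
    """Derive a stable id from the document body.

    Assumption: the documents have no natural key, so identical
    content should map to the same _id.
    """
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def build_actions(docs, index_name):
    """Yield bulk 'index' actions.

    An 'index' action creates the document if the _id is new and
    replaces it if the _id already exists.
    """
    for doc in docs:
        yield {
            "_op_type": "index",
            "_index": index_name,
            "_id": content_id(doc),
            "_source": doc,
        }


def index_all(es, docs, index_name):
    """Send the actions to the cluster; `es` is an Elasticsearch client."""
    # Imported here so the action-building code runs without the client installed.
    from elasticsearch import helpers

    return helpers.bulk(es, build_actions(docs, index_name))
```

Calling `index_all` with an existing client object, a list of dicts and an index name sends everything in bulk. Duplicates hash to the same `_id` and overwrite the earlier copy instead of creating a second document. If you need to merge fields into an existing document rather than replace it, the Update API's `doc_as_upsert` option is the alternative to look at.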