We have a use case in Apache Storm where we need to get data from a source system and perform some operation on each tuple that is received, but we also want to look up data in a database. Making a database call for each of millions of records is not feasible. Is there a way to load a distributed hash map at startup so that, when a tuple is processed in a bolt or spout, we first look up this hash map and, only if the value is not present, make the database call and update the map, which should be accessible across the cluster?
How to maintain a distributed HashMap in an Apache Storm cluster
279 views. Asked by Lijo wilson.
Storm has nothing built in (i.e. nothing that works without running external services) that would be accessible to the entire topology, since your bolts will likely run in different JVMs or even on different hosts. If you need a distributed cache, look at something like Redis (https://redis.io/).
You might want to look at https://storm.apache.org/releases/2.0.0-SNAPSHOT/State-checkpointing.html. That API should be able to do what you want, and there's support for Redis integration. If you don't need the checkpointing functionality, you can of course also just use Redis directly.
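To make the lookup order concrete, here is a minimal sketch of the read-through pattern described in the question: check a per-worker map first, then the shared cache, and only then the database. Everything named here is an assumption for illustration: `ReadThroughLookup`, the `SharedCache` interface, and `InMemorySharedCache` are hypothetical, and the in-memory implementation only stands in for a real Redis client so the example runs without external services. The database is modelled as a plain `Function`.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Read-through lookup: per-worker map, then shared cache, then database. */
class ReadThroughLookup {

    /** Hypothetical abstraction over a shared cache such as Redis. */
    public interface SharedCache {
        Optional<String> get(String key);
        void put(String key, String value);
    }

    /** In-memory stand-in for experiments; a real implementation would call Redis. */
    public static class InMemorySharedCache implements SharedCache {
        private final Map<String, String> store = new ConcurrentHashMap<>();

        @Override
        public Optional<String> get(String key) {
            return Optional.ofNullable(store.get(key));
        }

        @Override
        public void put(String key, String value) {
            store.put(key, value);
        }
    }

    // Local to this instance (think: one bolt task in one worker JVM).
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final SharedCache shared;
    // Assumed database accessor: returns null when the key has no row.
    private final Function<String, String> database;

    ReadThroughLookup(SharedCache shared, Function<String, String> database) {
        this.shared = shared;
        this.database = database;
    }

    /** Returns the value for key, or null if the database has no entry either. */
    String lookup(String key) {
        String cached = local.get(key);
        if (cached != null) {
            return cached; // fastest path, no network call
        }
        Optional<String> fromShared = shared.get(key);
        String value;
        if (fromShared.isPresent()) {
            value = fromShared.get();
        } else {
            value = database.apply(key); // last resort: hit the database
            if (value == null) {
                return null; // misses are not cached in this sketch
            }
            shared.put(key, value); // publish so other workers can reuse it
        }
        local.put(key, value);
        return value;
    }
}
```

In a topology, each bolt would typically build its own `ReadThroughLookup` in `prepare()`, with all instances pointing at the same shared cache; the first bolt to miss pays for the database call and the others find the value in the shared cache. The sketch does no expiry or invalidation, so values that change in the database would stay stale in both the local map and the shared cache.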