I have a problem where i need to prioritize some events to be processes earlier and some events lets say after the high priority events. Those events come from one source and i need to prioritize the streams depending on their event type priority to be either forwarded in the high priority or lower priority sink. I'm using kafka and akka kafka streams. So the main problem is i get a lot of traffic at a given point in time. What would here be the preferred scenario?
Fast Processing Topic and Slow Processing Topic - Akka Kafka
364 Views Asked by rollercoaster At
1
There are 1 best solutions below
Related Questions in JAVA
- Add image to JCheckBoxMenuItem
- How to access invisible Unordered List element with Selenium WebDriver using Java
- Inheritance in Java, apparent type vs actual type
- Java catch the ball Game
- Access objects variable & method by name
- GridBagLayout is displaying JTextField and JTextArea as short, vertical lines
- Perform a task each interval
- Compound classes stored in an array are not accessible in selenium java
- How to avoid concurrent access to a resource?
- Why does processing goes slower on implementing try catch block in java?
- Redirect inside java interceptor
- Push toolbar content below statusbar
- Animation in Java on top of JPanel
- JPA - How to query with a LIKE operator in combination with an AttributeConverter
- Java Assign a Value to an array cell
Related Questions in SCALA
- Spark .mapValues setup with multiple values
- Where do 'normal' println go in a scala jar, under Spark
- Serializing to disk and deserializing Scala objects using Pickling
- Where has "Show Type Info on Mouse Motion" gone in Intellij 14
- AbstractMethodError when mixing in trait nested in object - only when compiled and imported
- Scala POJO Aggregator Exception
- How to read in numbers from n lines into a Scala list?
- Spark pairRDD not working
- Scala Eclipse IDE compiler giving errors until "clean" is run
- How to port Slick 2.1 plain SQL queries to Slick 3.0
- Log of dependency does not show
- Getting unary error for escaped characters in Scala
- Akka actor invoked with a function delegate - is this bad practice?
- Json implicit format with recursive class definition
- How to create a executable jar reading files from local file system
Related Questions in APACHE-KAFKA
- Spark streaming + kafka throughput
- How to diagnose Kafka topics failing globally to be found
- kafka: what do 'soTimeout', 'bufferSize' and 'minBytes' mean for SimpleConsumer?
- Fail to create SparkContext
- Syntax error on tokens, delete these tokens - kafka spring integration demo application
- How could Kafka 0.8.2.1 with offsets.storage=kafka still require ZooKeeper?
- Message Queues: Per Message Guarantees
- How should a Kafka HLC figure out the # of partitions for a topic?
- Kafka multiple consumers for a partition
- Should Apache Kafka and Hadoop be installed seperatedly (on a diffrent cluster)?
- how does one combine kafka-node producer and node tail?
- How to fix NoClassDefFoundError with custom Kafka producer under Eclipse?
- Apache Samza's CheckpointTool won't give away partition offsets
- Offsets for Kafka Direct Approach in Spark 1.3.1
- Simulate kafka broker failures in multi node kafka cluster and what operations and tools to use to mitigate data loss issues
Related Questions in AKKA-STREAM
- Akka-http process requests with Stream
- Accessing the underlying ActorRef of an akka stream Source created by Source.actorRef
- Exceeded configured max-open-requests
- Akka Stream OnNext is not allowed
- How to respond with the result of an actor call?
- How to create a Source that can receive elements later via a method call?
- How to print dropped elements in a stream with OverflowStrategy?
- Can I implement my own OverflowStrategy?
- Live resources in Akka Stream flow description
- Collecting sink materialized values as source
- Akka Streams flow to handle paginated results doesn't complete
- How to make htttp request with akka stream for 10K request
- How do I create a Sink to read all bytes from the Source?
- Allow single connection with Akka stream as tcp server
- Elegant way of reusing akka-stream flows
Related Questions in AKKA-KAFKA
- Resetting offset for partition after (Re-)joining group
- Is this Akka Kafka Stream configuration benefits from Back Pressure mechanism of the Akka Streams?
- avro4s can not deserialize AnyRef
- How Akka stream internal and explicit buffer interact with the underlying kafka client settings in alpakka Kafka?
- Akka Kafka stream supervison strategy not working
- Fast Processing Topic and Slow Processing Topic - Akka Kafka
- Reactive-Kafka Stream Consumer: Dead letters occured
- How to group messages so that they go to the same partition every time?
- Akka streams Kafka consumer process parallel
- Incompatible equality constraint while using Akka Kafka Streams
- What is a good pattern for committing Kafka consumer offset after processing message?
- How do you return a future containing a list of messages after all available messages have been consumed from a Kafka topic?
- Getting last message from kafka Topic using akka-stream-kafka when connecting with websocket
- Troubles with AVRO schema update
- Kafka producer with adjustable amount of messages per second
Trending Questions
- UIImageView Frame Doesn't Reflect Constraints
- Is it possible to use adb commands to click on a view by finding its ID?
- How to create a new web character symbol recognizable by html/javascript?
- Why isn't my CSS3 animation smooth in Google Chrome (but very smooth on other browsers)?
- Heap Gives Page Fault
- Connect ffmpeg to Visual Studio 2008
- Both Object- and ValueAnimator jumps when Duration is set above API LvL 24
- How to avoid default initialization of objects in std::vector?
- second argument of the command line arguments in a format other than char** argv or char* argv[]
- How to improve efficiency of algorithm which generates next lexicographic permutation?
- Navigating to the another actvity app getting crash in android
- How to read the particular message format in android and store in sqlite database?
- Resetting inventory status after order is cancelled
- Efficiently compute powers of X in SSE/AVX
- Insert into an external database using ajax and php : POST 500 (Internal Server Error)
Popular Questions
- How do I undo the most recent local commits in Git?
- How can I remove a specific item from an array in JavaScript?
- How do I delete a Git branch locally and remotely?
- Find all files containing a specific text (string) on Linux?
- How do I revert a Git repository to a previous commit?
- How do I create an HTML button that acts like a link?
- How do I check out a remote Git branch?
- How do I force "git pull" to overwrite local files?
- How do I list all files of a directory?
- How to check whether a string contains a substring in JavaScript?
- How do I redirect to another webpage?
- How can I iterate over rows in a Pandas DataFrame?
- How do I convert a String to an int in Java?
- Does Python have a string 'contains' substring method?
- How do I check if a string contains a specific word?
The first thing to tackle is the offset commit. Because processing will not be in order, committing offsets after processing cannot guarantee at-least-once (nor can it guarantee at-most-once), because the following sequence is possible (and the probability of this cannot be reduced to zero):
This then suggests that either the offset commit will have to happen before the reordering or we'll need a notion of processed-but-not-yet-committable until the low-priority messages have been processed. Noting that for the latter option, tracking the greatest offset not committed (the simplest strategy which could possibly work) will not work if there's anything which could create gaps in the offset sequence which implies infinite retention and no compaction, I'd actually suggest committing the offsets before processing, but once the processing logic has guaranteed that it will eventually process the message.
A combination of actors and Akka Persistence allows this approach to be taken. The rough outline is to have an actor which is persistent (this is a good fit for event-sourcing) and basically maintains lists of high-priority and low-priority messages to process. The stream sends an "ask" with the message from Kafka to the actor, which on receipt classifies the message as high-/low-priority, assuming that the message hasn't already been processed. The message (and perhaps its classification) is persisted as an event and the actor acknowledges receipt of the message and that it commits to processing it by scheduling a message to itself to fully process a "to-process" message. The acknowledgement completes the ask, allowing the offset to be committed to Kafka. On receipt of the message (a command, really) to process a message, the actor chooses the Kafka message to process (by priority, age, etc.) and persists that it's processed that message (thus moving it from "to-process" to "processed") and potentially also persists an event updating state relevant to how it interprets Kafka messages. After this persistence, the actor sends another command to itself to process a "to-process" message.
Fault-tolerance is then achieved by having a background process periodically pinging this actor with the "process a to-process message" command.
As with the stream, this is a single-logical-thread-per-partition process. It's possible that you are multiplexing many partitions worth of state per physical Kafka partition, in which case you can have multiple of these actors and send multiple asks from the ingest stream. If doing this, the periodic ping is likely best accomplished by a stream fed by an Akka Persistence Query to get the identifiers of all the persistent actors.
Note that the reordering in this problem makes it fundamentally a race and thus non-deterministic: in this design sketch, the race is because for messages M1 from actor B and M2 from actor C sent to actor A may be received in any order (if actor B sent a message M3 to actor A after it sent message M1, M3 would arrive after M1 but could arrive before or after M2). In a different design, the race could occur based on speed of processing relative to the latency for Kafka to make a message available for consumption.