I am currently using Kafka and Flink in conjunction, where I read data from Kafka, process it in Flink using SQL, and then send it back to Kafka.
The issue I'm facing is that Flink operates well initially, but after a while, Backpressure occurs, causing the processing to come to a halt.
I would like to know if there are any settings or configurations available to ensure a smooth flow of this pipeline.
Captured web screen when an issue (Backpressure(max):100% - Busy(max):100%) occurred. enter image description here
The SQL query corresponding to Busy(max) is this one
SELECT last.Did, split.Sid
FROM (
SELECT Did, rowtime, Sid
FROM (
SELECT Did, Sid, rowtime, ROW_NUMBER()
OVER (PARTITION BY Did ORDER BY rowtime DESC) AS numrow
FROM composition
)
WHERE numrow = 1
) as last CROSS JOIN UNNEST(last.Sid) AS split (Sid)
I attempted clustering, but the issue has resurfaced once again
It seems like the SQL query needs normalization, and I need assistance with that.