I'm fairly new to Flink and trying to understand the appropriate use cases for the DataStream API and the Table API. As part of that, I'm trying to understand:
- Like the DataStream API, does the Table API have the flexibility to choose the type of state backend it uses?
- What are all the available backends for the Table API? Does it require an external datastore such as MySQL, or any other datastore?
In short, I'm trying to understand how the backend used by the Table API works.
Per se, Flink is not a good long-term persistent store; it is more of a processing system. You'll want to keep your long-term persistent state in MySQL/Kafka/Cassandra/S3/etc.
This being said, some computations require internal state bookkeeping. When you do a streaming word count, for example, some kind of transient integer state is kept per word. Without anything more, your job wouldn't be safe from machine failures and restarts. This is why the state backend exists: it can save where the job was in the computation (e.g. Kafka offsets) as well as the values the counts were at.

So, to answer your questions:
It does. The internal Blink query planner uses the same code (save for a different application of execution-plan rules). Choosing a state backend makes a bit more sense to me in a streaming context, or for a very expensive batch job (you might want to use cheap AWS Spot instances for your Task Managers, and you would then be robust to instance preemption). This page might help you make a choice.
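To illustrate, the backend choice is typically a cluster-level setting in `flink-conf.yaml` (it can also be set per job in code). A hedged sketch, with a made-up checkpoint bucket; key names follow recent Flink releases, older ones used values like `filesystem`:

```yaml
# Sketch: selecting a state backend in flink-conf.yaml.
# 'hashmap' keeps state on the JVM heap; 'rocksdb' spills to local disk
# and supports incremental checkpoints for very large state.
state.backend: rocksdb
state.backend.incremental: true
# Where checkpoints are durably stored (placeholder bucket name):
state.checkpoints.dir: s3://my-bucket/flink-checkpoints
```

The same setting applies whether the job was written with the DataStream API or the Table API, since both run on the same runtime.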
The state backends are listed in the link I provided. Separately, you might want to read from wherever your data currently lives, and write to wherever it should go post-computation; lots of data stores are supported. With a very large grain of salt, the distinction is: some connectors can stream data (the DataStream connectors), while some cannot (the Table / SQL connectors). For example, a MySQL JDBC DataStream connector will only have a sink, while in the Table API it could be both a sink and a source.
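For example, registering a MySQL-backed table with the JDBC Table/SQL connector might look like the sketch below; the URL, table name, and credentials are placeholders. Once declared, the table can be read from (as a bounded source) and written to (as a sink) in Flink SQL:

```sql
-- Sketch: a JDBC-backed table usable as both source and sink in the Table API.
-- Connection details below are made up.
CREATE TABLE word_counts (
  word STRING,
  cnt  BIGINT,
  PRIMARY KEY (word) NOT ENFORCED
) WITH (
  'connector'  = 'jdbc',
  'url'        = 'jdbc:mysql://localhost:3306/mydb',
  'table-name' = 'word_counts',
  'username'   = 'flink',
  'password'   = 'secret'
);
```

Note that MySQL here is the durable store your results land in; it is not the state backend the job uses for its internal bookkeeping.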
As a side note: the state backend is indeed queryable, but IMHO it is better suited for debugging purposes.
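To make the word-count bookkeeping above concrete, here is a Flink-free sketch: a plain `HashMap` stands in for the per-word keyed state that a Flink state backend would otherwise hold and checkpoint. The class and method names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: in a real Flink job, each count would live in keyed state
// managed by the configured state backend, so it survives failures and
// restarts. Here the map simply lives on the JVM heap.
public class WordCountState {
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // One integer of "state" per distinct word: add 1, or start at 1.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            count(new String[] {"flink", "state", "flink"});
        System.out.println(counts);
    }
}
```

If the machine holding this map dies, the counts are gone; with a state backend and checkpointing, Flink could restore them and resume from the saved Kafka offsets.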