Are checkpoints needed for a stream processed in databricks job running via continuous trigger?


We have a requirement to process a data stream in a databricks notebook job and load to a delta table.

I noticed that a new "Continuous" trigger is available for Databricks jobs, and we started using it.

[Screenshot: the function we use to read the stream]

We created a function like the one above to read the stream.

So far there is only one consumer group that is reading it.

Will this need checkpoints or not? I am a newbie with streams, so any guidance or best practices would be helpful.
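For context, a stream-reading function of this shape might look like the sketch below. This is hypothetical: the original screenshot is not reproduced here, and the Kafka source plus every option value are assumptions (the mention of a consumer group suggests Kafka or Azure Event Hubs).

```python
def read_event_stream(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame reading from Kafka.

    All names here are hypothetical; the real job may use Event Hubs,
    Kinesis, or another source with equivalent options.
    """
    return (
        spark.readStream
        .format("kafka")  # assumed source
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .option("startingOffsets", "earliest")  # read from the beginning on first run
        .load()
    )
```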

Alex Ott (accepted answer):

Yes, a checkpoint is necessary regardless of the job type. The continuous trigger just means that the Databricks Workflows manager will make sure your job is always running. But a checkpoint is needed to track which data has already been processed, in case the job is restarted for some reason (a crash, deployment of a new version, etc.).
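A minimal sketch of the write side, assuming a Delta sink (the table name and checkpoint path are hypothetical); the `checkpointLocation` option is the key part:

```python
def write_events_to_delta(source_df):
    """Append a streaming DataFrame to a Delta table with a checkpoint.

    checkpointLocation is what lets Structured Streaming track which data
    has already been processed, so a restarted job resumes where it left
    off instead of reprocessing or skipping records.
    """
    return (
        source_df.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", "/tmp/checkpoints/events_bronze")  # hypothetical path
        .toTable("events_bronze")  # hypothetical target table
    )
```

Each streaming query needs its own dedicated checkpoint directory, and the path must be stable across restarts (i.e., on durable storage, not ephemeral cluster-local disk), otherwise the restarted job cannot recover its progress.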