I am writing a Google Cloud Dataflow pipeline, and one of its sources needs to be the result set of a MySQL query. A couple of questions:
- What is the proper way to extract data from MySQL as a step in my pipeline? Can this simply be done in-line using JDBC?
- If I do need to implement a "User-Defined Data Format" that wraps MySQL as a source, does an implementation already exist so that I don't have to reinvent the wheel? (Don't get me wrong, I would enjoy writing it, but I imagine using MySQL as a source is quite a common scenario.)
Thanks all!
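A JDBC connector has just been added to Apache Beam (incubating). See JdbcIO, which reads the results of a SQL query into a PCollection, so you do not need to write your own source.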
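
As a rough illustration, a read from MySQL with JdbcIO might look like the sketch below. It assumes the Beam Java SDK, the `beam-sdks-java-io-jdbc` module, and a MySQL JDBC driver are on the classpath. The host name, database, credentials, the `person` table, and its `id`/`name` columns are placeholders made up for this example, and the driver class name depends on your Connector/J version.

```java
import java.sql.ResultSet;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.BigEndianIntegerCoder;
import org.apache.beam.sdk.coders.KvCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.jdbc.JdbcIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

class MySqlSourceSketch {

  // Builds the read transform only; nothing connects to MySQL until the pipeline runs.
  static JdbcIO.Read<KV<Integer, String>> readPeople() {
    return JdbcIO.<KV<Integer, String>>read()
        .withDataSourceConfiguration(
            JdbcIO.DataSourceConfiguration.create(
                    "com.mysql.jdbc.Driver",             // assumption: older Connector/J class name
                    "jdbc:mysql://hostname:3306/mydb")   // placeholder host and database
                .withUsername("username")                // placeholder credentials
                .withPassword("password"))
        .withQuery("SELECT id, name FROM person")        // hypothetical table and columns
        .withRowMapper(
            new JdbcIO.RowMapper<KV<Integer, String>>() {
              @Override
              public KV<Integer, String> mapRow(ResultSet resultSet) throws Exception {
                // Turn one JDBC row into one pipeline element.
                return KV.of(resultSet.getInt(1), resultSet.getString(2));
              }
            })
        // Tells Beam how to serialize the mapped elements between workers.
        .withCoder(KvCoder.of(BigEndianIntegerCoder.of(), StringUtf8Coder.of()));
  }

  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // The query result becomes an ordinary PCollection for downstream transforms.
    PCollection<KV<Integer, String>> people = pipeline.apply("ReadFromMySQL", readPeople());

    pipeline.run().waitUntilFinish();
  }
}
```

The row mapper is where the JDBC `ResultSet` handling lives, so the query and the mapping are the only MySQL-specific parts; the rest of the pipeline works with the resulting `PCollection` as it would with any other source.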