In GCP we can see the pipeline execution graph. Is the same possible when running locally via DirectRunner?
Way to visualize Beam pipeline run with DirectRunner
Asked by Dimon Buzz
There are 2 best solutions below

You can also use Python's RenderRunner, e.g.
python -m apache_beam.examples.wordcount --output out.txt \
--runner=apache_beam.runners.render.RenderRunner \
--render_output=pipeline.svg
This also has an interactive mode, triggered by passing --port=N (where 0 can be used to pick an unused port), which serves the graph as a local web service. This lets you expand and collapse composites for easier exploration. Any --render_output arguments that are passed get re-rendered as you edit the graph. (It uses graphviz under the hood, so it can render to any format graphviz supports.)
For rendering non-Python pipelines, one can start this up as a local portable "runner."
python -m apache_beam.runners.render
and then "submit" this job from your other SDK over the provided jobs API endpoint via a portable runner to view it.
You can use pipeline_graph and the InteractiveRunner to get a graphviz representation of your pipeline locally. An example for the word count pipeline used in the Beam documentation:
This prints the pipeline graph in graphviz's DOT format.
Pasting that output into https://edotor.net renders it as a diagram.
If needed, you can also work with GraphViz from Python to produce prettier output (with the graphviz package, for example).
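As a rough illustration of the "prettier output" idea, here is a sketch that uses only the standard library: it injects graph-wide style attributes into DOT text and, if the Graphviz dot binary happens to be on the PATH, renders an SVG. It calls the dot command line tool rather than the graphviz Python package. SAMPLE_DOT, style_dot, and render_svg are hypothetical names, and the sample graph is a stand-in for the real pipeline output.

```python
import shutil
import subprocess

# Hypothetical DOT text standing in for the printed pipeline graph.
SAMPLE_DOT = """digraph G {
Create -> Split;
Split -> Count;
}"""


def style_dot(dot_source, rankdir="LR"):
    """Insert graph-wide style attributes right after the opening brace."""
    head, brace, body = dot_source.partition("{")
    if not brace:
        raise ValueError("not a DOT graph: missing '{'")
    style = (
        f"\n  rankdir={rankdir};"
        '\n  node [shape=box, style="rounded,filled", fillcolor=lightyellow];'
        "\n  edge [color=gray40];"
    )
    return head + brace + style + body


def render_svg(dot_source, path):
    """Render to SVG with the Graphviz `dot` binary; return False if it is missing."""
    if shutil.which("dot") is None:
        return False
    # With no input file given, `dot` reads the graph from standard input.
    subprocess.run(
        ["dot", "-Tsvg", "-o", path], input=dot_source, text=True, check=True
    )
    return True


if __name__ == "__main__":
    print(style_dot(SAMPLE_DOT))
```

Calling render_svg(style_dot(SAMPLE_DOT), "pipeline.svg") would then write the styled graph to a file. Attributes that a node or edge already sets explicitly take precedence over these defaults.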