Zeppelin Python Flink cannot print to console

364 Views Asked by At

I'm using Kinesis Data Analytics Studio which provides a Zeppelin environment.

Very simple code:

%flink.pyflink

from pyflink.common.serialization import JsonRowDeserializationSchema
from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors import FlinkKafkaConsumer

# create env = determine app runs locally or remotely

env = s_env or StreamExecutionEnvironment.get_execution_environment()
env.add_jars("file:///home/ec2-user/flink-sql-connector-kafka_2.12-1.13.5.jar")

# create a kafka consumer

deserialization_schema = JsonRowDeserializationSchema.builder() \
    .type_info(type_info=Types.ROW_NAMED(
      ['id', 'name'], 
      [Types.INT(), Types.STRING()])
    ).build()

kafka_consumer = FlinkKafkaConsumer(
    topics='nihao',
    deserialization_schema=deserialization_schema,
    properties={
      'bootstrap.servers': 'kakfa-brokers:9092', 
      'group.id': 'group1'
})

kafka_consumer.set_start_from_earliest()

ds = env.add_source(kafka_consumer)

ds.print()

env.execute('job1')

I can get this working locally can sees change logs being produced to console. However I cannot get the same results in Zeppelin.

Also checked STDOUT in Flink web console task managers, nothing is there too.

Am I missing something? Searched for days and could not find anything on it.

1

There are 1 best solutions below

0
On

I'm not 100% sure but I think you may need a sink to begin pulling data through the datastream, you could potentially use the included Print Sink Function