I want to process data in HDFS (for example, validate a CSV column) using Falcon. I have successfully installed Falcon (Hortonworks Sandbox 2.1, Falcon 0.5.0.2.1.1.0) and am able to submit a job. However, the job is not running, and the UI has nothing to start or stop the job.
I would like to know how to validate the output of a job and proceed to another job depending on the validation result of the first job, i.e. a workflow.
Usage of Falcon for Big data processing
332 Views Asked by pktippa At

You mentioned that the job was submitted. If you are using the Apache Falcon command line, "submit" alone is not enough; the "schedule" command must also be run. In Falcon, submitting a job only registers it and does not put it into the running state; scheduling it is what starts it.
You can refer to http://falcon.apache.org/0.6.1/FalconCLI.html for all the commands.
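To make the submit-then-schedule sequence concrete, here is a minimal sketch that only builds the Falcon CLI command lines. It assumes the `falcon` client script is on your PATH; the entity name `csv-validate` and the file `process.xml` in the usage note are hypothetical placeholders, not values from the question.

```python
import subprocess

FALCON = "falcon"  # assumption: the Falcon client script is on PATH


def submit_cmd(entity_type, xml_path):
    """Command that registers an entity definition; it does not start it."""
    return [FALCON, "entity", "-type", entity_type, "-submit", "-file", xml_path]


def schedule_cmd(entity_type, name):
    """Command that schedules an already-submitted entity so it starts running."""
    return [FALCON, "entity", "-type", entity_type, "-name", name, "-schedule"]


def status_cmd(entity_type, name):
    """Command that asks Falcon for the current status of an entity."""
    return [FALCON, "entity", "-type", entity_type, "-name", name, "-status"]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

For example, `run([submit_cmd("process", "process.xml"), schedule_cmd("process", "csv-validate"), status_cmd("process", "csv-validate")])` prints the three commands in order; pass `dry_run=False` on a machine where Falcon is installed to actually execute them.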
If you need custom logic, you can create an Oozie workflow and have that workflow submit a Falcon job as its last task.
https://falcon.apache.org/EntitySpecification.html#Process_Specification
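As an illustration of the validation step itself, here is a hedged sketch of a script that such a workflow could run before the next job. It is not part of Falcon or Oozie; the function names, the "column must be an integer" rule, and reading from a local path are all assumptions made for the example. The idea is that the script returns a non-zero exit code when validation fails, which a workflow step can treat as a failure and use to skip the following job.

```python
import csv
import sys


def is_integer(value):
    """Hypothetical rule: the column value must parse as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def invalid_rows(lines, column, check):
    """Return 1-based row numbers whose `column` is missing or fails `check`."""
    bad = []
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if column >= len(row) or not check(row[column]):
            bad.append(row_number)
    return bad


def main(path, column):
    """Validate one column of a CSV file; return 0 if valid, 1 otherwise."""
    with open(path, newline="") as handle:
        bad = invalid_rows(handle, column, is_integer)
    if bad:
        print("invalid rows: %s" % bad, file=sys.stderr)
        return 1
    return 0
```

A caller would end with something like `sys.exit(main("data.csv", 2))`, where `data.csv` is a placeholder. The sketch treats every line as data, so a header row would need to be skipped first.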
Hope it helps.