I would like to read a file from HDFS into Spark via HttpFS or WebHDFS. Something along the lines of:
sc.textFile("webhdfs://myhost:14000/webhdfs/v1/path/to/file.txt")
or, ideally,
sc.textFile("httpfs://myhost:14000/webhdfs/v1/path/to/file.txt")
Is there a way to get Spark to read the file over WebHDFS/HttpFS?
I believe WebHDFS and HttpFS act like streaming sources that transmit the data over a REST API.
If so, Spark Streaming could be used to receive the data from WebHDFS/HttpFS.
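
For reference, here is a rough sketch of the two address forms involved, assuming the host and port from the question. The helper names webhdfs_open_url and webhdfs_uri are made up for illustration. The first builds the REST URL for the WebHDFS OPEN operation, which HttpFS also serves. The second builds the webhdfs:// URI that Hadoop's FileSystem client understands; note that it leaves out the /webhdfs/v1 prefix, since the client adds that itself. As far as I know Hadoop registers no httpfs:// scheme, so the webhdfs:// form pointed at the HttpFS port is the one to try. I have not verified this against a live cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_open_url(host, port, hdfs_path, user=None):
    """Build the WebHDFS REST URL that opens (reads) a file.

    hdfs_path is the absolute HDFS path, e.g. "/path/to/file.txt".
    """
    params = {"op": "OPEN"}
    if user is not None:
        # Pseudo-authentication parameter; whether it is needed depends
        # on how the cluster is configured (an assumption here).
        params["user.name"] = user
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        host, port, quote(hdfs_path), urlencode(params)
    )


def webhdfs_uri(host, port, hdfs_path):
    """Build the webhdfs:// URI form used by Hadoop's FileSystem API.

    Unlike the REST URL, this form has no /webhdfs/v1 prefix.
    """
    return "webhdfs://{}:{}{}".format(host, port, hdfs_path)


if __name__ == "__main__":
    print(webhdfs_open_url("myhost", 14000, "/path/to/file.txt"))
    print(webhdfs_uri("myhost", 14000, "/path/to/file.txt"))
```

With a SparkContext available, the second string is what would be passed to sc.textFile, e.g. sc.textFile(webhdfs_uri("myhost", 14000, "/path/to/file.txt")), provided the Hadoop client libraries on the Spark classpath include the WebHDFS file system.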