sparkcontext for local spark cluster


Could someone please tell me how to adapt the HDFS URIs in the following code so that they work against my local Spark 'cluster'?

var lines = sparkContext.TextFile(@"hdfs://path/to/input.txt");  
// some more code
wordCounts.SaveAsTextFile(@"hdfs://path/to/wordcount.txt");  

1 Answer

Answer from Nitin:

You don't need the hdfs:// prefix at all: the default filesystem is picked up from the Hadoop configuration on the SparkContext, so a plain path like the one below is fine when running the application in YARN mode:

var lines = sparkContext.TextFile("/path/to/input.txt");  
// some more code
wordCounts.SaveAsTextFile("/path/to/wordcount.txt");  
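
Which filesystem a scheme-less path like the one above resolves to is decided by Hadoop's `fs.defaultFS` setting, not by Spark itself. As a sketch (host and port are placeholders, not values from the question), the relevant `core-site.xml` entry looks like this:

```xml
<!-- core-site.xml: scheme-less paths resolve against this filesystem -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode:9000</value>
</property>
```

For a purely local setup with no HDFS daemon running, `fs.defaultFS` can instead be set to `file:///`, and the same scheme-less paths will then resolve against the local filesystem.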

Alternatively, you can specify the HDFS location explicitly:

var lines = sparkContext.TextFile("hdfs://namenode:port/path/to/input.txt");

You can also pass the number of partitions, which is optional:

var lines = sparkContext.TextFile("/path/to/input.txt",[number of partitions]);