C# Apache Spark: check whether an ORC file path exists on ADLS

Spark newbie here. I have a large data set that is collected and stored on ADLS in folders according to the date the data occurred on. Each folder is named after its date (example: <2020-12-04>). I am trying to query the most recent data that occurred within the last 30 days. Currently, I read from ADLS and swap out the date until I get a hit, but I can't find a way to check whether the provided path is valid, so the read results in an error. Any pointers would be helpful.

while (!folderFound)
{
  string path = $"adls://<adlsaccount>/{listofdates[i]}/<file>";
  DataFrame df = spark.Read().Orc(path); // need to know if the path is valid so it doesn't error
  .
  .
  .
}
// do some work once we get a successful read
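
A minimal sketch of one possible way to structure the loop above, not a confirmed solution: rather than checking the path up front, attempt each read and treat a thrown exception as "path not found". `FirstReadable` and `TryReadFirst` are hypothetical names introduced here for illustration; they are not part of .NET for Apache Spark. The sketch assumes that reading a missing path throws at the read call, which should be verified against the Spark version in use.

```csharp
using System;
using System.Collections.Generic;

public static class FirstReadable
{
    // Hypothetical helper (not a Spark API): tries each candidate path in order
    // and returns the first one that `read` loads without throwing.
    public static bool TryReadFirst<T>(
        IEnumerable<string> paths,
        Func<string, T> read,
        out string foundPath,
        out T result)
    {
        foreach (string path in paths)
        {
            try
            {
                result = read(path);
                foundPath = path;
                return true;
            }
            catch (Exception)
            {
                // Assumption: a missing path surfaces as an exception from the read.
                // Note that this also swallows unrelated failures (auth, network),
                // so logging the exception here may be worthwhile.
            }
        }

        // No candidate could be read.
        foundPath = null;
        result = default;
        return false;
    }
}
```

With the names from the question, this could be called as `FirstReadable.TryReadFirst(paths, p => spark.Read().Orc(p), out string hit, out DataFrame df)`, where `paths` is built from `listofdates` ordered newest first. Passing the reader in as a delegate keeps the helper independent of Spark, so it can be exercised with a fake reader.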