I want to list all the FILES in a specified directory and in all of its subdirectories. No directories should be listed.
My current code is below. It does not work properly: it lists only the files and directories directly inside the specified directory and never descends into subdirectories.
How can I fix this?
final List<Path> files = new ArrayList<>();
Path path = Paths.get("C:\\Users\\Danny\\Documents\\workspace\\Test\\bin\\SomeFiles");
try
{
    DirectoryStream<Path> stream;
    stream = Files.newDirectoryStream(path);
    for (Path entry : stream)
    {
        files.add(entry);
    }
    stream.close();
}
catch (IOException e)
{
    e.printStackTrace();
}
for (Path entry : files)
{
    System.out.println(entry.toString());
}
Using RxJava, the requirement can be met in a number of ways while still using DirectoryStream from the JDK.
The following combinations give the desired effect. I explain them in sequence:
Approach 1. A recursive approach using the flatMap() and defer() operators
Approach 2. A recursive approach using the flatMap() and fromCallable() operators
Note: If you replace flatMap() with concatMap(), the directory tree is necessarily traversed depth-first (DFS). With flatMap(), depth-first order is not guaranteed.
Approach 1: Using flatMap() and defer()
This approach finds the children of the given directory and emits them as Observables. If a child is a file, it is immediately available to a subscriber; otherwise flatMap() on Line X invokes the method recursively, passing each subdirectory as the argument. For each such subdirectory, flatMap() internally subscribes to its children all at the same time. This is like a chain reaction that needs to be controlled.
Therefore Runtime.getRuntime().availableProcessors() is passed as the maximum concurrency level for flatMap(), which prevents it from subscribing to all subfolders at the same time. Without a concurrency limit, imagine what would happen if a folder had 1000 children.
Using defer() prevents a DirectoryStream from being created prematurely and ensures it is created only when a real subscription to find its subfolders is made.
Finally, the method returns an Observable<Path> so that a client can subscribe and do something useful with the results, as shown below:
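A minimal sketch of Approach 1 together with a client, assuming RxJava 2.x is on the classpath. The class name DeferFileWalker, the helper expand() and the constant MAX_CONCURRENCY are illustrative names, not taken from the original listing; traverse() borrows the method name used later in this answer. Turning the IOException into an onError event is this sketch's choice.

```java
import io.reactivex.Observable;

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class DeferFileWalker {

    // Upper bound on how many inner Observables flatMap() subscribes to at once.
    private static final int MAX_CONCURRENCY = Runtime.getRuntime().availableProcessors();

    static Observable<Path> traverse(Path dir) {
        // defer() postpones opening the DirectoryStream until someone subscribes.
        return Observable.<Path>defer(() -> {
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
                // Copy the entries: the stream can be iterated only once and
                // is closed as soon as this block exits.
                List<Path> children = new ArrayList<>();
                for (Path child : stream) {
                    children.add(child);
                }
                return Observable.fromIterable(children)
                        /* Line X */ .flatMap(DeferFileWalker::expand, MAX_CONCURRENCY);
                /* Line Y: .concatMap(DeferFileWalker::expand) for strictly sequential traversal */
            } catch (IOException e) {
                // Choice made in this sketch: report the failure as an onError event.
                return Observable.<Path>error(e);
            }
        });
    }

    // A file is emitted as-is; a folder is replaced by the files found beneath it.
    private static Observable<Path> expand(Path p) {
        return isFolder(p) ? traverse(p) : Observable.just(p);
    }

    static boolean isFolder(Path p) {
        return Files.isDirectory(p);
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // The client subscribes and prints every file; directories never reach it.
        traverse(root).subscribe(System.out::println, Throwable::printStackTrace);
    }
}
```

Nothing in this sketch switches threads, so the subscription in main() runs to completion on the calling thread.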
A disadvantage of defer() is that it does not deal nicely with checked exceptions when its argument function throws one. Therefore, even though the DirectoryStream (which implements Closeable) was created in a try-with-resources block, we still had to catch the IOException, because the automatic closing of a DirectoryStream throws that checked exception. Note that in RxJava 2.x defer() accepts a Callable, whose call() method may throw, so there the catch block is a design choice rather than something the compiler demands.
In an Rx-based style, catch blocks for error handling look a bit odd, because in reactive programming even errors are delivered as events. So why not use an operator that exposes such errors as events?
A better alternative, fromCallable(), is available in RxJava 2.x. The second approach shows how to use it.
Approach 2: Using flatMap() and fromCallable()
This approach uses the fromCallable() operator, which takes a Callable as its argument. Since we want a recursive approach, the result of that Callable is an Observable of the children of the given folder. Since we want a subscriber to receive results as they become available, the method itself must also return an Observable. Because the result of the inner Callable is itself an Observable of children, the net effect is an Observable of Observables.
A subscriber will then need to flatten the results stream as shown below:
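A minimal sketch of Approach 2 and of the flattening client, again assuming RxJava 2.x. The names CallableFileWalker, expand() and MAX_CONCURRENCY are illustrative, and blockingSingle() stands in here for the blocking call on Line X; the original listing may have used a different blocking operator.

```java
import io.reactivex.Observable;

import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class CallableFileWalker {

    private static final int MAX_CONCURRENCY = Runtime.getRuntime().availableProcessors();

    static Observable<Observable<Path>> traverse(Path dir) {
        // An exception thrown inside the Callable is delivered as an onError
        // event, so no catch block is needed around the try-with-resources.
        return Observable.fromCallable(() -> {
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
                // Copy the entries before the stream is closed.
                List<Path> children = new ArrayList<>();
                for (Path child : stream) {
                    children.add(child);
                }
                return Observable.fromIterable(children)
                        /* Line X */ .flatMap(CallableFileWalker::expand, MAX_CONCURRENCY);
                /* Line Y: .concatMap(CallableFileWalker::expand) for strictly sequential traversal */
            }
        });
    }

    // traverse() emits exactly one inner Observable; blockingSingle() unwraps it
    // so that flatMap() receives an Observable<Path> to subscribe to.
    private static Observable<Path> expand(Path p) {
        return isFolder(p) ? traverse(p).blockingSingle() : Observable.just(p);
    }

    static boolean isFolder(Path p) {
        return Files.isDirectory(p);
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        traverse(root)
                .flatMap(inner -> inner) // flatten Observable<Observable<Path>> into Observable<Path>
                .subscribe(System.out::println, Throwable::printStackTrace);
    }
}
```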
In the traverse() method, why does Line X use a blocking get?
Because the recursive function returns an Observable<Observable<Path>>, but flatMap() at that line needs an Observable<Path> to subscribe to.
Why does Line Y in both approaches use concatMap()?
Because concatMap() is a comfortable replacement when we don't want the parallel inner subscriptions that flatMap() makes.
In both approaches, the implementation of the isFolder() method looks like this:
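A minimal sketch of such a check, assuming it simply delegates to the JDK's Files.isDirectory(); the wrapper class FolderCheck is an illustrative name, and the original implementation may have differed (for example by going through Path.toFile()).

```java
import java.nio.file.Files;
import java.nio.file.Path;

class FolderCheck {

    // True only for an existing directory. Symbolic links are followed by
    // default, so a link that points to a directory also counts as a folder.
    static boolean isFolder(Path p) {
        return Files.isDirectory(p);
    }
}
```

If the tree may contain symbolic links that point back to a parent directory, passing LinkOption.NOFOLLOW_LINKS to Files.isDirectory() keeps the recursion from looping.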
Maven coordinates for RxJava 2.x
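RxJava 2.x is published under the group io.reactivex.rxjava2 with the artifact name rxjava. The version below is an assumption (a late 2.x release), not the one the answer was originally written against; any 2.x version should do for the operators used here.

```xml
<dependency>
    <groupId>io.reactivex.rxjava2</groupId>
    <artifactId>rxjava</artifactId>
    <!-- Assumed version; substitute the 2.x release you actually use. -->
    <version>2.2.21</version>
</dependency>
```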
Imports in the Java file
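A sketch of the imports that the operators and JDK types named in this answer require, assuming RxJava 2.x, whose Observable lives in the io.reactivex package. The exact list in the original file is not known.

```java
import io.reactivex.Observable;       // RxJava 2.x

import java.io.IOException;           // needed only for the catch block in Approach 1
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
```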