I am facing a problem with the streams' dropWhile and takeWhile methods: the spliterator skips every other portion of the text (either the odd or the even ones). What should be done to process all portions of the text?
Here are my methods:
void read(Path filePath) {
    try {
        Stream<String> lines = Files.lines(filePath);
        while (true) {
            Spliterator<String> spliterator = lines.dropWhile(line -> !line.startsWith("FAYSAL:")).spliterator();
            Stream<String> portion = fetchNextPortion(spliterator);
            if (spliterator.estimateSize() == 0)
                break;
            portion.forEach(System.out::println);
            lines = StreamSupport.stream(spliterator, false);
        }
        lines.close();
    }
    catch (IOException e) {
        e.printStackTrace();
    }
}

private Stream<String> fetchNextPortion(Spliterator<String> spliterator) {
    return StreamSupport.stream(spliterator, false)
            .filter(this::isValidReportName)
            .peek(System.out::println)
            .findFirst()
            .map(first -> Stream.concat(Stream.of(first),
                    StreamSupport.stream(spliterator, false).takeWhile(line -> !line.startsWith("FAYSAL:"))))
            .orElse(Stream.empty());
}
Sample input is:
FAYSAL: 1
Some text here
Some text here
FAYSAL: 2
Some text here
Some text here
FAYSAL: 3
Some text here
Some text here
FAYSAL: 4
Some text here
Some text here
With this input, the portions FAYSAL: 2 and FAYSAL: 4 are skipped.
You could choose a different approach.
Your code produced a StackOverflowError on my machine (also, there is a call to fetchNextChunk but a method called fetchNextPartition, so I wasn't sure about that either) after displaying your problem, so instead of trying to debug it, I came up with a different way of splitting the input. Given that my approach holds the whole input in memory, it might not be suitable for larger files. I might work out a version with Streams later.

Base assumption: You want to split your input text into portions, each portion starting with a line that starts with "FAYSAL:".
The idea is similar to your approach, but it is not based on Spliterators and it doesn't use dropWhile either. Instead, it finds the first line starting with "FAYSAL:" (I assumed that this was what isValidReportName did; the code for the method wasn't in the question) and takes everything up to, but not including, the next portion start. The line found first becomes the first element of the portion, and the portion is then added to a result list that can be used later. The number of lines collected is then removed from the front of the original list.

Full code:
(Note: I used a static string here instead of file reading to make a full code example; you can adapt your code accordingly)
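A minimal sketch of the approach described above, not a definitive implementation. It assumes that isValidReportName only checks whether a line starts with "FAYSAL:"; the class name PortionSplitter and the helper names isPortionStart and splitIntoPortions are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class PortionSplitter {

    // Assumption: a portion starts at any line beginning with "FAYSAL:".
    static boolean isPortionStart(String line) {
        return line.startsWith("FAYSAL:");
    }

    static List<List<String>> splitIntoPortions(List<String> input) {
        List<String> remaining = new ArrayList<>(input); // work on a mutable copy
        List<List<String>> portions = new ArrayList<>();

        // Discard anything that comes before the first portion start.
        while (!remaining.isEmpty() && !isPortionStart(remaining.get(0))) {
            remaining.remove(0);
        }

        // Invariant: the first remaining line is always a portion start.
        while (!remaining.isEmpty()) {
            List<String> portion = new ArrayList<>();
            portion.add(remaining.get(0)); // the header line goes first
            // takeWhile runs on a fresh stream over the list, so the next
            // header is only inspected here, it is not removed from the list.
            remaining.stream()
                    .skip(1)
                    .takeWhile(line -> !isPortionStart(line))
                    .forEach(portion::add);
            portions.add(portion);
            // Remove exactly the lines that were collected for this portion.
            remaining.subList(0, portion.size()).clear();
        }
        return portions;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "FAYSAL: 1", "Some text here", "Some text here",
                "FAYSAL: 2", "Some text here", "Some text here",
                "FAYSAL: 3", "Some text here", "Some text here",
                "FAYSAL: 4", "Some text here", "Some text here");
        // Each portion is printed as one list, header first.
        splitIntoPortions(lines).forEach(System.out::println);
    }
}
```

The design choice that matters here is removing lines by count instead of continuing from a shared spliterator. As far as I can tell, takeWhile in the question's code has to pull the next "FAYSAL:" line out of the shared spliterator in order to test it, so that line is already gone when the next iteration starts, which would explain why every second portion disappears.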
EDIT: After some research I found that grouping the elements of a stream is surprisingly easy with a library called StreamEx (available on GitHub and Maven). In this answer I found a note about the StreamEx#groupRuns function, which does exactly that.
To see it working, you can add a call to the main function and the corresponding method somewhere in the Main class of the above full code example.
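To illustrate what grouping adjacent elements into runs means, here is a JDK-only sketch that needs no extra dependency. The groupRuns helper below is a hypothetical stand-in written for this example, not StreamEx's implementation, and the StreamEx call shown in the comment is my assumption of how the equivalent call would look.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.stream.Stream;

class RunGrouping {

    // Hypothetical stand-in for the idea behind StreamEx#groupRuns: adjacent
    // elements stay in the same group while sameGroup(previous, next) is true.
    static <T> List<List<T>> groupRuns(Stream<T> stream, BiPredicate<T, T> sameGroup) {
        List<List<T>> groups = new ArrayList<>();
        List<T> current = null;
        T previous = null;
        Iterator<T> it = stream.iterator();
        while (it.hasNext()) {
            T next = it.next();
            // Open a new group for the first element or when the run is broken.
            if (current == null || !sameGroup.test(previous, next)) {
                current = new ArrayList<>();
                groups.add(current);
            }
            current.add(next);
            previous = next;
        }
        return groups;
    }

    public static void main(String[] args) {
        Stream<String> lines = Stream.of(
                "FAYSAL: 1", "Some text here", "Some text here",
                "FAYSAL: 2", "Some text here", "Some text here");
        // Assumption: with StreamEx on the classpath, the call would be roughly
        // StreamEx.of(lines).groupRuns((prev, next) -> !next.startsWith("FAYSAL:")).toList();
        groupRuns(lines, (prev, next) -> !next.startsWith("FAYSAL:"))
                .forEach(System.out::println);
    }
}
```

A line belongs to the current run as long as it does not start a new portion, so every "FAYSAL:" line opens a new group. Note that any lines before the first "FAYSAL:" line end up in a group of their own with this predicate, so they would have to be filtered out separately if they are unwanted.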