Guava CountingInputStream does not give the correct offset while iterating over a file line by line


I am trying to get the start offset of each line while reading a file from start to end. My goal is to build an index [line_number --> start_offset in file]. I am using the Guava library; the dependency and sample code are below.

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.0-jre</version>
</dependency>

try {
    ByteSource byteSource = Files.asByteSource(new File(filePath));
    CountingInputStream cStream = new CountingInputStream(byteSource.openStream());
    Iterator<String> lines = byteSource.asCharSource(defaultCharset()).lines().iterator();
    //lines.forEachRemaining((line) -> System.out.println(cStream.getCount()));
    while (lines.hasNext()) {
        String line = lines.next(); // advance the iterator so the loop terminates
        //System.out.println(line);
        System.out.println("OFFSET:" + cStream.getCount());
    }
} catch (IOException e) {
    e.printStackTrace();
}

Output:

OFFSET:0
OFFSET:0
OFFSET:0
OFFSET:0
.
.
.

The loop does print the content of each line when I uncomment System.out.println(line), but the offset reported for every line is 0. Why is the correct line-start offset not returned, and what is missing in the code snippet above? Do I need to skip each line manually with cStream.skip(line.length()) and then call cStream.getCount()? Is there a recommended way to achieve this?
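
A note on the likely cause, followed by a minimal sketch of one way around it. In the snippet, cStream wraps the stream returned by byteSource.openStream(), while asCharSource(...).lines() opens its own, separate stream on the same file. Nothing ever reads from cStream, so its count stays at 0. Calling cStream.skip(line.length()) would probably not fix this either, because line.length() counts chars rather than bytes and leaves out the line terminator. The sketch below sidesteps both problems by scanning the raw bytes and recording the position at which each line begins. It uses only the JDK, and the class name LineOffsetIndex and method name lineStartOffsets are made up for illustration. It assumes lines end in \n or \r\n and that the encoding is ASCII-compatible (for example UTF-8), so that a 0x0A byte always means a newline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Guava: builds a line-start offset index.
public class LineOffsetIndex {

    /** Returns the byte offset at which each line starts; element i belongs to line i + 1. */
    public static List<Long> lineStartOffsets(InputStream in) throws IOException {
        List<Long> offsets = new ArrayList<>();
        long position = 0;          // number of bytes consumed so far
        boolean atLineStart = true; // true when the next byte begins a new line
        int b;
        while ((b = in.read()) != -1) {
            if (atLineStart) {
                offsets.add(position);
                atLineStart = false;
            }
            position++;
            if (b == '\n') {        // assumption: \n or \r\n terminators only
                atLineStart = true;
            }
        }
        return offsets;
    }

    /** Convenience overload for in-memory data. */
    public static List<Long> lineStartOffsets(byte[] data) {
        try {
            return lineStartOffsets(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = "ab\ncd\r\n\nx".getBytes(StandardCharsets.UTF_8);
        System.out.println(lineStartOffsets(sample)); // prints [0, 3, 7, 8]
    }
}
```

For a real file, the stream passed in would typically be something like new BufferedInputStream(java.nio.file.Files.newInputStream(path)), so that the byte-at-a-time reads stay cheap. A trailing newline at the end of the file does not produce an extra entry, because an offset is only recorded once a line's first byte has actually been read.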
