Issue with fstream::tellg() and ::seekg()


EDIT

A solution exists: opening the file with std::ios::binary instead of in text mode (even though the file is pure text) solves the issue. tellg() now reports the correct position, so seekg() can be used to return to the same place.

END EDIT

A while back, we had a server issue. The primary server was fine, but the secondary had a slight hiccup. We want to know whether anything was logged differently during that time.

We know for a fact that there are other differences between the log files: they differ in length because each server kept logging at times when the other was down. As such, we cannot do a straight line-by-line comparison. Thousands of lines might be reported as different when they are merely offset by two extra lines that one server logged near the top and the other did not.

I am writing a program to compare the output text files based on their timestamps; that is, go to the area around the timestamp where the hiccup occurred and compare line-by-line only for that area. This is where the issue with tellg() and seekg() occurs.

First, this loop runs for both journal files:

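// Check loopBreak before reading so no extra line is consumed, and rely on
// getline()'s own result rather than good(), which would drop a final line
// that has no trailing newline.
while( loopBreak != 1 && getline(inputFile, inputLine) ){

    // Remember the first line that mentions the first entry time.
    if (firstOpen.empty()){

        if ( std::string::npos != inputLine.find("TimeOfFirstEntry") ){

            firstOpen = inputLine;
        }
    }

    // Then remember the first line that mentions the last entry time.
    if (!firstOpen.empty() && lastOpen.empty()){

        if ( std::string::npos != inputLine.find("TimeOfLastEntry") ){

            lastOpen = inputLine;
        }
    }

    if (!firstOpen.empty() && !lastOpen.empty()){

        if ( std::string::npos != inputLine.find(timeStamp) ){

            // Note: tellg() after getline() is the offset of the start of
            // the NEXT line, not of the line that was just read.
            if (0 == firstLineInBucket){

                firstLineInBucket = inputFile.tellg();
                lastLineInBucket = inputFile.tellg();
            }
            else{

                lastLineInBucket = inputFile.tellg();
            }
        }

        // Stop at the first entry header that no longer carries the timestamp.
        if ( (0 != firstLineInBucket) && (0 != lastLineInBucket) ){

            if ( (std::string::npos != inputLine.find("OccuranceTime") ) && (std::string::npos == inputLine.find(timeStamp)) ){

                loopBreak = 1;
            }
        }
    }
}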
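One detail of the loop above is worth spelling out: because tellg() is called after getline(), the saved offset is the start of the line that follows the match, not the matching line itself. If the intent is to start at the first matching line, a minimal sketch could record the position before each read. The helper name findBucket, its path parameter, and the choice to scan the whole file instead of stopping at an "OccuranceTime" line are assumptions made for illustration, not part of the original program. The file is opened in binary mode, in line with the edit at the top.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical helper (not from the original program).
// Returns {begin, end} where begin is the byte offset of the start of the
// first line containing `stamp` and end is the offset just past the last
// such line. Returns {-1, -1} if no line matches or the file cannot be read.
std::pair<std::streamoff, std::streamoff>
findBucket(const std::string& path, const std::string& stamp) {
    assert(!stamp.empty());  // an empty stamp would match every line
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::streamoff begin = -1;
    std::streamoff end = -1;
    std::string line;
    for (;;) {
        // Remember where this line starts *before* consuming it.
        const std::streamoff lineStart = in.tellg();
        if (!std::getline(in, line)) break;
        if (line.find(stamp) != std::string::npos) {
            if (begin == -1) begin = lineStart;
            // In binary mode every byte counts once, so the next line starts
            // after this line's characters plus its '\n'. If the last line
            // has no trailing '\n', this is one byte past the end of file.
            end = lineStart + static_cast<std::streamoff>(line.size()) + 1;
        }
    }
    return {begin, end};
}
```

Calling findBucket once per log file would give the two offsets that the comparison step needs, without a sentinel value of 0 for "not found yet".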

Later, this loop does the comparison:

if(inputPrimary.good() && inputSecondary.good() && output.good()){

    inputPrimary.seekg(Primary.getFirstBucket());
    inputSecondary.seekg(Secondary.getFirstBucket());

    std::string linePrimary;
    std::string lineSecondary;

    while(  getline(inputPrimary, linePrimary)
            && getline(inputSecondary, lineSecondary)
            && (inputPrimary.tellg() < Primary.getLastBucket())
            && (inputSecondary.tellg() < Secondary.getLastBucket()) ){

        if(linePrimary == lineSecondary){

            //Do nothing
        }
        else{

            output << linePrimary << " .:|:. " << lineSecondary << "\n";
        }
    }

    inputPrimary.close();
    inputSecondary.close();
    output.close();
}
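The comparison loop above reads a line from each file before it checks the end offsets, and it relies on tellg(), which returns -1 once the stream has hit end of file. A possible variant, sketched here under the assumption that both files are opened in binary mode and use '\n' line endings, checks the bounds first and tracks the position by counting bytes. The names Range and diffRanges are hypothetical; the " .:|:. " separator is taken from the original loop.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical type: a half-open byte range [begin, end) inside a file.
struct Range {
    std::streamoff begin;
    std::streamoff end;
};

// Hypothetical helper: compares the lines inside range `a` of file `pathA`
// with the lines inside range `b` of file `pathB`, pairwise and in order.
// Returns one "lineA .:|:. lineB" entry for every pair that differs.
std::vector<std::string> diffRanges(const std::string& pathA, Range a,
                                    const std::string& pathB, Range b) {
    assert(a.begin >= 0 && a.begin <= a.end);
    assert(b.begin >= 0 && b.begin <= b.end);

    std::ifstream inA(pathA, std::ios::in | std::ios::binary);
    std::ifstream inB(pathB, std::ios::in | std::ios::binary);
    inA.seekg(a.begin, std::ios::beg);
    inB.seekg(b.begin, std::ios::beg);

    std::vector<std::string> diffs;
    std::string lineA;
    std::string lineB;
    std::streamoff posA = a.begin;
    std::streamoff posB = b.begin;

    // Check the bounds before reading, so no line past either range is used.
    while (posA < a.end && posB < b.end
           && std::getline(inA, lineA) && std::getline(inB, lineB)) {
        if (lineA != lineB) {
            diffs.push_back(lineA + " .:|:. " + lineB);
        }
        // Binary mode: each line occupies its characters plus one '\n'.
        posA += static_cast<std::streamoff>(lineA.size()) + 1;
        posB += static_cast<std::streamoff>(lineB.size()) + 1;
    }
    return diffs;
}
```

Returning the differing pairs instead of writing them directly keeps the sketch easy to check; writing each entry to the output file followed by "\n" would reproduce the original output format.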

Here's where things get weird: when I ran this over a pair of files for a timestamp that is known good (that is, both files have the same content for that timestamp), content was written to the output file. Further investigation revealed that the line was reported as different because the two seeks landed in different places: in the secondary server's log file, seekg() put the position at the start of the line, but in the primary server's log file, seekg() put the position ~11 characters past the last line break.

What on earth could cause that?


There are 2 best solutions below


(and yes, relative seeks aren't expected to work on files opened in text mode) -Cubbi

So there's the answer: opening the file in std::ios::binary mode fixes the problem.
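As a rough illustration of the fix, the sketch below saves an offset with tellg() and later returns to it with seekg(), with the file opened in binary mode both times. The helper names offsetOfLine and lineAt are made up for this example. One side effect to be aware of: binary mode performs no newline translation, so a file with "\r\n" line endings leaves a '\r' at the end of each string that getline() returns; the sketch strips it.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: byte offset at which line `n` (zero-based) starts,
// or -1 if the stream fails before getting there.
std::streamoff offsetOfLine(const std::string& path, int n) {
    assert(n >= 0);
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::string line;
    for (int i = 0; i < n; ++i) {
        if (!std::getline(in, line)) return -1;
    }
    // In binary mode this is a plain byte count from the start of the file.
    return static_cast<std::streamoff>(in.tellg());
}

// Hypothetical helper: reads the line that starts at byte offset `pos`.
std::string lineAt(const std::string& path, std::streamoff pos) {
    assert(pos >= 0);
    std::ifstream in(path, std::ios::in | std::ios::binary);
    in.seekg(pos, std::ios::beg);
    std::string line;
    std::getline(in, line);
    // Binary mode keeps the '\r' of a "\r\n" line ending; drop it.
    if (!line.empty() && line.back() == '\r') line.pop_back();
    return line;
}
```

If the two log files could have different line endings, stripping the '\r' before comparing avoids reporting every line as different.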


Any good diff program will show you differing blocks. You can then load the result into a graphical comparison program, scan the results visually, ignore the blocks that you know are different, then scream when you hit the block you didn't expect.

For example, WinMerge (screenshot omitted; source: majorgeeks.com).