I'm reading a file using something like:
std::ifstream is( file_name );
std::string s;
if( !is.good() ) {
std::cout << "error opening file: " << file_name << std::endl;
} else {
while( !is.eof() ) {
s.clear();
is >> s;
if (s.empty()) continue;
if (s.size() < 1 || s.size()>0x7FFFFFFF ) {
std::cout << "implausible data" << std::endl;
continue;
}
char *ss = new char[ s.size() + 1 ]; // COVERITY bails out
// do something with the data
delete[]ss;
}
}
When I analyse the above code with the static code analysis tool coverity (free version), the line marked with COVERITY bails out throws an error:
Untrusted value as argument (TAINTED_SCALAR)
tainted_data: Passing tainted variable > s.size() + 1UL to a tainted sink.
I understand that I must not trust any data read from a file, but I fail to see how to validate the data at this stage.
I'm already checking that s.size()
is within a plausible (albeit rather large) range in the if
-clause above the erroneous line.
So why is coverity throwing a warning at me?
Also, which other strategies for input-validation should I apply?
In the following section
the validation logic relies on the important fact that
s.size()
will return the same value every time it is called. While in this case, we (humans) know this will be true, a static code analyzer might fail at realizing that.As a workaround, try introducing a local variable and use that one.
Here, it is simple for the analyzer to tell that
length
will not change its value.Whether or not such coding around the limitations of static analyzer tools is worthwhile is open to debate. The GNU Coding Standards discourage it.
Personally, I don't feel too bad about it as long as the readability of the code doesn't suffer too much. In corner cases, adding a comment that explains why things are done the way they are done might be a good idea.