Get origin clang::ento::MemRegion from SVal of type clang::ento::nonloc::ConcreteInt


I would like to extract the relationships between the variables in a given C source file. More precisely, I would like to know which dependencies between the individual variables are created by assignments. This should be path-sensitive, which is why I use the Clang Static Analyzer. An approach based on the Clang AST alone (some kind of ASTMatcher or RecursiveASTVisitor), without the functionality of the static analyzer, is therefore not sufficient.

For this purpose, I have implemented a checker that uses various callbacks such as clang::ento::check::Bind (the checkBind method). From the two SVal arguments provided to the method, I then extract the respective MemRegion and its name, which corresponds to the name of the variable.

extern int resA;
extern int resB;
extern int resC;
extern int x;

int main(){

    resA = x;
    resB = resA + resC;

    return 0;
}

For the example above, the checkBind method is called for each of the two assignments. With each call, the two SVal arguments are evaluated in order to obtain the MemRegions.

void checkBind(clang::ento::SVal Loc, clang::ento::SVal Val, const clang::Stmt *S, clang::ento::CheckerContext &CC) const {
    // Extract MemRegion which is represented by Loc and Val
}

If Val is an instance of nonloc::SymbolVal, I iterate over the SymExprs it contains using Val.symbols() and call SymExpr::getOriginRegion() on each of them to obtain the corresponding MemRegion. In this way, the desired relationships can be extracted from the C code, as shown schematically below.

Loc = 'resA' (loc::MemRegionVal)
Val = 'x' (nonloc::SymbolVal) -> iter SymExprs
    Val = 'x' (OriginRegion)

MemReg = 'resB'
Loc = 'resB' (loc::MemRegionVal) -> iter SymExprs
    Val = 'resC' (OriginRegion)
    Val = 'x' (OriginRegion)
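
The traversal described above can be pictured with a small stand-alone model. This is only a sketch: `Sym`, `leaf`, `add`, `collectOrigins` and `originsOf` are hypothetical stand-ins written for illustration, not Clang classes or functions. The point is that a symbolic value keeps a link to the variable it was read from, so walking its leaves recovers every origin.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a symbolic expression; NOT a Clang class.
struct Sym {
    std::string origin;             // non-empty for a leaf read from a variable
    std::shared_ptr<Sym> lhs, rhs;  // set only for a binary expression
};

// A leaf symbol that remembers the variable it was loaded from.
std::shared_ptr<Sym> leaf(const std::string &var) {
    return std::make_shared<Sym>(Sym{var, nullptr, nullptr});
}

// A composite symbol such as `a + b`; it has no origin of its own.
std::shared_ptr<Sym> add(std::shared_ptr<Sym> a, std::shared_ptr<Sym> b) {
    return std::make_shared<Sym>(Sym{"", std::move(a), std::move(b)});
}

// Visit every sub-symbol and collect the origins of the leaves, analogous
// to iterating Val.symbols() and asking each symbol for its origin region.
void collectOrigins(const std::shared_ptr<Sym> &s, std::set<std::string> &out) {
    if (!s) return;
    if (!s->origin.empty()) out.insert(s->origin);
    collectOrigins(s->lhs, out);
    collectOrigins(s->rhs, out);
}

std::set<std::string> originsOf(const std::shared_ptr<Sym> &s) {
    std::set<std::string> out;
    collectOrigins(s, out);
    return out;
}
```

In this model, binding `leaf("x")` to resA and `add(resA, leaf("resC"))` to resB yields the origin sets {x} and {resC, x}, which matches the schematic output above.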

However, this changes as soon as variables are given initial values. In that case, the SVal of the RHS of the first assignment is a nonloc::ConcreteInt, and this concrete value is then used in all subsequent bindings instead of a symbol.

Same example but providing an initial value to x:

extern int resA;
extern int resB;
extern int resC;
int x = 1;

int main(){

    resA = x;
    resB = resA + resC;

    return 0;
}

Leads to:

Loc = 'resA' (loc::MemRegionVal)
Val = 1 S32b (nonloc::ConcreteInt)

MemReg = 'resB'
Loc = 'resB' (loc::MemRegionVal) -> iter SymExprs
    Val = (reg_$2<int resC>) + 1
    Val = 'resC' (OriginRegion)

I do want to keep this behaviour, so that the path-sensitive analysis itself is not affected. However, I would like to know which MemRegion this ConcreteInt originates from, so that I get the same output as for the first code example.

I don't want to prevent the tracking of specific values or even discard the initial values. I would like to have a way to analyze the dependencies of the variables or the MemRegions among each other in parallel to the analysis performed.
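
The kind of parallel bookkeeping meant here can be sketched independently of Clang. The following is a minimal model under stated assumptions: `DependencyTable` and `recordAssignment` are hypothetical names, variables are identified by plain strings instead of MemRegions, and the list of variables read on the right-hand side is simply passed in. How a checker would obtain that list when the value has already been folded to a ConcreteInt is exactly the open question and is not solved by this sketch.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Origins = std::set<std::string>;

// Hypothetical side table (not a Clang API): maps each written variable to
// the variables its current value stems from. It is keyed on names only, so
// it is unaffected by whether the analyzer sees a symbol or a concrete value.
struct DependencyTable {
    std::map<std::string, Origins> deps;

    // Record `dst = f(srcs...)`. A source without recorded origins counts as
    // a root and contributes itself; otherwise its origins are inherited.
    void recordAssignment(const std::string &dst,
                          const std::vector<std::string> &srcs) {
        Origins result;
        for (const std::string &src : srcs) {
            auto it = deps.find(src);
            if (it == deps.end() || it->second.empty())
                result.insert(src);
            else
                result.insert(it->second.begin(), it->second.end());
        }
        deps[dst] = std::move(result);  // overwrite: a new binding replaces the old one
    }
};
```

Recording `resA = x` and then `resB = resA + resC` in this table gives {x} for resA and {resC, x} for resB, regardless of whether x has an initial value. To stay path-sensitive, such a table would presumably have to be kept per analysis path rather than in a single global object.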

Now to my questions:

Is there a way to find out which MemRegion the value of a ConcreteInt originates from? Or is there a completely different way to get the desired information?

I would be very happy about any hints!
