How can we detect all pointer comparisons in source code? c++

We want to find all pointer comparisons involving a class type. For example, we have a class A and classes derived from A, such as B, C, etc.

A *pa;
A *pa2;
B *pb;

All comparisons like if (pa == pa2) or if (pa != pb) must be found in our source code.

I know that we could use a Clang-based analyzer to find those comparisons, but our source code is not Clang compatible. We are using Visual Studio 2015.

Please don't suggest something like: remove class A from the source code, try to compile, and find every usage of class A from the places where compilation fails.

Does anybody have a solution for finding them? A tool like Cppcheck (which checks for possible errors) or a Visual Studio extension?

Edit:

Does anybody know how I can find all comparisons in my code with CppDepend/CQLinq syntax? That could also help me. CppDepend uses Clang, but it continues parsing even when it hits parsing errors.

There are 3 best solutions below

0
On BEST ANSWER

My solution (as @M.M suggested) is to wrap the pointers in a template wrapper class that implements operator overloads such as -> and * (so there are fewer compile problems) and deletes comparison operators such as == and != (so the comparisons show up as compile errors). Replacing all pointers can be done with a regular expression (A * with A_Wrapper).

I also found that using a pointer as a key in a map amounts to a pointer comparison. If you use pointers in a map, you should also delete operator< in your wrapper class.

Of course, I got compile errors, but they were not difficult to resolve. And this seems to be a reliable solution.
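A minimal sketch of such a wrapper, for anyone who wants a starting point. The names NoComparePtr and is_eq_comparable and the stand-in classes A and B are hypothetical, not taken from my real code, and the static_asserts are only there to show that the comparison really is rejected at compile time:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the question's class A and derived class B.
struct A { int value = 0; };
struct B : A {};

// Hypothetical wrapper: usable like T* for access, but not comparable.
template <typename T>
class NoComparePtr {
public:
    NoComparePtr(T* p = nullptr) : p_(p) {}

    // Mirror the implicit B* -> A* conversion of raw pointers.
    template <typename U, typename = typename std::enable_if<
                              std::is_convertible<U*, T*>::value>::type>
    NoComparePtr(const NoComparePtr<U>& other) : p_(other.get()) {}

    T& operator*() const { assert(p_ != nullptr); return *p_; }
    T* operator->() const { return p_; }
    T* get() const { return p_; }

private:
    T* p_;
};

// Every comparison between wrapped pointers becomes a compile error.
// operator< is deleted too, so use as a std::map key is flagged as well.
template <typename T, typename U> bool operator==(const NoComparePtr<T>&, const NoComparePtr<U>&) = delete;
template <typename T, typename U> bool operator!=(const NoComparePtr<T>&, const NoComparePtr<U>&) = delete;
template <typename T, typename U> bool operator<(const NoComparePtr<T>&, const NoComparePtr<U>&) = delete;
template <typename T, typename U> bool operator<=(const NoComparePtr<T>&, const NoComparePtr<U>&) = delete;
template <typename T, typename U> bool operator>(const NoComparePtr<T>&, const NoComparePtr<U>&) = delete;
template <typename T, typename U> bool operator>=(const NoComparePtr<T>&, const NoComparePtr<U>&) = delete;

// The regular-expression replacement would turn "A *" into this alias.
using A_Wrapper = NoComparePtr<A>;

// Detection helper (illustration only): true if L == R compiles.
template <typename L, typename R, typename = void>
struct is_eq_comparable : std::false_type {};
template <typename L, typename R>
struct is_eq_comparable<L, R, decltype(void(std::declval<L>() == std::declval<R>()))>
    : std::true_type {};

static_assert(is_eq_comparable<A*, B*>::value, "raw pointers compare");
static_assert(!is_eq_comparable<NoComparePtr<A>, NoComparePtr<B>>::value,
              "wrapped pointers do not");
```

With this in place, a line such as if (pa == pa2) stops compiling once pa and pa2 are A_Wrapper, and the compiler's error list becomes the list of comparisons. Comparisons against nullptr or a raw pointer are rejected as well, which may or may not be what you want.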

I hope that helps somebody.

5
On

I'm guessing I'm missing the question. So you just want to find all instances of any pointer pointing to type A, B, C, etc where the pointer is used in a conditional expression by comparison...

So you know all the type names then. That means there are a finite number of types and a finite number of comparisons e.g. == != <= >= < > Right?

So for every instance of a pointer created by all types, build a table. That gives you the coded name of every pointer you are looking for.

fred *myfred, *yourfred, *thefred;

account *primaryacct, *secondacct; ... and so on...

Your table would be:
myfred
yourfred
thefred
primaryacct
secondacct

Now for every instance of each one, starting with the first, 'myfred', find myfred followed by ==, then by !=, and so on (absorbing any spaces). When you find the left side of a comparison, e.g.

secondacct<=

get the right side and compare it to every pointer name in the table you built. When you have a match like myfred!=primaryacct you can do what you like with it. Say, for argument's sake, you wanted to perform a global search and replace for a given comparison or a list of comparisons: you could do this on the go by opening an additional output file and, as you read in and find each occurrence, writing the modified line to the new source file.

Basically just find every comparison, look on each side of it and see if both sides have a coded name that is in your table. Essentially you are just parsing the code using the same table of identifiers on either side of a finite combination of comparison strings - again they are: [ == != <= >= < > ]

I don't know of a software tool that does this, but you could just code this pretty quick. It would serve only this one purpose of course, but it would get the job done fast.

I'm assuming of course that your source code is in text form and that you can fopen the file(s) and read it in to do this. If so, then you could have the result however you like, for instance a list for each file and line the occurrence is found for each expressed comparison.

To grab a whole line of code when reading in -

In C - just use fgets

In C++ - use getline

Then just parse that buffer you read in with the logic described above.
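Here is a rough sketch of that logic in C++, assuming the table of pointer names has already been built. The names Hit and findComparisons are made up for illustration. It only sees comparisons where both sides are plain identifiers on one line, so comments, string literals, casts, function results and multi-line expressions are not handled:

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// One reported comparison: source line plus both sides and the operator.
struct Hit {
    int line;
    std::string lhs, op, rhs;
};

// Scan the source text line by line and keep every "name OP name" where
// both names are in the table of known pointer variables.
std::vector<Hit> findComparisons(const std::string& source,
                                 const std::set<std::string>& table) {
    // identifier, optional spaces, comparison operator, optional spaces, identifier
    static const std::regex cmp(
        R"(\b([A-Za-z_]\w*)\s*(==|!=|<=|>=|<|>)\s*([A-Za-z_]\w*)\b)");

    std::vector<Hit> hits;
    std::istringstream in(source);
    std::string text;
    for (int lineNo = 1; std::getline(in, text); ++lineNo) {
        for (std::sregex_iterator it(text.begin(), text.end(), cmp), end;
             it != end; ++it) {
            const std::smatch& m = *it;
            assert(m.size() == 4);  // whole match plus three capture groups
            if (table.count(m[1].str()) && table.count(m[3].str())) {
                hits.push_back({lineNo, m[1].str(), m[2].str(), m[3].str()});
            }
        }
    }
    return hits;
}
```

Feeding it the contents of each source file together with the table gives you a list of line numbers and comparisons per file, which you can then print or use to drive a rewrite.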

-------------- EDITED --------------- regarding comments below

@YusufRamazanKaragöz - OK - I apologize for the over-generalization. Could you provide a code sample that covers a few of those cases, for example a comparison that spans multiple lines? I based my reasoning only on what you wrote, "All comparisons like if (pa == pa2) or if (pa != pb) must be found in our source code", so I didn't expand into function returns, etc.

As for building the table: you know the types, correct? Then every line that declares a variable of those types feeds the table. For instance, if I wanted a table of every char variable in every line of code for all files of the program, I would search all lines for the word char. After that word I would collect comma-separated names until a semicolon (the declaration can continue on the next line, so use fgetc instead of fgets). Some of those declarations would be plain, some might be char *, some char[], etc. I would then have a list of every variable of type char. When you search for the type name you are talking about, can't you see the line it occurs on and everything declared after it? If you can, then you can build the index table. Or is there some reason this can't be done that I'm not seeing?

Finding cast values requires another set of parsing rules altogether and further complicates the task, as do template-to-object comparisons. I didn't truly comprehend your dilemma from the original question until now. I truly want to help, but a block of code that covers every parsing case would help me determine whether I can. It would also help my thinking to know why you want to do this at all. Do you want to change something globally? I will of course defer to your decision and stop trying if you think the effort is in vain. Thank you for your time and patience, and I hope you find a solution.

4
On

Our DMS Software Reengineering Toolkit with its C++14 front end could be used to do this.

DMS is general-purpose program analysis and transformation machinery that can be customized to achieve a desired effect on a programming language provided to it as a plug-in module. Its C++14 front end configurably handles pure ANSI, GCC/Clang-style syntax, or Visual Studio syntax. It includes a complete preprocessor.

To accomplish OP's purpose, one would configure DMS to:

  1. parse the compilation units, which produces an AST.
  2. for each compilation unit, perform name and type resolution. This builds symbol tables containing type information, and provides a basis for computing the types of arbitrary expressions. This capability is built into DMS's C++ front end.
  3. crawl the AST, looking for operators == and !=
  4. ask DMS to compute the type of the right and left hand side subexpressions
  5. Verify that the type is the targeted class, or one that inherits from it. (Presumably the targeted class is identified as being defined at a certain source file/line position; this can be found by searching the symbol table. Checking whether a type is derived from another is simply a matter of recursively searching the possibly multiple parent links recorded in the symbol table to see whether a parent is the desired target type.)
  6. Report the file name, source line and column of the operator.

Each of the above steps is supported pretty directly by the machinery/APIs provided by DMS and the C++14 front end. This probably takes a couple of pages of custom code added to DMS to achieve the effect.
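To make steps 3 to 6 concrete, here is a toy model in plain C++. It is not DMS code and uses no DMS API: Node, BaseTable, isOrDerivesFrom and collect are hypothetical names, the "AST" already carries the pointee type of each operand (which in practice is what steps 1, 2 and 4 would compute), and reporting a comparison when at least one operand matches is an assumption about what OP wants.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy symbol table: class name -> names of its direct base classes.
using BaseTable = std::map<std::string, std::vector<std::string>>;

// Toy AST node. For operand leaves, 'pointee' names the class the
// pointer points to; for other nodes it is empty.
struct Node {
    std::string op;
    std::string pointee;
    int line;
    int column;
    std::vector<Node> children;
};

// Step 5: recursive search over the (possibly multiple) parent links.
bool isOrDerivesFrom(const BaseTable& bases, const std::string& type,
                     const std::string& target) {
    assert(!target.empty());
    if (type == target) return true;
    auto it = bases.find(type);
    if (it == bases.end()) return false;
    for (const auto& base : it->second) {
        if (isOrDerivesFrom(bases, base, target)) return true;
    }
    return false;
}

// Steps 3, 4 and 6: crawl the tree, test == and != nodes, and record
// the (line, column) of each operator with a matching operand.
void collect(const Node& n, const BaseTable& bases, const std::string& target,
             std::vector<std::pair<int, int>>& out) {
    if ((n.op == "==" || n.op == "!=") && n.children.size() == 2 &&
        (isOrDerivesFrom(bases, n.children[0].pointee, target) ||
         isOrDerivesFrom(bases, n.children[1].pointee, target))) {
        out.emplace_back(n.line, n.column);
    }
    for (const auto& child : n.children) collect(child, bases, target, out);
}
```

The real work is in producing a correctly typed tree from Visual Studio-dialect sources, which is the part the front end provides; the crawl itself is as small as the sketch suggests.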