Creating a list of global variables from a C++ source file


I'm working on the following problem: generate a text file listing all global variables declared in a .cpp file.

I came up with several ideas. The first one:

Use ctags. I wrote a short script:

while IFS= read -r line
do
    echo "$line"
    printf "%s" "$line" >> report.txt
    # List variable tags, drop lines mentioning const, append the names to the report line
    ctags -x --c++-kinds=v --file-scope=no "${line}" | sort | sed "/const/d" | awk '{printf " %s", $1}' >> report.txt
    printf "\n" >> report.txt
done < cpp_source_file_list.txt

This code reads the name of each .cpp source file from cpp_source_file_list.txt, scans it for global variables (ignoring const ones), and writes a report line of the form "filename [list of variables]". The main problem I've encountered is that ctags behaves strangely: in some cases it ignores variables declared with STL types.

For example, it can skip a line like "vector v;" but include "std::vector v;".

Is there any way to fix this? I tried the additional ctags option -I ./id.txt with a manually written list of identifiers to override, but that also produces incorrect results.

The second way:

Use nm command, like:

nm builtsource.o | grep '[0-9A-Fa-f]* [BCDGRS]'

But in this case I receive unnecessary information, such as:

0000000000603528 B M 
0000000000603548 B N 
0000000000603578 B _ZSt3cin@@GLIBCXX_3.4 <- (!)
0000000000603579 B _ZSt4cout@@GLIBCXX_3.4 <- (!)
0000000000603748 B t 

Now I have no idea how to improve either of these methods so that I get a correct list of the global variables declared in an arbitrary .cpp source file. I would be glad to hear any suggestions on this problem.
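
As one possible refinement of the nm approach, the listing can be post-processed to drop symbols that do not look user-defined. The following is a minimal Python sketch, not a definitive solution. It assumes the three-column "address type name" layout shown above, treats any name with a version suffix (the "@@GLIBCXX_3.4" part) as coming from a shared library, and, as a further heuristic assumption, discards names that start with an underscore. The function name user_globals is hypothetical.

```python
# Symbol type letters for global data, taken from the grep pattern above.
DATA_TYPES = set("BCDGRS")


def user_globals(nm_output):
    """Return names of data symbols that look user-defined (heuristic)."""
    names = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Expect "address type name"; skip undefined symbols and odd lines.
        if len(parts) < 3 or parts[1] not in DATA_TYPES:
            continue
        name = parts[2]
        # Versioned symbols (name@@VERSION) are provided by shared libraries.
        if "@" in name:
            continue
        # Assumption: leading-underscore names belong to the toolchain/runtime.
        if name.startswith("_"):
            continue
        names.append(name)
    return names


sample = """\
0000000000603528 B M
0000000000603548 B N
0000000000603578 B _ZSt3cin@@GLIBCXX_3.4
0000000000603579 B _ZSt4cout@@GLIBCXX_3.4
0000000000603748 B t
"""
print(user_globals(sample))
```

The underscore rule is deliberately crude: variables declared inside a namespace get mangled names starting with _Z, so this sketch would drop them as well. It also cannot tell which source file a symbol came from once several objects are linked together.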

There are 3 best solutions below


Another possibility would be to develop a GCC plugin or a MELT extension for that precise purpose. You'll need to understand some of the details of GCC internal representations (Gimple and Tree).

The advantage of customizing GCC (with a plugin in C or an extension in MELT) is that you work on the exact compiler internals (after preprocessing and parsing). However, this will take you some effort.


You might consider using GCC-XML, probably with something else on top (like pygccxml) to make things easier to navigate. I've successfully used this combination for similar code extraction purposes.


You may be able to leverage Doxygen to implement this. Doxygen can parse a C++ file and generate an XML file that captures all of the variables encountered in the file. Specifically, set the following configuration options:

EXTRACT_ALL      = YES
GENERATE_TAGFILE = doxygen.tag

Given an input file like:

#include <vector>

using namespace std;

std::vector<int> s1;
vector s2;

You can generate an output doxygen.tag file with the following content:

<?xml version='1.0' encoding='ISO-8859-1' standalone='yes' ?>
<tagfile>
  <compound kind="file">
    <name>input.cpp</name>
    <path>C:/Users/haney/tmp/tmp55/</path>
    <filename>input_8cpp</filename>
    <namespace>std</namespace>
    <member kind="variable">
      <type>std::vector&lt; int &gt;</type>
      <name>s1</name>
      <anchorfile>input_8cpp.html</anchorfile>
      <anchor>93b3bd32f5b6bff31bc4052716ddd444</anchor>
      <arglist></arglist>
    </member>
    <member kind="variable">
      <type>vector</type>
      <name>s2</name>
      <anchorfile>input_8cpp.html</anchorfile>
      <anchor>8feb4a508135e43a72f227568b755a07</anchor>
      <arglist></arglist>
    </member>
  </compound>
  <compound kind="namespace">
    <name>std</name>
    <filename>namespacestd.html</filename>
  </compound>
</tagfile>

Once you have the XML file, you should be able to extract the information you're looking for.
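
As a rough illustration of that last step, here is a minimal Python sketch that pulls the variable names out of a tag file using the standard xml.etree.ElementTree module. It assumes the tag file has the structure shown above (file compounds containing member elements with kind="variable"). The function name file_variables and the trimmed inline sample are only for illustration; for a real tag file you would load the root with ET.parse("doxygen.tag").getroot() instead.

```python
import xml.etree.ElementTree as ET


def file_variables(root):
    """Map each file compound's name to the names of its variable members."""
    result = {}
    for compound in root.findall("compound"):
        if compound.get("kind") != "file":
            continue
        names = [
            member.findtext("name")
            for member in compound.findall("member")
            if member.get("kind") == "variable"
        ]
        result[compound.findtext("name")] = names
    return result


# Trimmed copy of the tag file above (XML declaration and anchors omitted).
sample = """\
<tagfile>
  <compound kind="file">
    <name>input.cpp</name>
    <member kind="variable">
      <type>std::vector&lt; int &gt;</type>
      <name>s1</name>
    </member>
    <member kind="variable">
      <type>vector</type>
      <name>s2</name>
    </member>
  </compound>
  <compound kind="namespace">
    <name>std</name>
  </compound>
</tagfile>
"""

for filename, names in file_variables(ET.fromstring(sample)).items():
    # Same "filename [list of variables]" shape as the report in the question.
    print(filename, " ".join(names))
```

This only looks at file compounds. Variables declared inside a namespace may be listed under the corresponding namespace compound instead, so that case would need extra handling.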