Creating a list of global variables from C++ source file


I'm working on the following problem: generate a text file listing all global variables declared in a .cpp file.

I came up with several ideas. The first one:

Use ctags, for which I wrote a short script:

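# Read one source file name per line from cpp_source_file_list.txt
while read -r line
do
    echo "$line"
    # Start the report line with the file name
    printf "%s" "$line" >> report.txt
    # Append the names of all non-const variable tags found by ctags
    ctags -x --c++-kinds=v --file-scope=no "${line}" | sort | sed "/const/d" | awk '{printf " %s", $1}' >> report.txt
    printf "\n" >> report.txt
done < cpp_source_file_list.txt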

This script reads the name of each .cpp source file from cpp_source_file_list.txt, scans the file for global variables (ignoring const ones) and writes a report line of the form "filename [list of variables]". The main problem I've encountered is that ctags behaves strangely and in some cases ignores variables of STL types.

E.g. it can exclude a line like "vector v;", but include "std::vector v;".

Is there any way to fix this? I tried passing ctags the additional option -I ./id.txt with a manually written list of identifiers to override, but that also gives incorrect results.

The second way:

Use the nm command, like:

nm builtsource.o | grep '[0-9A-Fa-f]* [BCDGRS]'

But in this case I receive unnecessary information, like:

0000000000603528 B M 
0000000000603548 B N 
0000000000603578 B _ZSt3cin@@GLIBCXX_3.4 <- (!)
0000000000603579 B _ZSt4cout@@GLIBCXX_3.4 <- (!)
0000000000603748 B t 

Now I have no idea how to improve either of these methods so that I get a correct list of the global variables declared in an arbitrary .cpp source file. I would be glad to hear any suggestions on this problem.
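For the nm route, the entries marked (!) carry a symbol-version suffix (the part starting with @@), which marks them as coming from a versioned shared library rather than from the source file itself. A minimal sketch of a filter built on that observation follows. The function name filter_nm_globals is hypothetical, and the sketch assumes plain (non-demangled) nm output with three whitespace-separated fields per line; it keeps the same symbol types as the grep above.

```shell
# Hypothetical helper: reads `nm` output on stdin and prints the names of
# data symbols (types B, C, D, G, R, S), skipping any symbol whose name
# carries a version suffix such as @@GLIBCXX_3.4.
# Assumption: each line has exactly three fields: address, type, name.
filter_nm_globals() {
    awk 'NF == 3 && $2 ~ /^[BCDGRS]$/ && $3 !~ /@/ { print $3 }'
}

# Possible usage (file name taken from the question):
#   nm builtsource.o | filter_nm_globals
```

The names are still mangled for anything declared inside a namespace or class, so this only narrows the list; it does not distinguish symbols defined in this translation unit from ones pulled in by other object files.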


There are 3 best solutions below


Another possibility would be to develop a GCC plugin or a MELT extension for that precise purpose. You'll need to understand some of the details of GCC internal representations (Gimple and Tree).

The advantage of customizing GCC (with a plugin in C or an extension in MELT) is that you work on the exact compiler internals (after preprocessing and parsing). However, this will take you some effort.


You might consider using GCC-XML, probably with something else on top (like pygccxml) to make things easier to navigate. I've successfully used this combination for similar code extraction purposes.


You may be able to leverage Doxygen to implement this. Doxygen can parse a C++ file and generate an XML file that captures all of the variables encountered in the file. Specifically, if you set the following configuration options:


Given an input file like:

#include <vector>

using namespace std;

std::vector<int> s1;
vector<int> s2;

You can generate an output doxygen.tag file with the following content:

<?xml version='1.0' encoding='ISO-8859-1' standalone='yes' ?>
  <compound kind="file">
    <member kind="variable">
      <type>std::vector&lt; int &gt;</type>
    <member kind="variable">
  <compound kind="namespace">

Once you have the XML file, you should be able to extract out the information you're looking for.
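As a rough sketch of that extraction step, the awk filter below prints the name and type of every variable member listed under a file or namespace compound. It assumes the tag file has one XML element per line, with each member carrying a name element alongside the type element (the listing above is abbreviated and does not show it). The function name list_tagfile_variables is hypothetical, and a real XML parser would be more robust than this line-oriented approach.

```shell
# Hypothetical helper: reads a Doxygen tag file on stdin and prints
# "name<TAB>type" for each variable member of a file or namespace compound.
# Assumption: one XML element per line.
list_tagfile_variables() {
    awk '
        # Track whether the current compound is a file or a namespace,
        # so that class member variables are not reported.
        /<compound kind="/ { scope_ok = ($0 ~ /kind="(file|namespace)"/) }
        scope_ok && /<member kind="variable"/ { inside = 1; name = ""; type = ""; next }
        inside && /<type>/ {
            type = $0
            sub(/^.*<type>/, "", type); sub(/<\/type>.*$/, "", type)
            gsub(/&lt;/, "<", type); gsub(/&gt;/, ">", type)
        }
        inside && /<name>/ {
            name = $0
            sub(/^.*<name>/, "", name); sub(/<\/name>.*$/, "", name)
        }
        inside && /<\/member>/ {
            if (name != "") printf "%s\t%s\n", name, type
            inside = 0
        }
    '
}

# Possible usage (tag file name taken from the answer):
#   list_tagfile_variables < doxygen.tag
```

Printing the type next to the name makes it easy to drop const variables afterwards, for example by piping the result through grep -v const.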