I am using ANTLR4 to parse C header (.h) files. For now, I want to extract function signatures (function name, return type, and parameters) as well as function pointers.
In the future, I will try to expand this to structs, enums, typedefs, etc. as well.
I am doing this so that I can check compatibility between C header (.h) files.
The problem I am facing right now is that, if there are two declarations in the .h files like this:
int *fp(int,int);
int *(*function2())(int, int);
My code treats these two declarations as the same, even though one is a function pointer and the other is a function returning a pointer that is used as a function pointer. I want to store fp in one dictionary and function2() in another dictionary.
The code I am using right now is this:
from CListener import CListener
from CParser import CParser


class FileStructure(CListener):
    def __init__(self):
        self.function_signatures = {}
        self.function_pointer = {}

    def print_definitions(self):
        print("Function Signatures :", self.function_signatures)
        print("Function Pointers :", self.function_pointer)

    def extractIdentifier(self, directDeclarator):
        """
        This function extracts the identifier name from a directDeclarator context.
        If the directDeclarator context contains an Identifier, it returns its text.
        If the context is wrapped in parentheses, it recursively looks for the Identifier.
        """
        # Base case: If the directDeclarator contains an Identifier, return its text
        if directDeclarator.directDeclarator():
            return directDeclarator.directDeclarator().getText()
        # Recursive case: If the directDeclarator is wrapped in parentheses, recurse into it
        nestedDeclarator = directDeclarator.declarator()
        if nestedDeclarator and nestedDeclarator.directDeclarator():
            return self.extractIdentifier(nestedDeclarator.directDeclarator())
        # If no identifier is found, return an empty string
        return ""

    def getParameters(self, parameterTypeListCtx):
        """
        This function extracts parameters and their types from a parameterTypeList context.
        It iterates over each parameterDeclaration context within the parameterTypeList.
        Each parameter's type specifier and name are extracted and added to a list.
        """
        parameters = []
        parameterListCtx = parameterTypeListCtx.parameterList()
        if parameterListCtx:
            for parameterDeclarationCtx in parameterListCtx.parameterDeclaration():
                # Extract the type specifier(s) for the parameter
                type_specifiers = [
                    token.getText() for token in parameterDeclarationCtx.declarationSpecifiers().children
                    if not isinstance(token, CParser.TypeQualifierContext)
                ]
                param_type = ' '.join(type_specifiers)
                # Extract the parameter name, if it exists
                param_name = ''
                if parameterDeclarationCtx.declarator():
                    param_name = parameterDeclarationCtx.declarator().directDeclarator().getText()
                parameters.append((param_type, param_name))
        return parameters

    def enterDeclaration(self, ctx: CParser.DeclarationContext):
        if ctx.initDeclaratorList():
            for initDeclaration in ctx.initDeclaratorList().initDeclarator():
                declarator = initDeclaration.declarator()
                directDeclarator = declarator.directDeclarator()
                # Handle functions returning a pointer (to data, function, or array)
                if declarator.pointer():
                    # Function returning a pointer to a function
                    if directDeclarator and directDeclarator.parameterTypeList():
                        function_name = self.extractIdentifier(directDeclarator)
                        print("function", function_name)
                        self.function_pointer[function_name] = True
                    else:
                        # Function returning a pointer (to data or array)
                        print("else statement")
                        if directDeclarator and hasattr(directDeclarator, 'directDeclarator'):
                            # Check for an array type after the pointer
                            if any(hasattr(child, 'typeQualifierList') or hasattr(child, 'assignmentExpression') for child in directDeclarator.children):
                                # Function returning a pointer to an array
                                function_name = self.extractIdentifier(directDeclarator)
                                self.function_pointer[function_name] = True
                            else:
                                # Function returning a pointer to data (not a function or array)
                                function_name = self.extractIdentifier(directDeclarator)
                                print(function_name)
                                self.function_pointer[function_name] = True
                else:
                    # Handle normal function definitions
                    if directDeclarator and directDeclarator.parameterTypeList():
                        function_name = directDeclarator.directDeclarator().getText()
                        print(function_name)
                        return_type = ctx.declarationSpecifiers().getText()
                        parameter_list = self.getParameters(directDeclarator.parameterTypeList())
                        self.function_signatures[function_name] = (return_type, parameter_list)
However, the branch
# Function returning a pointer (to data or array)
print("else statement")
is never executed, and the pointers are treated as functions.
I am using the grammar that is provided here.
Functions like void function5(int arr[5]); are treated as normal functions, but functions like int (*function4())[5]; are not recognized at all.
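The declarations mentioned so far differ only in how the declarator nests, so one possible direction is to walk the declarator from the identifier outward and record each derivation (function, pointer, array) in order. The following is a minimal sketch, not a tested solution. It assumes a grammar in which declarator is an optional pointer followed by a directDeclarator, and directDeclarator is either an Identifier, a parenthesized declarator, or a nested directDeclarator followed by a '(' or '[' suffix, with the generated accessors Identifier(), declarator(), directDeclarator(), and pointer() named after those rules. The names walk_declarator, walk_direct, and classify are hypothetical.

```python
def walk_declarator(decl):
    """Return (name, ops) for a declarator context.

    ops lists the derivations from the identifier outward, e.g.
    ['function', 'pointer'] reads "function returning pointer".
    """
    name, ops = walk_direct(decl.directDeclarator())
    # A leading pointer applies after every suffix of the direct declarator.
    # Simplification: '**' is recorded as a single 'pointer' entry.
    if decl.pointer() is not None:
        ops.append("pointer")
    return name, ops


def walk_direct(dd):
    """Return (name, ops) for a directDeclarator context."""
    # Alternative assumed: Identifier
    if dd.Identifier() is not None:
        return dd.Identifier().getText(), []
    # Alternative assumed: '(' declarator ')'
    if dd.declarator() is not None:
        return walk_declarator(dd.declarator())
    # Alternatives assumed: directDeclarator '(' ... ')' or directDeclarator '[' ... ']'
    inner = dd.directDeclarator()
    if inner is None:
        return "", []  # alternative not covered by this sketch
    name, ops = walk_direct(inner)
    # The token right after the nested directDeclarator tells the suffix kind.
    # This also catches empty parentheses, where parameterTypeList() is None.
    suffix = dd.getChild(1).getText()
    ops.append("function" if suffix == "(" else "array")
    return name, ops


def classify(ops):
    """Classify by the innermost derivations around the identifier."""
    if ops[:1] == ["function"]:
        return "function"
    if ops[:2] == ["pointer", "function"]:
        return "function pointer"
    return "other"
```

Under the usual C reading of declarators, this walk would classify both int *fp(int,int); and int *(*function2())(int, int); as functions (the second one returning a pointer to a function), and int (*function4())[5]; as a function returning a pointer to an array. Only a declaration such as int (*cb)(int); would come out as a function pointer. The remaining entries in ops describe the return type, so they could be stored alongside the declaration specifiers.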
Is there any way I can store fp in the function_pointer dictionary and function2() in the function_signatures dictionary?
So far I am treating both of these the same way: I store all function pointers in the function_pointer dictionary and then do a string check against the other .h files, which is a naive checking mechanism. To improve this, I was thinking of changing the dictionaries, so that I can group the checks for functions returning pointers together with the checks for normal functions.
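If functions returning pointers end up stored in the same (return_type, parameter_list) shape as normal functions, the string check could be replaced by a structural comparison of the two dictionaries. A minimal sketch, assuming each header has already been reduced to a dictionary shaped like function_signatures; compare_signatures and the report keys are hypothetical names:

```python
def compare_signatures(old, new):
    """Compare two {name: (return_type, [(param_type, param_name), ...])} dicts."""
    report = {"missing": [], "added": [], "changed": []}
    for name, (ret, params) in old.items():
        if name not in new:
            # Declared in the old header but absent from the new one.
            report["missing"].append(name)
            continue
        new_ret, new_params = new[name]
        # Compare types only; parameter names may differ between headers.
        old_types = [ptype for ptype, _ in params]
        new_types = [ptype for ptype, _ in new_params]
        if ret != new_ret or old_types != new_types:
            report["changed"].append(name)
    # Names that only exist in the new header.
    report["added"] = sorted(set(new) - set(old))
    return report
```

Comparing parameter types while ignoring parameter names is a design choice in this sketch, since renaming a parameter does not change a C function's type.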