Semantic Analysis For Simple Compile-To-C Language


I'm working on a simple compile-to-C language with Python-like syntax. Here is some sample source code:

# All comments start with pound signs

# Global variable declarations
speed = 4
motor = 69.5
text = "hey +  guys!"
junk =   5    +4

# Move function
def move():
  speed = speed + 1
  print speed

# Main function (program entry)
def main():
  localvar = 43.2
  move()
  if true:
    print localvar

Like Python, the language uses indentation to delimit blocks, which keeps the code readable. It also has no explicit type declarations: the type of a variable is inferred from the value assigned to it.

object = 5            # Creates an integer
object_two = "stuff"  # Creates a string
object_three = 5.23   # Creates a float
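
To make the "type from context" rule concrete, here is a minimal sketch of how a type tag might be guessed from a right-hand side that contains only literals. It is written in Python purely for illustration; the function name `infer_literal_type` and the "UNKNOWN" tag are my own assumptions, not part of the compiler described here.

```python
import re

# Matches float literals first, then integer literals.
NUMBER = re.compile(r"\d+\.\d+|\d+")

def infer_literal_type(expr):
    """Guess a type tag for a right-hand side made only of literals."""
    expr = expr.strip()
    # A fully quoted right-hand side is a string, even if it contains operators.
    if len(expr) >= 2 and expr[0] == '"' and expr[-1] == '"':
        return "STRING"
    # Identifiers cannot be typed here; they need the symbol table.
    if re.search(r"[A-Za-z_]", expr):
        return "UNKNOWN"
    numbers = NUMBER.findall(expr)
    if not numbers:
        return "UNKNOWN"
    # Any float literal promotes the whole expression to FLOAT.
    return "FLOAT" if any("." in n for n in numbers) else "INT"
```

Anything involving an identifier (such as `speed + 1`) deliberately comes back as "UNKNOWN", because its type depends on what is already known about `speed`.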

The sample source code I have above is internally represented as such:

[
  [
    "GLOBAL",
    [
      "speed = 4",
      "motor = 69.5",
      "text = \"hey +  guys!\"",
      "junk =   5    +4"
    ],
    [
      "SCOPE",
      [
        "speed",
        "motor",
        "text",
        "junk"
      ],
      [
        "INT",
        "FLOAT",
        "STRING",
        "INT"
      ],
      [
        0,
        1,
        2,
        3
      ]
    ]
  ],
  [
    "def move():",
    [
      "  speed = speed + 1",
      "  print speed"
    ],
    [
      "SCOPE",
      [
        "speed",
        "motor",
        "text",
        "junk"
      ],
      [
        "GLOBAL",
        "GLOBAL",
        "GLOBAL",
        "GLOBAL"
      ],
      [
        0,
        1,
        2,
        3
      ]
    ]
  ],
  [
    "def main():",
    [
      "  localvar = 43.2",
      "  move()",
      "  if true:",
      "    print localvar"
    ],
    [
      "SCOPE",
      [
        "speed",
        "motor",
        "text",
        "junk",
        "localvar"
      ],
      [
        "GLOBAL",
        "GLOBAL",
        "GLOBAL",
        "GLOBAL",
        "FLOAT"
      ],
      [
        0,
        1,
        2,
        3,
        0
      ]
    ]
  ]
]

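Every function is packed into this representation together with its visible variables, their types, and the index of the line on which each one is declared (relative to the start of the function body). Variables inherited from the global scope are tagged "GLOBAL" instead of carrying a concrete type.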
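
As a rough sketch of how this structure could be queried, the helper below resolves a variable name to a concrete type, following a "GLOBAL" tag back to the first entry. It assumes the layout shown above (entry 0 is the global scope, and element 2 of each entry is the SCOPE list); the name `lookup_type` is hypothetical.

```python
def lookup_type(ir, func_index, name):
    """Return the concrete type tag of `name` as seen from entry `func_index`."""
    # Each entry is [header, source_lines, ["SCOPE", names, types, line_indices]].
    _, names, types, _ = ir[func_index][2]
    if name not in names:
        return None  # not visible in this scope
    tag = types[names.index(name)]
    if tag == "GLOBAL":
        # Follow the reference back to the global scope (assumed to be entry 0).
        _, global_names, global_types, _ = ir[0][2]
        return global_types[global_names.index(name)]
    return tag
```

With the representation above, `lookup_type(ir, 2, "speed")` would follow the "GLOBAL" tag in `main` and return the type recorded in the first entry.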

I'm trying to convert this intermediate representation into C code (strictly speaking NXC, which differs slightly from C).
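
For the global scope, the conversion could look something like the sketch below, which turns each recorded variable into a typed declaration. The tag-to-type mapping is an assumption on my part (in particular, using `string` for "STRING" assumes the NXC target; plain C would need something like `char *`), and `emit_globals` is a hypothetical name.

```python
# Assumed mapping from type tags to target type names.
C_TYPES = {"INT": "int", "FLOAT": "float", "STRING": "string"}

def emit_globals(global_entry):
    """Turn the GLOBAL entry of the representation into declaration lines."""
    # global_entry is ["GLOBAL", source_lines, ["SCOPE", names, types, line_indices]].
    _, lines, (_, names, types, line_indices) = global_entry
    output = []
    for name, tag, index in zip(names, types, line_indices):
        # Everything after the first "=" on the declaring line is the initializer.
        _, _, value = lines[index].partition("=")
        output.append(f"{C_TYPES[tag]} {name} = {value.strip()};")
    return output
```

The initializer text is copied through unchanged, so this only works while the source expression is also valid in the target language.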

My question is: how can I determine variable types, particularly for function parameters? A parameter has no assignment to infer a type from, so the only approach I can think of is to guess its type from the arguments passed at the places where the function is called.

On top of that, I build the intermediate representation in a single linear pass over the source. What happens if a function is defined before the first call to it appears? Will I have to make several passes over the intermediate representation, refining it each time, until I have all the necessary type information?
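
To illustrate the kind of multi-pass approach I have in mind, here is a minimal sketch that propagates argument types from call sites to parameters and repeats until nothing changes. Everything in it is hypothetical: the sample language above has no parameters yet, so the `functions`, `calls`, and `known` structures are invented for the example and are not part of my current representation.

```python
def infer_param_types(functions, calls, known):
    """Propagate argument types from call sites to parameters until stable.

    functions: {function_name: [parameter_names]}
    calls:     [(caller_name, callee_name, [argument_variable_names])]
    known:     {(function_name, variable_name): type_tag} already inferred
    """
    types = dict(known)
    changed = True
    while changed:  # repeat until a full pass adds no new information
        changed = False
        for caller, callee, args in calls:
            for param, arg in zip(functions[callee], args):
                arg_type = types.get((caller, arg))
                if arg_type is None:
                    continue  # not known yet; a later pass may resolve it
                key = (callee, param)
                if key not in types:
                    types[key] = arg_type
                    changed = True
                elif types[key] != arg_type:
                    raise TypeError(
                        f"{callee}({param}) called with both {types[key]} and {arg_type}"
                    )
    return types
```

Because the loop only stops once a full pass learns nothing new, the order in which definitions and calls appear in the source does not matter. A parameter that is never reached from a known type simply stays absent from the result, and conflicting call sites are reported as an error rather than silently resolved.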
