How do I specify an abstract syntax of different language constructs in an already existing language?

78 Views Asked by At

I'm trying to solve a problem of specifying an abstract syntax of different language constructs such as variable declaration, array declaration, if-else statements and so on in Haskell, but I'm confused as to what that means. Is it simply what these look like within Haskell? Below is one of the language constructs I'm tasked with:

  • variable declaration
    • untyped
    • a variable declaration may optionally also specify an initial value (concrete syntax example: var x = 10;)
      • the initial value can only be an integer/boolean/string literal

I tried my hand at the above problem and wrote down: var = int | bool | String

I get rather confused when I'm tasked with the following language constructs:

  • arithmetic expressions
    • primitive expressions:
      • some are taken for granted (i.e., no need to specify): integer literals, boolean literals, string literals
      • using a variable
      • accessing a class member
      • indexing an array element
      • function call
      • (potentially, other primitive expressions may be needed) As I'm unsure how I would even begin with this.

I'm essentially just asking if I'm understanding this correctly as I can't seem to find any examples elsewhere, and if I could get any tips to better help me understand. I'm also not sure if this site is the right place to ask this, as it seems a lot more theoretical than usual, so if this is the wrong place please point out another forum that might be more appropriate. Thanks in advance for any help.

1

There are 1 best solutions below

0
K. A. Buhr On

I believe what they are asking you to do is define Haskell data types that are appropriate for representing the abstract syntax of some "target language". This could be a first step in implementing (in Haskell) an interpreter or compiler for the target language.

So, they don't want you to write down:

var = int | bool | String

because that isn't valid Haskell code.

Instead, you're probably expected to write down something like:

-- Haskell data types for statements
type Name = String
data Stmt = Decl Name (Maybe Lit)
data Lit = IntLit Int | BoolLit Bool | StringLit String

These are valid Haskell data type declarations that define data types that are appropriate for representing variable declaration statements written in the target language. For example, declarations in the target language like these:

// target language code
var x;            // untyped declaration
var number = 42;  // initialization with literal

could be represented as Haskell values using the above data types like so:

-- Haskell values representing target language statements
stmt1 :: Stmt
stmt1 = Decl "x" Nothing

stmt2 :: Stmt
stmt2 = Decl "number" (Just (IntLit 42))

Similarly, if you defined a type for expressions like:

-- Haskell data type for expressions
data Expr
  = LitE Lit
  | VarE String
  | MemberE Expr String
  | ElementE Expr Expr
  | CallE String [Expr]

this would allow you to represent a target language expression like:

// example of a target language expressions
add(x.pos[42],1)

as a Haskell value of type Expr:

-- Haskell value representing a target language expression
expr1 :: Expr
expr1 = CallE "add" [ElementE (MemberE (VarE "x") "pos") (LitE (IntLit 42)), LitE (IntLit 1)]

If this Haskell code doesn't make any sense to you, then you probably aren't ready to tackle this problem and will need to seek out some Haskell tutorials. Take a look at this old answer and the Learning Haskell page on the wiki.

If this Haskell code makes sense to you but you can't imagine how I got from your written specification to these data types, then that's understandable. The best thing to do is to review some examples. Here's a git repository with some simple languages implemented in Haskell, one language per subdirectory. You'll find the concrete syntax of each language described in the README.md in the subdirectory and the associate abstract syntax data types defined at the top of the the Parser.hs module for each language. There's also an example on the Wiki.