Given a lexer implemented in FsLexYacc, how do I get all of the tokens?


I have a lexer and parser implemented in FsLexYacc. To debug the lexer, I would like to print all of the tokens for a given string.

Here is what I have so far:

#load "../.paket/load/net5.0/FsLexYacc.Runtime.fsx"

#load "./Domain.fs"
#load "./Parser.fs"
#load "./Lexer.fs"

open System
open System.IO
open FSharp.Text
open FSharp.Text.Lexing
open Scripting

let allTokens (input : string) =
  let lexBuffer = LexBuffer<char>.FromString input
  Lexer.tokenize lexBuffer // Only gets first token!

printfn "%A" <| allTokens "1 + 1"

NUMBER 1

But this is only the first token!

How do I get all of the tokens as a list or sequence?


There is 1 best solution below


Lexer.tokenize can be called repeatedly to get more tokens.

Usually, your lexer definition matches on eof when it reaches the end of the input and returns a specific token to indicate "end of file".

rule tokenize = parse
    ...
    | eof { Token.EOF }

In that case, you may just call Lexer.tokenize until you receive an EOF token. You can of course do this iteratively, recursively, or by composing builtins.

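// Composing built-ins: keep pulling tokens until EOF.
let allTokens (input : string) =
    let lexBuffer = LexBuffer<char>.FromString input
    Seq.initInfinite (fun _ -> Lexer.tokenize lexBuffer)
    |> Seq.takeWhile ((<>) Token.EOF)
    |> Seq.toList // force once; the lexBuffer is stateful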

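// Recursive version; takes the lexBuffer so each call advances the same buffer.
let rec allTokens (lexBuffer : LexBuffer<char>) =
    match Lexer.tokenize lexBuffer with
    | Token.EOF -> []
    | t -> t :: allTokens lexBuffer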
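
The iterative option mentioned above is not shown, so here is a minimal sketch of it. To keep it runnable without a generated lexer, it uses a stand-in Token type and a fake lexer; collectUntil and makeFakeLexer are hypothetical helper names, not part of FsLexYacc.

```fsharp
// Stand-in token type (assumption; your real tokens come from the generated parser).
type Token =
    | NUMBER of int
    | PLUS
    | EOF

// Hypothetical helper: call `next` repeatedly and collect results until `stop` appears.
let collectUntil<'T when 'T : equality> (stop : 'T) (next : unit -> 'T) : 'T list =
    let tokens = ResizeArray<'T>()
    let mutable current = next ()
    while current <> stop do
        tokens.Add current
        current <- next ()
    List.ofSeq tokens

// Hypothetical stand-in for `fun () -> Lexer.tokenize lexBuffer`:
// replays a fixed token list, then returns EOF on every later call.
let makeFakeLexer (tokens : Token list) : unit -> Token =
    let remaining = ref tokens
    fun () ->
        match remaining.Value with
        | [] -> EOF
        | t :: rest ->
            remaining.Value <- rest
            t

let nextToken = makeFakeLexer [ NUMBER 1; PLUS; NUMBER 1 ]
let result = collectUntil EOF nextToken
printfn "%A" result // prints [NUMBER 1; PLUS; NUMBER 1]
```

With a real lexer, the same helper would presumably be called as collectUntil Token.EOF (fun () -> Lexer.tokenize lexBuffer), assuming your token type supports equality (F# union types do by default).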