Writing a Haskell lexer that matches strings from a CSV file

I am trying to write a simple lexer that recognises words such as prepositions. I have lists of these words in CSV format. At the moment I have a lexer that works but I am having to type out each string from my list individually e.g.:

...
("before",rest)  -> TokenPreposition : lexer rest
("behind",rest)  -> TokenPreposition : lexer rest
...

Is it possible to read the words in from the CSV files? I know there is a library for parsing CSV files, but I don't know how to continue from there.

Best answer

You can use a Set String (from Data.Set) to store a word list and then use the member function to determine whether a word is in the set.

Here is some example code. The inputs to lexer are lists of verbs, nouns and prepositions, plus a list of words to classify. Each word is classified according to which list it appears in.

import qualified Data.Set as S

-- Show and Eq are derived so results can be printed and compared
data Speech = Verb | Noun | Preposition | Other
  deriving (Show, Eq)

-- classify a single word
classify :: S.Set String -> S.Set String -> S.Set String -> String -> Speech
classify verbs nouns preps word
  | S.member word verbs = Verb
  | S.member word nouns = Noun
  | S.member word preps = Preposition
  | otherwise           = Other

lexer :: [String] -> [String] -> [String] -> [String] -> [Speech]
lexer vlist nlist plist words =
  let nouns = S.fromList nlist  -- convert each word list into a set
      verbs = S.fromList vlist
      preps = S.fromList plist
  in map (classify verbs nouns preps) words
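The code above takes the word lists as plain [String] values, so the remaining step is getting them out of the CSV files. Below is a minimal sketch of that step, assuming the files are simple comma-separated text with no quoted fields or embedded commas. The helper names splitOn, trim, parseWords and loadWordList are made up for this illustration and do not come from any CSV library. For files with quoting you would want a real CSV parser instead.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Split a string on a delimiter character.
-- Assumption: fields are never quoted and never contain the delimiter.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOn d rest

-- Remove leading and trailing whitespace (also strips a trailing '\r').
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Flatten CSV text into a list of words, dropping empty fields.
parseWords :: String -> [String]
parseWords contents =
  [ w
  | line  <- lines contents
  , field <- splitOn ',' line
  , let w = trim field
  , not (null w)
  ]

-- Read a word list from a CSV file (the path is supplied by the caller).
loadWordList :: FilePath -> IO [String]
loadWordList path = parseWords <$> readFile path
```

The resulting lists can be passed straight to lexer, for example with hypothetical file names: lexer <$> loadWordList "verbs.csv" <*> loadWordList "nouns.csv" <*> loadWordList "prepositions.csv" <*> pure (words input), which yields an IO [Speech].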