Creating a simple lexical analyser in Java


I am creating a lexical analyser for a basic language I have defined. It must read text input and return the next token each time it is called. I would like it to distinguish between identifiers, constants, and so on, based on a list that I define in advance.

I need to read the text file using an input stream. A while loop will read the chars one at a time, but I need it to recognise whether the chars scanned form an identifier or an operator such as '+', '-', '*' or '/'. What would be the best way to do this?

I am fairly new to programming, so any advice on how to construct this would be appreciated. Many thanks for any answers.


There are 2 best solutions below

1
On

The StreamTokenizer class will probably help you out the most. It will read and distinguish between identifiers, numbers, and strings. You can also configure it to identify operators, such as +, *, etc.
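As a rough illustration of that suggestion, here is a minimal sketch using java.io.StreamTokenizer. The class name SimpleLexer, the tokenize method, and the IDENT/NUMBER/OP labels are made up for this example and are not part of the question. By default StreamTokenizer treats '/' as a comment character and '-' as part of a number, so the sketch marks both as ordinary characters so they come back as operators.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; the names and token labels are illustrative only.
class SimpleLexer {

    // Reads all tokens from the input and returns them as labelled strings,
    // for example "IDENT(x)", "NUMBER(3.0)" or "OP(+)".
    static List<String> tokenize(Reader input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(input);
        // '/' starts a comment by default; make it a plain character instead.
        st.ordinaryChar('/');
        // '-' is parsed as part of a number by default; make it a plain character,
        // so negative literals are NOT recognised as a single token here.
        st.ordinaryChar('-');

        List<String> tokens = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            switch (st.ttype) {
                case StreamTokenizer.TT_WORD:
                    tokens.add("IDENT(" + st.sval + ")");
                    break;
                case StreamTokenizer.TT_NUMBER:
                    // StreamTokenizer reports every number as a double.
                    tokens.add("NUMBER(" + st.nval + ")");
                    break;
                default:
                    // Any other single character (this sketch does not
                    // treat quoted strings specially).
                    tokens.add("OP(" + (char) st.ttype + ")");
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the text file; a FileReader or an
        // InputStreamReader around a FileInputStream would work the same way.
        System.out.println(tokenize(new StringReader("total = price * 2 - tax / 4")));
    }
}
```

To return one token per call instead of a whole list, the same StreamTokenizer could be kept as a field and a single nextToken() call wrapped in a method. To tell keywords from identifiers, the TT_WORD case could look st.sval up in a pre-defined set.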

0
On

Do not try to write your own lexer / parser.

It is easier to use a lexer/parser generator like ANTLR or SableCC.