Building a NetBeans lexer for ANTLR 4.5 and StringTemplate 4 grammars

Asked by Cory Edwards on 17 April 2018 at 14:28

I have been trying to build a lexer in NetBeans 8.2 that provides correct syntax highlighting for StringTemplate v4 files. The problem is that NetBeans modules only have access to the ANTLR 3.3 or 4.5 libraries, and I cannot find any StringTemplate v4 lexer grammar files, either precompiled or in source form, that work with those versions.

Does anyone know of any StringTemplate v4 grammar files that can be built with ANTLR 3.3 or 4.5? I have included my code and a sample below; they work with ANTLR 4.5 and this set of grammars.

EDIT 1:

Someone in the comments asked for some examples and what I have tried so far. My first attempt used the StringTemplate lexer from here. I took the code from the compile section of the source and it seemed to work, but it was very buggy because it was built with ANTLR 3.5 and I was trying to run it with ANTLR 3.3. The syntax coloring worked, but it had trouble reading files and sometimes failed outright.

My second attempt was with this set of grammars. All of the files load properly, since I could compile the grammars with 4.5, but the syntax coloring is completely off. It looks like the grammars are built more for Java than for StringTemplate v4.

Here is the code I am currently working with to build the lexer in the NetBeans module:

Language Provider:

import org.netbeans.api.lexer.InputAttributes;
import org.netbeans.api.lexer.Language;
import org.netbeans.api.lexer.LanguagePath;
import org.netbeans.api.lexer.Token;
import org.netbeans.spi.lexer.LanguageEmbedding;
import org.netbeans.spi.lexer.LanguageProvider;

/**
 *
 * @author Cory
 */
@org.openide.util.lookup.ServiceProvider(service=org.netbeans.spi.lexer.LanguageProvider.class)
public class stLanguageProvider extends LanguageProvider {

    @Override
    public Language<?> findLanguage(String mimeType) {
        if ("text/x-st".equals(mimeType)){
            return new stLanguageHierarchy().language();
        }

        return null;
    }

    @Override
    public LanguageEmbedding<?> findLanguageEmbedding(Token<?> token, LanguagePath languagePath, InputAttributes inputAttributes) {
        return null;
    }

}

Language Hierarchy:

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.netbeans.api.lexer.Language;
import org.netbeans.spi.lexer.LanguageHierarchy;
import org.netbeans.spi.lexer.Lexer;
import org.netbeans.spi.lexer.LexerRestartInfo;

/**
 *
 * @author Cory
 */
public class stLanguageHierarchy extends LanguageHierarchy<stTokenId> {

    private static List<stTokenId> tokens = new ArrayList<stTokenId>();
    private static Map<Integer, stTokenId> idToToken = new HashMap<Integer, stTokenId>();

    // Fill the token tables before the Language below is created: static
    // initializers run in textual order, and the Language is built from
    // whatever createTokenIds() returns at that moment.
    static {
        TokenType[] tokenTypes = TokenType.values();
        for (TokenType tokenType : tokenTypes) {
            tokens.add(new stTokenId(tokenType.name(), tokenType.category, tokenType.id));
        }
        for (stTokenId token : tokens) {
            idToToken.put(token.ordinal(), token);
        }
    }

    private static final Language<stTokenId> language = new stLanguageHierarchy().language();

    public static Language<stTokenId> getLanguage() {
        return language;
    }

    /**
     * Returns an actual stTokenId from an id. This essentially allows
     * the syntax highlighter to decide the color of specific words.
     * @param id
     * @return
     */
    static synchronized stTokenId getToken(int id) {
        return idToToken.get(id);
    }

    @Override
    protected Collection<stTokenId> createTokenIds() {
        return tokens;
    }

    @Override
    protected Lexer<stTokenId> createLexer(LexerRestartInfo<stTokenId> lri) {
        return new stEditorLexer(lri);
    }

    @Override
    protected String mimeType() {
        return "text/x-st";
    }
}

The editor lexer: This is where the StringTemplate lexer should be brought in if I can find one. The code here uses the lexer built with ANTLR 4.5 from the grammars in the second link.

import org.antlr.parser.st4.STLexer;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.Token;
import org.netbeans.spi.lexer.Lexer;
import org.netbeans.spi.lexer.LexerInput;
import org.netbeans.spi.lexer.LexerRestartInfo;
import org.openide.util.Exceptions;

/**
 *
 * @author Cory
 */
public class stEditorLexer implements Lexer<stTokenId> {

    private static final String SOURCE_NAME = "stEditor";

    private LexerRestartInfo<stTokenId> lri;
    private STLexer lexer;

    public stEditorLexer(LexerRestartInfo<stTokenId> lri) {
        this.lri = lri;
        try
        {
            LexerInput lexerInput = lri.input();
            AntlrCharStream charStream = new AntlrCharStream(lexerInput, SOURCE_NAME);
            lexer = new STLexer(charStream);
            lexer.setDelimiters('$', '$');
            lexer.setChannel(STLexer.OFF_CHANNEL);
            lexer.reset();
            AntlrLexerState state = (AntlrLexerState) lri.state();

            if (state != null) {
                state.apply(lexer);
            }
        }
        catch(Exception ex)
        {
            Exceptions.printStackTrace(ex);
        }
    }

    /**
     * @return the next token recognized by the lexer or null if there are no
     *         more characters (available in the input) to be tokenized.
     */
    @Override
    public org.netbeans.api.lexer.Token<stTokenId> nextToken() {
        Token token = lexer.nextToken();

        stTokenId tokenId = null;

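        // Compare against ANTLR's EOF token type. LexerInput.EOF is a
        // character-level constant that only happens to share the value -1.
        if (token.getType() != Token.EOF) {
            tokenId  = stLanguageHierarchy.getToken(token.getType());
            System.out.println("nextToken - " + tokenId + " " + token.getText());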
        }  else if (lri.input().readLength() > 0) {
            // Remaining chars on the input should be tokenized
            // see https://netbeans.org/bugzilla/show_bug.cgi?id=240826
            tokenId = stLanguageHierarchy.getToken(STLexer.HORZ_WS);
            System.out.println("nextToken - ERROR (as WS)");
        }

        if (tokenId == null) {
            System.out.println("nextToken - EOF");
            return null;
        }

        // According to the method specification, this must *not* return any
        // other Token instances than those obtained from the TokenFactory.
        return lri.tokenFactory().createToken(tokenId);
    }

    @Override
    public Object state() {
        return new AntlrLexerState(lexer._mode, lexer._modeStack);
    }

    @Override
    public void release() {
    }
}
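
Since state() above hands the lexer's mode and mode stack to AntlrLexerState, and that class is not shown here, the following is only a minimal sketch of what an immutable state snapshot could look like. The class name LexerStateSnapshot and its methods are hypothetical, and it takes a plain int[] (for example a copy of the mode stack) so that it has no ANTLR dependency. It assumes, as far as I understand the NetBeans Lexer API, that state objects should not change after they are returned and should compare by value.

```java
import java.util.Arrays;

/** Hypothetical immutable snapshot of a modal lexer's state: current mode plus mode stack. */
final class LexerStateSnapshot {

    private final int mode;
    private final int[] modeStack;

    LexerStateSnapshot(int mode, int[] modeStack) {
        this.mode = mode;
        // Defensive copy: the lexer keeps mutating its own stack after the snapshot is taken.
        this.modeStack = (modeStack == null) ? new int[0] : modeStack.clone();
    }

    int mode() {
        return mode;
    }

    int[] modeStack() {
        // Return a copy so callers cannot alter the snapshot.
        return modeStack.clone();
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof LexerStateSnapshot)) {
            return false;
        }
        LexerStateSnapshot other = (LexerStateSnapshot) obj;
        return mode == other.mode && Arrays.equals(modeStack, other.modeStack);
    }

    @Override
    public int hashCode() {
        return 31 * mode + Arrays.hashCode(modeStack);
    }

    @Override
    public String toString() {
        return "LexerStateSnapshot{mode=" + mode + ", modeStack=" + Arrays.toString(modeStack) + "}";
    }
}
```

Two snapshots taken at the same mode and stack contents compare equal, and later changes to the array passed in do not affect a snapshot that was already created.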

The TokenType class, which I also generated from the tokens in the second link:

import org.netbeans.api.lexer.Language;
import org.netbeans.api.lexer.TokenId;

/**
 *
 * @author Cory
 */
public enum TokenType implements TokenId {

    DOC_COMMENT(1, "other"),
    BLOCK_COMMENT(2, "comment"),
    LINE_COMMENT(3, "comment"),
    TMPL_COMMENT(4, "comment"),
    HORZ_WS(5, "other"),
    VERT_WS(6, "other"),
    ESCAPE(7, "separator"),
    LDELIM(8, "specialCharacter"),
    RBRACE(9, "separator"),
    TEXT(10, "text"),
    LBRACE(11, "separator"),
    RDELIM(12, "specialCharacter"),
    STRING(13, "string"),
    IF(14, "keyword"),
    ELSEIF(15, "keyword"),
    ELSE(16, "keyword"),
    ENDIF(17, "keyword"),
    SUPER(18, "keyword"),
    END(19, "keyword"),
    TRUE(20, "keyword"),
    FALSE(21, "keyword"),
    AT(22, "keyword"),
    ELLIPSIS(23, "keyword"),
    DOT(24, "separator"),
    COMMA(25, "separator"),
    COLON(26, "separator"),
    SEMI(27, "separator"),
    AND(28, "keyword"),
    OR(29, "keyword"),
    LPAREN(30, "separator"),
    RPAREN(31, "separator"),
    LBRACK(32, "separator"),
    RBRACK(33, "separator"),
    EQUALS(34, "separator"),
    BANG(35, "separator"),
    ERR_CHAR(36, "keyword"),
    ID(37, "keyword"),
    PIPE(38, "separator");

    public int id;
    public String category;
    public String text;

    private static final Language<stTokenId> language = new stLanguageHierarchy().language();

    private TokenType(int id, String category) {
        this.id = id;
        this.category = category;
    }

    public static TokenType valueOf(int id) {
        TokenType[] values = values();
        for (TokenType value : values) {
            if (value.id == id) {
                return value;
            }
        }
        throw new IllegalArgumentException("The id " + id + " is not recognized");
    }

    @Override
    public String primaryCategory() {
        return category;
    }

    public static final Language<stTokenId> getLanguage() {
        return language;
    }
}
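
One thing that may be worth ruling out when the coloring is off is a mismatch between the hand-written ids in TokenType and the token types in the generated lexer. ANTLR writes a .tokens file alongside the generated lexer with one NAME=ID entry per line, plus entries for quoted literals. Below is a minimal sketch of such a check, assuming that file format. The class and method names are hypothetical, and the sample values in main are made up rather than taken from the real STLexer.tokens file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: compares hand-written token ids against the lines of a .tokens file. */
class TokensFileCheck {

    /** Parses "NAME=ID" lines; skips blank lines and quoted literal entries such as '('=30. */
    static Map<String, Integer> parseTokens(List<String> lines) {
        Map<String, Integer> result = new LinkedHashMap<String, Integer>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("'")) {
                continue;
            }
            int eq = trimmed.lastIndexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = trimmed.substring(0, eq).trim();
            int id = Integer.parseInt(trimmed.substring(eq + 1).trim());
            result.put(name, id);
        }
        return result;
    }

    /** Returns one message per token whose id differs or that is missing on either side. */
    static List<String> findMismatches(Map<String, Integer> generated,
                                       Map<String, Integer> handWritten) {
        List<String> problems = new ArrayList<String>();
        for (Map.Entry<String, Integer> entry : handWritten.entrySet()) {
            Integer actual = generated.get(entry.getKey());
            if (actual == null) {
                problems.add(entry.getKey() + ": not in .tokens file");
            } else if (!actual.equals(entry.getValue())) {
                problems.add(entry.getKey() + ": enum has " + entry.getValue()
                        + ", lexer has " + actual);
            }
        }
        for (String name : generated.keySet()) {
            if (!handWritten.containsKey(name)) {
                problems.add(name + ": missing from enum");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up sample content standing in for the generated .tokens file.
        List<String> sample = Arrays.asList("LDELIM=8", "TEXT=9", "IF=14", "'('=30");
        // Made-up subset of hand-written ids standing in for the enum.
        Map<String, Integer> handWritten = new LinkedHashMap<String, Integer>();
        handWritten.put("LDELIM", 8);
        handWritten.put("TEXT", 10);
        handWritten.put("IF", 14);
        for (String problem : findMismatches(parseTokens(sample), handWritten)) {
            System.out.println(problem);
        }
    }
}
```

In a real module the hand-written map would be filled from TokenType.values(), and the lines would be read from the .tokens file produced by the ANTLR build.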

Everything else, such as the AntlrCharStream, is code that I found to work with the ANTLR 4.5 library.

EDIT 2:

I thought adding some sample output would help in figuring out what the problem could be.

This is the color scheme:

<fontcolor name="keyword"           foreColor="blue" default="default"/>
<fontcolor name="comment"           foreColor="green" default="default"/>    
<fontcolor name="other"             foreColor="gray" default="default"/>    
<fontcolor name="specialCharacter"  foreColor="orange" default="default"/> 
<fontcolor name="text"              foreColor="pink" default="default"/> 
<fontcolor name="string"            foreColor="red" default="default"/> 
<fontcolor name="separator"         foreColor="magenta" default="default"/>
