Writing a parser for a matrix-like input with Boost Spirit


I'm trying to write a parser that accepts input of the form MATRIX.{variableName} = [1,2,3;4,5,6], where the matrix (2x3 in this case) is written in a MATLAB-like format, with a semicolon starting a new row.

The initial idea was to store the input in a two-dimensional std::vector for further processing. This is my first time writing a parser, and I'm somewhat clueless about the Spirit framework.

My current (not so intuitive) solution takes input like MATRIX (2,3) = [1,2,3,4,5,6] to represent the same matrix as above. It saves the data in a one-dimensional vector and uses the row and column counts to process it later (somewhat like, I believe, Eigen's implementation of dynamic matrices).
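To make the flat-storage idea concrete, here is a minimal sketch of how a one-dimensional vector plus the row and column counts can be indexed as a matrix. The `FlatMatrix` type and its `at` member are hypothetical names for illustration, and row-major order is an assumption chosen to match the order in which the values appear in the input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a dense matrix kept in one flat vector.
// Row-major layout is an assumption that matches the input order.
struct FlatMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> vals;  // expected to hold rows * cols elements

    // Element (r, c) lives at offset r * cols + c in row-major order.
    double at(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return vals[r * cols + c];
    }
};
```

With rows = 2, cols = 3 and the values 1 to 6, `at(1, 0)` refers to the fourth stored value, i.e. the first element of the second row.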

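#include <boost/spirit/include/qi.hpp>
#include <boost/phoenix/core.hpp>
#include <boost/phoenix/operator.hpp>
#include <boost/phoenix/stl.hpp>

#include <vector>

namespace client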
{
    namespace qi = boost::spirit::qi;
    namespace ascii = boost::spirit::ascii;
    namespace phoenix = boost::phoenix;
    namespace fusion = boost::fusion;

    template <typename Iterator>
    bool parse_matrix(Iterator first, Iterator last, unsigned& rows, unsigned& cols, std::vector<double>& vals)
    {
        using qi::double_;
        using qi::uint_;
        using qi::_1;
        using qi::lit;
        using qi::phrase_parse;
        using ascii::space;
        using phoenix::push_back;

        rows = 0;
        cols = 0;
        bool r = phrase_parse(first, last,

            //  Begin grammar
            (
                // Header: MATRIX (rows, cols) =
                lit("MATRIX") >> '(' >> uint_[phoenix::ref(rows) = _1] >> ',' >> uint_[phoenix::ref(cols) = _1] >> ')' >> '='
                // Body: comma-separated doubles, each appended to vals
                 >> '[' >> double_[push_back(phoenix::ref(vals),_1)]
                        >> *(',' >> double_[push_back(phoenix::ref(vals),_1)]) >> ']'
            ),
            //  End grammar

            space);

        if (!r || first != last) // fail if we did not get a full match
            return false;
        if (vals.size() != rows * cols) // element count must match the declared shape
            return false;
        return true;
    }
}

I was thinking it might be possible to call a function, such as appending a std::vector<double> to a std::vector<std::vector<double> >, when a certain character (like the semicolon) is parsed. Is this possible? If not, how do I go about implementing my initial idea?


There are 2 best solutions below

Best answer

I'd suggest:

  • Don't use semantic actions for attribute propagation. You can still use them to add validation criteria (see Boost Spirit: "Semantic actions are evil"?).

  • Use automatic attribute propagation, so you don't have to pass references around.

  • Don't validate during parsing unless you have pressing reasons to do so.

A minimal viable parser then becomes:


#include <boost/spirit/include/qi.hpp>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

using It  = boost::spirit::istream_iterator;
using Row = std::vector<double>;
using Mat = std::vector<Row>;

int main() {
    It f(std::cin>>std::noskipws), l;

    Mat matrix;
    std::string name;

    {
        using namespace boost::spirit::qi;
        rule<It, std::string()> varname_ = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");

        if (phrase_parse(f, l, 
                lit("MATRIX") >> '.' >> '{' >> varname_ >> '}' >> '=' 
                    >> '[' >> (double_ % ',' % ';') >> ']',  // double_ matches Row = std::vector<double>
                space, name, matrix))
        {
            std::cout << "Parsed: variable named '" << name << "' [";

            for(auto& row : matrix)
                std::copy(row.begin(), row.end(), std::ostream_iterator<double>(std::cout<<"\n\t",", "));
            std::cout << "\n]\n";
        } else {
            std::cout << "Parse failed\n";
        }
    }

    if (f!=l)
        std::cout << "Remaining input: '" << std::string(f,l) << "'\n";
}

For the input "MATRIX.{variable_name}=[1,2,3;4,5,6]", this prints:

Parsed: variable named 'variable_name' [
    1, 2, 3, 
    4, 5, 6, 
]

If you want to catch inconsistent row lengths early on, see e.g. this answer:

Second answer

You need to break down your expressions to something like this:

rows: DOUBLE | rows ',' DOUBLE

columns: rows | columns ';' rows

matrix: IDENTIFIER '.' '{' IDENTIFIER '}' '=' '[' columns ']'

Now you can use a vector<double> for each rows production and a vector< vector<double> > for columns, and save the result in your matrix.
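As an illustration of how these productions map onto the two vector types, here is a minimal hand-written recursive-descent sketch that does not use Spirit. The `MatrixParser` class and its member names are hypothetical. It assumes the leading identifier is the literal word MATRIX, reads `columns` as one or more `rows` separated by ';', replaces the left-recursive rules with loops, and uses `std::strtod` to read numbers.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

using Row = std::vector<double>;
using Mat = std::vector<Row>;

// Hypothetical parser for input such as: MATRIX.{name} = [1,2,3;4,5,6]
class MatrixParser {
public:
    explicit MatrixParser(const std::string& text) : s_(text), pos_(0) {}

    // matrix: "MATRIX" '.' '{' IDENTIFIER '}' '=' '[' columns ']'
    bool parse(std::string& name, Mat& out) {
        name.clear();
        out.clear();
        return word("MATRIX") && ch('.') && ch('{') && identifier(name) && ch('}')
            && ch('=') && ch('[') && columns(out) && ch(']') && at_end();
    }

private:
    static bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
    static bool is_alpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0 || c == '_'; }
    static bool is_alnum(char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; }

    void skip_ws() { while (pos_ < s_.size() && is_space(s_[pos_])) ++pos_; }
    bool at_end() { skip_ws(); return pos_ == s_.size(); }

    // Consume a single expected character, skipping leading whitespace.
    bool ch(char c) {
        skip_ws();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    // Consume an exact keyword.
    bool word(const std::string& w) {
        skip_ws();
        if (s_.compare(pos_, w.size(), w) != 0) return false;
        pos_ += w.size();
        return true;
    }
    // IDENTIFIER: a letter or underscore, then letters, digits or underscores.
    bool identifier(std::string& id) {
        skip_ws();
        const std::size_t start = pos_;
        if (pos_ < s_.size() && is_alpha(s_[pos_])) {
            ++pos_;
            while (pos_ < s_.size() && is_alnum(s_[pos_])) ++pos_;
        }
        id = s_.substr(start, pos_ - start);
        return !id.empty();
    }
    // DOUBLE: whatever std::strtod accepts at the current position.
    bool number(double& v) {
        skip_ws();
        const char* begin = s_.c_str() + pos_;
        char* end = nullptr;
        v = std::strtod(begin, &end);
        if (end == begin) return false;
        pos_ += static_cast<std::size_t>(end - begin);
        return true;
    }
    // rows: DOUBLE (',' DOUBLE)*   -> fills one vector<double>
    bool row(Row& r) {
        double v = 0.0;
        do {
            if (!number(v)) return false;
            r.push_back(v);
        } while (ch(','));
        return true;
    }
    // columns: rows (';' rows)*    -> fills the vector< vector<double> >
    bool columns(Mat& m) {
        do {
            Row r;
            if (!row(r)) return false;
            m.push_back(r);
        } while (ch(';'));
        return true;
    }

    std::string s_;
    std::size_t pos_;
};
```

Calling `MatrixParser("MATRIX.{foo} = [1,2,3;4,5,6]").parse(name, m)` should fill `name` with foo and `m` with two rows of three values each. Like the grammar, this sketch accepts rows of different lengths.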

Of course, once you have the columns of your matrix, you should check that all the rows have the same size. The grammar doesn't require it, but a matrix such as [1,2;3,4,5] is obviously not valid, even though the grammar accepts it.
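The same-size check described above can be done after parsing with a few lines of standard C++. This is a minimal sketch. The function name `is_rectangular` is hypothetical, and treating an empty matrix as valid is an assumption you may want to change.

```cpp
#include <vector>

using Row = std::vector<double>;
using Mat = std::vector<Row>;

// Returns true when every row has as many elements as the first row.
// An empty matrix is treated as rectangular here (an assumption).
bool is_rectangular(const Mat& m) {
    for (const Row& r : m) {
        if (r.size() != m.front().size())
            return false;
    }
    return true;
}
```

For the parsed form of [1,2;3,4,5] this returns false, so the caller can reject the input even though it parsed successfully.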