Slow performance using boost xpressive

Question

Asked by Pablo on 13 March 2018 at 17:46

Lately I have been using boost xpressive for parsing files. These files are 10 MB each and there will be several hundred of them to parse.

Xpressive is nice to work with and has a clear syntax, but the problem is performance. It crawls in debug builds, and even in release builds it spends more than a whole second per file. I have tested it against plain old getline(), find() and sscanf() code, which beats Xpressive easily.
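For reference, a hand-rolled baseline of the kind mentioned above might look roughly like the sketch below. It is not the actual comparison code: `SeqRecord` and `parse_sequence_line` are hypothetical names, and the sketch assumes it is handed one complete, NUL-terminated log line per call (for example from `std::getline`). It sticks to C++03 constructs so that it should also build with VS 2010.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical record for this sketch; the fields mirror the struct in the question.
struct SeqRecord {
    char   date[32];   // "2018-Mar-13 13:14:01.819966"
    double time;
    int    driver;
    double vel;
    char   road[32];
    double km;
    int    sequence;
};

// Returns true only if `line` is a well-formed SEQUENCE line.
bool parse_sequence_line(const char* line, SeqRecord& out) {
    // Cheap rejection first: skip lines that cannot be SEQUENCE events.
    if (std::strstr(line, " - SEQUENCE: ") == 0)
        return false;

    // %31[^]] reads everything up to (not including) the closing bracket.
    int const fields = std::sscanf(line,
        "[%31[^]]] - %lf s => Driver: %d - Speed: %lf - Road: %31s - Km: %lf - SEQUENCE: %d",
        out.date, &out.time, &out.driver, &out.vel, out.road, &out.km, &out.sequence);

    return fields == 7; // all seven conversions must succeed
}
```

The `strstr` pre-check plays the role of `find()`: lines for other event kinds are rejected before any number conversion is attempted.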

I understand that type checking, backtracking and so on have a cost, but this seems excessive to me. I wonder: am I doing something wrong? Is there any way of optimizing this to run at a decent pace? Would it be worth the effort to migrate the code to boost::spirit?

I have prepared a lite version of the code, with a few lines of a real file embedded, in case someone wants to test it and help.

NOTE: As a requirement, VS 2010 must be used (unfortunately, it is not fully C++11 compliant).

#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>
#include <iterator> // std::begin, std::end, std::distance
#include <string>
#include <vector>

const char input[] = "[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: BTN-1002 - Km: 90.0 - SWITCH_ON: 1\n\
[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - SLOPE: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - GEAR: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - GEAR: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - CLUTCH: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.540 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8\n\
[2018-Mar-13 13:14:01.819966] - 3.110 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15\n\
[2018-Mar-13 13:14:02.829983] - 3.450 s => Driver: 0 - Speed: 0.6 - Road: B-302 - Km: 90.1 - SLOPE: -5\n\
[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21\n\
[2018-Mar-13 13:14:03.250451] - 3.870 s => Driver: 0 - Speed: 1.2 - Road: B-302 - Km: 90.2 - GEAR: 2\n\
[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29\n\
[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34\n\
[2018-Mar-13 13:14:03.880066] - 4.500 s => Driver: 0 - Speed: 2.8 - Road: B-302 - Km: 90.5 - CLUTCH: 1\n\
[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45\n\
[2018-Mar-13 13:14:04.300160] - 4.920 s => Driver: 0 - Speed: 4.2 - Road: B-302 - Km: 90.9 - SLOPE: 10\n\
[2018-Mar-13 13:14:04.510025] - 5.130 s => Driver: 0 - Speed: 4.9 - Road: B-302 - Km: 91.1 - GEAR: 3";

// Subtract 1 so the terminating '\0' of the literal is not part of the parsed range.
const auto len = std::distance(std::begin(input), std::end(input)) - 1;

struct Sequence
{
    int ms;
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    std::string date;
    std::string road;
};

namespace xp = boost::xpressive;

int main()
{
    Sequence data;
    std::vector<Sequence> sequences;

    using namespace xp;

    cregex real = (+_d >> '.' >> +_d);
    cregex keyword = " - SEQUENCE: " >> (+_d)[xp::ref(data.sequence) = as<int>(_)];
    cregex date = repeat<4>(_d) >> '-' >> repeat<3>(alpha) >> '-' >> repeat<2>(_d) >> _s >> repeat<2>(_d) >> ':' >> repeat<2>(_d) >> ':' >> repeat<2>(_d);

    cregex header = '[' >> date[xp::ref(data.date) = _] >> '.' >> (+_d)[xp::ref(data.ms) = as<int>(_)] >> "] - "
                    >> real[xp::ref(data.time) = as<double>(_)]
                    >> " s => Driver: " >> (+_d)[xp::ref(data.driver) = as<int>(_)]
                    >> " - Speed: " >> real[xp::ref(data.vel) = as<double>(_)]
                    >> " - Road: " >> (+set[alnum | '-'])[xp::ref(data.road) = _]
                    >> " - Km: " >> real[xp::ref(data.km) = as<double>(_)];

    xp::cregex parser = (header >> keyword >> _ln);

    xp::cregex_iterator cur(input, input + len, parser);
    xp::cregex_iterator end;

    for (; cur != end; ++cur)
        sequences.emplace_back(data);

    return 0;
}

Please, mind the VS 2010 constraint.


sehe On 16 March 2018 at 02:41

You can use fusion with spirit traits (see e.g. parsing into several vector members), but I'd consider using semantic actions.

Here's the design conundrum:

Separate `vector`s with a trait

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <cassert>
#include <cstring> // strlen
#include <iostream>
#include <string>
#include <vector>

using It = char const*;

struct BaseEvent {
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    boost::string_view date;
    boost::string_view road;
};
struct Sequence : BaseEvent{};
struct Clutch : BaseEvent{};
struct Gear : BaseEvent{};

BOOST_FUSION_ADAPT_STRUCT(::Sequence, date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(::Clutch, date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(::Gear, date, time, driver, vel, road, km, sequence)

struct LogEvents {
    std::vector<Sequence> sequence;
    std::vector<Clutch> clutch;
    std::vector<Gear> gear;

    void add(Sequence const& s) { sequence.push_back(s); }
    void add(Clutch   const& c) { clutch.push_back(c);   }
    void add(Gear     const& g) { gear.push_back(g);     }
};

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };

    template <> struct is_container<LogEvents> : std::true_type {};

    template <> struct container_value<LogEvents> {
        using type = boost::variant<::Sequence, ::Clutch, ::Gear>;
    };

    template <typename T> struct push_back_container<LogEvents, T> {
        struct Visitor {
            LogEvents& _log;
            template <typename U> void operator()(U const& ev) const { _log.add(ev); }
            using result_type = void;
        };

        template <typename... U>
        static bool call(LogEvents& log, boost::variant<U...> const& attribute) {
            boost::apply_visitor(Visitor{log}, attribute);
            return true;
        }
    };
} } }


namespace QiParsers {
    template <typename It, typename Attribute>
    struct BaseEventParser : qi::grammar<It, Attribute()> {
        BaseEventParser(std::string const& event_type) : BaseEventParser::base_type(start) {
            using namespace qi;
            auto date_time = copy(
                    repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                    repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

            start 
                = '[' >> raw[date_time] >> "] - "
                >> double_ >> " s"
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> lit(event_type) >> ": " >> int_
                >> (eol|eoi);
        }

      private:
        qi::rule<It, Attribute()> start;
    };
}

LogEvents parse_spirit(It b, It e) {
    QiParsers::BaseEventParser<It, ::Sequence> sequence("SEQUENCE");
    QiParsers::BaseEventParser<It, ::Clutch>   clutch("CLUTCH");
    QiParsers::BaseEventParser<It, ::Gear>     gear("GEAR");

    LogEvents events;
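    // Keep the parse call outside assert(): with NDEBUG defined, the assert expression is not evaluated at all.
    bool const ok = parse(b, e, *boost::spirit::repository::qi::seek[sequence|clutch|gear], events);
    assert(ok);
    (void) ok; // avoids an unused-variable warning in release builds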

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

int main() {
    auto events = parse_spirit(input, input+len);
    std::cout << "Events: "
        << events.sequence.size() << " sequence, "
        << events.clutch.size() << " clutch, "
        << events.gear.size() << " gear events\n";

    using boost::fusion::operator<<;
    for (auto& s : events.sequence) { std::cout << "SEQUENCE: " <<  s << "\n"; }
    for (auto& c : events.clutch)   { std::cout << "CLUTCH:   " <<  c << "\n"; }
    for (auto& g : events.gear)     { std::cout << "GEAR:     " <<  g << "\n"; }
}

Flipping It Around: 1 `vector<variant<>>`

Wouldn't it make more sense to have a vector of variants instead?

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <cstring> // strlen

using It = char const*;

namespace MyEvents {
    struct BaseEvent {
        int driver;
        int sequence;
        double time;
        double vel;
        double km;
        boost::string_view date;
        boost::string_view road;
    };
    struct Sequence : BaseEvent{};
    struct Clutch : BaseEvent{};
    struct Gear : BaseEvent{};

    using LogEvent = boost::variant<Sequence, Clutch, Gear>;
    using LogEvents = std::vector<LogEvent>;
}

BOOST_FUSION_ADAPT_STRUCT(MyEvents::Sequence, date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::Clutch,   date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::Gear,     date, time, driver, vel, road, km, sequence)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

namespace QiParsers {
    template <typename It, typename Attribute>
    struct BaseEventParser : qi::grammar<It, Attribute()> {
        BaseEventParser(std::string const& event_type) : BaseEventParser::base_type(start) {
            using namespace qi;
            auto date_time = copy(
                    repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                    repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

            start 
                = '[' >> raw[date_time] >> "] - "
                >> double_ >> " s"
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> lit(event_type) >> ": " >> int_
                >> (eol|eoi);
        }

      private:
        qi::rule<It, Attribute()> start;
    };

    template <typename It>
    struct LogParser : qi::grammar<It, MyEvents::LogEvents()> {
        LogParser() : LogParser::base_type(start) {
            using namespace qi;
            using boost::spirit::repository::qi::seek;

            event = sequence | clutch | gear ; // TODO add types
            start = *seek[event];
        }

      private:
        qi::rule<It, MyEvents::LogEvents()> start;
        qi::rule<It, MyEvents::LogEvent()> event;
        BaseEventParser<It, MyEvents::Sequence> sequence{"SEQUENCE"};
        BaseEventParser<It, MyEvents::Clutch>   clutch{"CLUTCH"};
        BaseEventParser<It, MyEvents::Gear>     gear{"GEAR"};
    };
}

MyEvents::LogEvents parse_spirit(It b, It e) {
    static QiParsers::LogParser<It> const parser {};

    MyEvents::LogEvents events;
    parse(b, e, parser, events);

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

namespace MyEvents { // for debug/demo
    using boost::fusion::operator<<;
    static inline char const* kind(Sequence const&) { return "SEQUENCE"; }
    static inline char const* kind(Clutch   const&) { return "CLUTCH"; }
    static inline char const* kind(Gear     const&) { return "GEAR"; }

    struct KindVisitor : boost::static_visitor<char const*> {
        template <typename T> char const* operator()(T const& ev) const { return kind(ev); }
    };
    static inline char const* kind(LogEvent const& ev) {
        return boost::apply_visitor(KindVisitor{}, ev);
    }
}

int main() {
    auto events = parse_spirit(input, input+len);
    std::cout << "Parsed: " << events.size() << " events\n";

    for (auto& e : events)
        std::cout << kind(e) << ": " << e << "\n"; 
}

Generalize: Common Fields and Other Events

Especially if you then continue on to generalize:

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <cstring> // strlen

using It = char const*;

namespace MyEvents {
    enum class Kind { Sequence, Clutch, Gear, Slope, Other };

    struct CommonFields {
        boost::string_view date;
        double duration;
    };

    struct BaseEvent {
        CommonFields common;
        int driver;
        int event_id;
        double vel;
        double km;
        boost::string_view road;
        Kind kind;
    };

    struct OtherEvent {
        CommonFields common;
        std::string message;
    };

    using LogEvent = boost::variant<BaseEvent, OtherEvent>;
    using LogEvents = std::vector<LogEvent>;
}

BOOST_FUSION_ADAPT_STRUCT(MyEvents::CommonFields, date, duration)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::BaseEvent, common, driver, vel, road, km, kind, event_id)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::OtherEvent, common, message)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

namespace QiParsers {
    template <typename It>
    struct LogParser : qi::grammar<It, MyEvents::LogEvents()> {
        using Kind = MyEvents::Kind;

        LogParser() : LogParser::base_type(start) {
            using namespace qi;

            kind.add
                ("SEQUENCE", Kind::Sequence)
                ("CLUTCH", Kind::Clutch)
                ("GEAR", Kind::Gear)
                ("SLOPE", Kind::Slope)
                ;

            common_fields
                = '[' >> raw[
                        repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                        repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit
                ] >> "]"
                >> " - " >> double_ >> " s";

            base_event
                = common_fields
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> kind >> ": " >> int_;

            other_event
                = common_fields
                >> " => " >> *~char_("\r\n");

            event 
                = (base_event | other_event) 
                >> (eol|eoi);

            start = *boost::spirit::repository::qi::seek[event];
        }

      private:
        qi::rule<It, MyEvents::LogEvents()> start;
        qi::rule<It, MyEvents::LogEvent()> event;

        qi::rule<It, MyEvents::CommonFields()> common_fields;
        qi::rule<It, MyEvents::BaseEvent()> base_event;
        qi::rule<It, MyEvents::OtherEvent()> other_event;

        qi::symbols<char, MyEvents::Kind> kind;
    };
}

MyEvents::LogEvents parse_spirit(It b, It e) {
    static QiParsers::LogParser<It> const parser {};

    MyEvents::LogEvents events;
    parse(b, e, parser, events);

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

namespace MyEvents { // for debug/demo
    using boost::fusion::operator<<;

    static inline Kind getKind(BaseEvent const& be) { return be.kind; }
    static inline Kind getKind(OtherEvent const&) { return Kind::Other; }

    struct KindVisitor : boost::static_visitor<Kind> {
        template <typename T> Kind operator()(T const& ev) const { return getKind(ev); }
    };
    static inline Kind getKind(LogEvent const& ev) {
        return boost::apply_visitor(KindVisitor{}, ev);
    }

    static inline std::ostream& operator<<(std::ostream& os, Kind k) {
        switch(k) {
            case Kind::Sequence: return os << "SEQUENCE";
            case Kind::Clutch:   return os << "CLUTCH";
            case Kind::Gear:     return os << "GEAR";
            case Kind::Slope:    return os << "SLOPE";
            case Kind::Other:    return os << "(Other)";
        }
        return os;
    }
}

int main() {
    auto events = parse_spirit(input, input+len);
    std::cout << "Parsed: " << events.size() << " events\n";

    for (auto& e : events)
        std::cout << getKind(e) << ": " << e << "\n"; 
}

Prints e.g.

Parsed: 37 events
SLOPE: ((2018-Mar-13 13:13:59.580482 0.2) 0 0 A-11 90 SLOPE 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0 A-11 90 GEAR 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0.1 A-11 90 GEAR 1)
SEQUENCE: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.1 A-11 90 SEQUENCE 1)
CLUTCH: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.2 A-11 90 CLUTCH 1)
(Other): ((2018-Mar-13 13:14:01.819966 2.54) Backup to regestry)
[...]

BONUS: Multi-Index

If you use multi-index containers you can have your cake and eat it, too.

Here's a sample definition that allows you to index the vector according to some rather arbitrarily chosen features:

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/composite_key.hpp>
#include <boost/multi_index/global_fun.hpp>

namespace Indexing {
    namespace bmi = boost::multi_index;

    using MyEvents::LogEvent;

    double getDuration(LogEvent const& ev) { return getCommon(ev).duration; }

    using Table = bmi::multi_index_container<
        std::reference_wrapper<LogEvent const>, //LogEvent,
        bmi::indexed_by<
            bmi::ordered_non_unique<
                bmi::tag<struct primary>,
                bmi::composite_key<
                    LogEvent,
                    bmi::global_fun<LogEvent const&, MyEvents::Kind, MyEvents::getKind>,
                    bmi::global_fun<LogEvent const&, int,            MyEvents::getEventId>
                >
            >,
            bmi::ordered_non_unique<
                bmi::tag<struct duration>,
                bmi::global_fun<LogEvent const&, double, getDuration>
            >
        >
    >;
}

Now you can do interesting things like:

Indexing::Table idx(events.begin(), events.end());

/*
 * // To print all events, grouped by kind and event id:
 * for (MyEvents::LogEvent const& e : idx)
 *     std::cout << getKind(e) << ": " << e << "\n"; 
 *
 * // Ordered by duration:
 * for (MyEvents::LogEvent const& e : idx.get<Indexing::duration>())
 *     std::cout << getKind(e) << ": " << e << "\n"; 
 */

std::cout << "\nAll GEAR events ordered by event id:\n";
for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Gear))))
    std::cout << getKind(e) << ": " << e << "\n"; 

std::cout << "\nOnly the SLOPE events with id 10:\n";
for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Slope, 10))))
    std::cout << getKind(e) << ": " << e << "\n"; 

std::cout << "\nEvents with durations in [2s..3s):\n";
auto& by_dur = idx.get<Indexing::duration>();

for (MyEvents::LogEvent const& e : make_iterator_range(by_dur.lower_bound(2), by_dur.upper_bound(3)))
    std::cout << getKind(e) << ": " << e << "\n";

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <cstring> // strlen

using It = char const*;

namespace MyEvents {
    enum class Kind { Sequence, Clutch, Gear, Slope, Other };

    struct CommonFields {
        boost::string_view date;
        double duration;
    };

    struct BaseEvent {
        CommonFields common;
        int driver;
        int event_id;
        double vel;
        double km;
        boost::string_view road;
        Kind kind;
    };

    struct OtherEvent {
        CommonFields common;
        std::string message;
    };

    using LogEvent = boost::variant<BaseEvent, OtherEvent>;
    using LogEvents = std::vector<LogEvent>;
}

BOOST_FUSION_ADAPT_STRUCT(MyEvents::CommonFields, date, duration)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::BaseEvent, common, driver, vel, road, km, kind, event_id)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::OtherEvent, common, message)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

namespace QiParsers {
    template <typename It>
    struct LogParser : qi::grammar<It, MyEvents::LogEvents()> {
        using Kind = MyEvents::Kind;

        LogParser() : LogParser::base_type(start) {
            using namespace qi;

            kind.add
                ("SEQUENCE", Kind::Sequence)
                ("CLUTCH", Kind::Clutch)
                ("GEAR", Kind::Gear)
                ("SLOPE", Kind::Slope)
                ;

            common_fields
                = '[' >> raw[
                        repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                        repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit
                ] >> "]"
                >> " - " >> double_ >> " s";

            base_event
                = common_fields
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> kind >> ": " >> int_;

            other_event
                = common_fields
                >> " => " >> *~char_("\r\n");

            event 
                = (base_event | other_event) 
                >> (eol|eoi);

            start = *boost::spirit::repository::qi::seek[event];
        }

      private:
        qi::rule<It, MyEvents::LogEvents()> start;
        qi::rule<It, MyEvents::LogEvent()> event;

        qi::rule<It, MyEvents::CommonFields()> common_fields;
        qi::rule<It, MyEvents::BaseEvent()> base_event;
        qi::rule<It, MyEvents::OtherEvent()> other_event;

        qi::symbols<char, MyEvents::Kind> kind;
    };
}

MyEvents::LogEvents parse_spirit(It b, It e) {
    static QiParsers::LogParser<It> const parser {};

    MyEvents::LogEvents events;
    parse(b, e, parser, events);

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

namespace MyEvents { // for debug/demo
    using boost::fusion::operator<<;

    static inline CommonFields const& getCommon(BaseEvent const& be) { return be.common; }
    static inline CommonFields const& getCommon(OtherEvent const& oe) { return oe.common; }
    static inline Kind getKind(BaseEvent const& be) { return be.kind; }
    static inline Kind getKind(OtherEvent const&) { return Kind::Other; }
    static inline int getEventId(BaseEvent const& be) { return be.event_id; }
    static inline int getEventId(OtherEvent const&) { return 0; }

#define IMPL_DISPATCH(name, T)                                                                     \
    struct name##Visitor : boost::static_visitor<T> {                                              \
        template <typename E> T operator()(E const &ev) const { return name(ev); }                 \
    };                                                                                             \
    static inline T name(LogEvent const &ev) { return boost::apply_visitor(name##Visitor{}, ev); }

    IMPL_DISPATCH(getCommon, CommonFields const&)
    IMPL_DISPATCH(getKind, Kind)
    IMPL_DISPATCH(getEventId, int)

    static inline std::ostream& operator<<(std::ostream& os, Kind k) {
        switch(k) {
            case Kind::Sequence: return os << "SEQUENCE";
            case Kind::Clutch:   return os << "CLUTCH";
            case Kind::Gear:     return os << "GEAR";
            case Kind::Slope:    return os << "SLOPE";
            case Kind::Other:    return os << "(Other)";
        }
        return os;
    }
}

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/composite_key.hpp>
#include <boost/multi_index/global_fun.hpp>

namespace Indexing {
    namespace bmi = boost::multi_index;

    using MyEvents::LogEvent;

    double getDuration(LogEvent const& ev) { return getCommon(ev).duration; }

    using Table = bmi::multi_index_container<
        std::reference_wrapper<LogEvent const>, //LogEvent,
        bmi::indexed_by<
            bmi::ordered_non_unique<
                bmi::tag<struct primary>,
                bmi::composite_key<
                    LogEvent,
                    bmi::global_fun<LogEvent const&, MyEvents::Kind, MyEvents::getKind>,
                    bmi::global_fun<LogEvent const&, int,            MyEvents::getEventId>
                >
            >,
            bmi::ordered_non_unique<
                bmi::tag<struct duration>,
                bmi::global_fun<LogEvent const&, double, getDuration>
            >
        >
    >;
}

using boost::make_iterator_range;
using boost::make_tuple;

int main() {
    using MyEvents::LogEvent;
    using MyEvents::Kind;

    auto events = parse_spirit(input, input+len);
    std::cout << "Parsed: " << events.size() << " events\n";

    Indexing::Table idx(events.begin(), events.end());

    /*
     * // To print all events, grouped by kind and event id:
     * for (MyEvents::LogEvent const& e : idx)
     *     std::cout << getKind(e) << ": " << e << "\n"; 
     *
     * // Ordered by duration:
     * for (MyEvents::LogEvent const& e : idx.get<Indexing::duration>())
     *     std::cout << getKind(e) << ": " << e << "\n"; 
     */

    std::cout << "\nAll GEAR events ordered by event id:\n";
    for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Gear))))
        std::cout << getKind(e) << ": " << e << "\n"; 

    std::cout << "\nOnly the SLOPE events with id 10:\n";
    for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Slope, 10))))
        std::cout << getKind(e) << ": " << e << "\n"; 

    std::cout << "\nEvents with durations in [2s..3s):\n";
    auto& by_dur = idx.get<Indexing::duration>();

    for (MyEvents::LogEvent const& e : make_iterator_range(by_dur.lower_bound(2), by_dur.upper_bound(3)))
        std::cout << getKind(e) << ": " << e << "\n"; 
}

Prints:

Parsed: 37 events

All GEAR events ordered by event id:
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0 A-11 90 GEAR 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0 A-11 90 GEAR 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0.1 A-11 90 GEAR 1)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0.1 A-11 90 GEAR 1)
GEAR: ((2018-Mar-13 13:14:03.250451 3.87) 0 1.2 B-302 90.2 GEAR 2)
GEAR: ((2018-Mar-13 13:14:03.250451 3.87) 0 1.2 B-302 90.2 GEAR 2)
GEAR: ((2018-Mar-13 13:14:04.510025 5.13) 0 4.9 B-302 91.1 GEAR 3)

Only the SLOPE events with id 10:
SLOPE: ((2018-Mar-13 13:14:04.300160 4.92) 0 4.2 B-302 90.9 SLOPE 10)
SLOPE: ((2018-Mar-13 13:14:04.300160 4.92) 0 4.2 B-302 90.9 SLOPE 10)

Events with durations in [2s..3s):
SEQUENCE: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.1 A-11 90 SEQUENCE 1)
CLUTCH: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.2 A-11 90 CLUTCH 1)
SEQUENCE: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.1 A-11 90 SEQUENCE 1)
CLUTCH: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.2 A-11 90 CLUTCH 1)
(Other): ((2018-Mar-13 13:14:01.819966 2.54) Backup to regestry)
(Other): ((2018-Mar-13 13:14:01.819966 2.54) Backup to regestry)

**sehe** · Accepted Answer · 2018-03-14T22:58:36.390000

I see roughly two areas for improvement:

- You basically parse all lines, including the ones that don't interest you.
- You allocate a lot of strings.

I'd suggest using string views to fix the allocations. Next, you could try to avoid parsing lines that don't match the SEQUENCE pattern. There's no reason in principle why this couldn't be done using Boost Xpressive, but my weapon of choice happens to be Boost Spirit, so I'll include it too.
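To illustrate why string views help with allocations, here is a minimal sketch that splits a buffer into lines without copying any characters. It uses `std::string_view` (C++17) as a stand-in for `boost::string_view`, so it will not build with VS 2010 as written, and `split_lines` is a hypothetical helper that does not appear in the benchmark code.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Splits `buf` into per-line views. Each element points into the original
// buffer, so no line text is copied; only the vector itself allocates.
std::vector<std::string_view> split_lines(std::string_view buf) {
    std::vector<std::string_view> lines;
    while (!buf.empty()) {
        std::size_t const nl = buf.find('\n');
        lines.push_back(buf.substr(0, nl)); // npos means "rest of the buffer"
        if (nl == std::string_view::npos)
            break;
        buf.remove_prefix(nl + 1);          // continue after the newline
    }
    return lines;
}
```

The views stay valid only as long as the underlying buffer does, which is the trade-off for skipping one `std::string` allocation per field.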

Being Selective

You can detect interesting lines before spending more effort like this:

cregex signature = -*~_n >> " - SEQUENCE: " >> (+_d) >> before(_ln|eos); 
for (xp::cregex_iterator cur(b, e, signature), end; cur != end; ++cur) {
    std::cout << "'" << cur->str() << "'\n";
}

This prints

'[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1'
'[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4'
'[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8'
'[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15'
'[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21'
'[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29'
'[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34'
'[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45'

Nothing is allocated. This should be pretty fast.

Reducing Allocations

For this I'm going to switch to Spirit because it will make things easier.

Note: The real reason I switched here is because, in contrast to Boost Spirit, Xpressive does not appear to have extensible attribute propagation traits. This could be my lack of experience with it.

The alternative approach would almost certainly replace the actions with manual propagation code, which in turn would call for named capture groups in order to keep things legible. I'm not sure about the performance overhead of these, so let's not use them at this point.

You can use boost::string_view with a trait to "teach" Qi to assign text to it:

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

That way, the Qi grammar could look just like this:

template <typename It> struct QiParser : qi::grammar<It, Sequence()> {
    QiParser() : QiParser::base_type(line) {
        using namespace qi;
        auto date_time = copy(
            repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
            repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

        line = '[' >> raw[date_time] >> "] - "
            >> double_ >> " s"
            >> " => Driver: "  >> int_
            >> " - Speed: "    >> double_
            >> " - Road: "     >> raw[+graph]
            >> " - Km: "       >> double_
            >> " - SEQUENCE: " >> int_
            >> (eol|eoi);
    }
  private:
    qi::rule<It, Sequence()> line;
};

Using it is exceedingly simple, especially when not being "selective".

This happens to be the "winning" configuration. The standalone, simplified version of that algorithm, with all benchmark-related generics and options removed, is listed under Summary/Conclusions.

Benchmark Results: Surprises

Using the selective parsing approach only made the Xpressive approach slower: about 314 μs per run versus 156 μs for the linear version (means from the sample text report below).

Comparing to Spirit, I had initially started with the selective approach as well (fully anticipating it to be faster). Here are the not-so-encouraging results: about 270 μs per run with either string type (means from the sample text report below).

Oops. The initial Xpressive approach is still superior!

Adjusting The Assumptions

Okay, clearly doing the shallow scan first, and then the "full parse" hurts the performance. Theorizing, this is likely down to cache/prefetch effects. Also, the linear approach may win because it's easier to spot when a line doesn't start with a '[' character, than to see whether it ends with the SEQUENCE pattern.
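The asymmetry in that second hypothesis can be pictured with a small standard-library sketch. This is not the benchmark code and it proves nothing about cache effects; it only shows that rejecting on the first character is a single comparison, while the " - SEQUENCE: " check needs the whole line to be located first. It assumes C++17 (std::string_view), and the names may_be_record, ends_with_sequence and count_lines are made up for illustration.

```cpp
#include <cstddef>
#include <string_view>

// Cheap rejection: a record line must start with '['; one character decides.
inline bool may_be_record(std::string_view line) {
    return !line.empty() && line.front() == '[';
}

// Selective pre-scan: the end of the line has to be known before its tail
// can be inspected for the " - SEQUENCE: <digits>" marker.
inline bool ends_with_sequence(std::string_view line) {
    std::size_t end = line.size();
    std::size_t digits = 0;
    while (end > 0 && line[end - 1] >= '0' && line[end - 1] <= '9') {
        --end;
        ++digits;
    }
    constexpr std::string_view marker = " - SEQUENCE: ";
    return digits > 0 && end >= marker.size()
        && line.substr(end - marker.size(), marker.size()) == marker;
}

// Counts the lines accepted by a predicate, splitting the input on '\n'.
template <typename Pred>
std::size_t count_lines(std::string_view text, Pred accept) {
    std::size_t n = 0;
    while (!text.empty()) {
        std::size_t eol = text.find('\n');
        std::string_view line = text.substr(0, eol);
        if (accept(line)) ++n;
        if (eol == std::string_view::npos) break;
        text.remove_prefix(eol + 1);
    }
    return n;
}
```

Calling count_lines(text, may_be_record) and count_lines(text, ends_with_sequence) on the same buffer gives the number of candidate lines under each rule.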

So I decided to adapt the Spirit approaches to linear mode too, and see whether the win from reducing allocations is still worth it.

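Now we're getting results. Let's look at the difference between the std::string and boost::string_view approaches in detail: in the sample text report below, the linear Spirit parser averages about 21.3 μs with std::string and about 14.7 μs with boost::string_view.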
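What separates the two attribute types is ownership. Here is a minimal standard-library sketch of that difference (not part of the benchmark). It assumes C++17 and uses std::string_view as a stand-in for boost::string_view; the helper names copy_field, view_field and points_into are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Owning copy: constructing a std::string from a range copies the characters
// and may allocate once the text no longer fits the small-string buffer.
inline std::string copy_field(char const* first, char const* last) {
    assert(first <= last);
    return std::string(first, last);
}

// Non-owning view: only a pointer and a length into the original buffer.
// The buffer must outlive every view that refers to it.
inline std::string_view view_field(char const* first, char const* last) {
    assert(first <= last);
    return std::string_view(first, static_cast<std::size_t>(last - first));
}

// True when the view refers to memory inside [buf, buf + len).
inline bool points_into(std::string_view v, char const* buf, std::size_t len) {
    return v.data() >= buf && v.data() + v.size() <= buf + len;
}
```

The view compares equal to the copy, but its data() still points at the input buffer. That is why the string_view-based Sequence records are only valid while the input stays alive.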

Summary/Conclusions

The reduced allocations are good for roughly 30% less run time. In total, that is an improvement of about 10 times over the original Xpressive approach.

Note that the benchmark code goes out of its way to eliminate unfair differences between the implementations (e.g. by precompiling everything on both Spirit and Xpressive). See the full benchmark code below.

The winning implementation in isolation: Live on Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <cstring>  // strlen
#include <iostream> // std::cout
#include <iterator> // std::distance
#include <vector>

using It = char const*;

struct Sequence {
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    boost::string_view date;
    boost::string_view road;
};

BOOST_FUSION_ADAPT_STRUCT(::Sequence, date, time, driver, vel, road, km, sequence)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

std::vector<Sequence> parse_spirit(It b, It e) {

    qi::rule<It, Sequence()> static const line = []{
        using namespace qi;
        auto date_time = copy(
            repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
            repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

        qi::rule<It, Sequence()> r = '[' >> raw[date_time] >> "] - "
            >> double_ >> " s"
            >> " => Driver: "  >> int_
            >> " - Speed: "    >> double_
            >> " - Road: "     >> raw[+graph]
            >> " - Km: "       >> double_
            >> " - SEQUENCE: " >> int_
            >> (eol|eoi);

        return r;
    }();

    std::vector<Sequence> sequences;

    parse(b, e, *boost::spirit::repository::qi::seek[line], sequences);

    return sequences;
}

static char input[] = /*... see question ...*/;
static const size_t len = strlen(input);

int main() {
    auto sequences = parse_spirit(input, input+len);
    std::cout << "Parsed: " << sequences.size() << " sequence lines\n";
}

Full Benchmark Code

The benchmarks use Nonius for the measurements and statistical analysis.

Full interactive graphs here: http://stackoverflow-sehe.s3.amazonaws.com/9f88e055-4b5f-4026-8f2f-54e2bcad430d/stats.html
Compile with -DUSE_NONIUS if you have Nonius available
Compile with -DVERIFY_OUTPUT for "correctness" mode: in this case no timings are done but the results of the parse are echoed for validation

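#include <cstring>    // strlen
#include <functional> // std::function (mock Nonius)
#include <string>
#include <utility>    // std::forward, std::move
#include <vector>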

static char input[] = 
"[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - SLOPE: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - GEAR: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - GEAR: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - CLUTCH: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.540 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8\n\
[2018-Mar-13 13:14:01.819966] - 3.110 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15\n\
[2018-Mar-13 13:14:02.829983] - 3.450 s => Driver: 0 - Speed: 0.6 - Road: B-302 - Km: 90.1 - SLOPE: -5\n\
[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21\n\
[2018-Mar-13 13:14:03.250451] - 3.870 s => Driver: 0 - Speed: 1.2 - Road: B-302 - Km: 90.2 - GEAR: 2\n\
[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29\n\
[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34\n\
[2018-Mar-13 13:14:03.880066] - 4.500 s => Driver: 0 - Speed: 2.8 - Road: B-302 - Km: 90.5 - CLUTCH: 1\n\
[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45\n\
[2018-Mar-13 13:14:04.300160] - 4.920 s => Driver: 0 - Speed: 4.2 - Road: B-302 - Km: 90.9 - SLOPE: 10\n\
[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - SLOPE: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - GEAR: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - GEAR: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - CLUTCH: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.540 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8\n\
[2018-Mar-13 13:14:01.819966] - 3.110 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15\n\
[2018-Mar-13 13:14:02.829983] - 3.450 s => Driver: 0 - Speed: 0.6 - Road: B-302 - Km: 90.1 - SLOPE: -5\n\
[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21\n\
[2018-Mar-13 13:14:03.250451] - 3.870 s => Driver: 0 - Speed: 1.2 - Road: B-302 - Km: 90.2 - GEAR: 2\n\
[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29\n\
[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34\n\
[2018-Mar-13 13:14:03.880066] - 4.500 s => Driver: 0 - Speed: 2.8 - Road: B-302 - Km: 90.5 - CLUTCH: 1\n\
[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45\n\
[2018-Mar-13 13:14:04.300160] - 4.920 s => Driver: 0 - Speed: 4.2 - Road: B-302 - Km: 90.9 - SLOPE: 10\n\
[2018-Mar-13 13:14:04.510025] - 5.130 s => Driver: 0 - Speed: 4.9 - Road: B-302 - Km: 91.1 - GEAR: 3";
static const size_t len = strlen(input);

#include <boost/utility/string_view.hpp>
#include <boost/fusion/adapted/struct.hpp>

template <typename String> struct Sequence {
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    String date;
    String road;
};

BOOST_FUSION_ADAPT_TPL_STRUCT((T),(Sequence)(T), date, time, driver, vel, road, km, sequence)

// Declare implementations under test:
using It = char const*;
template <typename S> std::vector<S> parse_xpressive_linear(It b, It e);
template <typename S> std::vector<S> parse_xpressive_selective(It b, It e);
template <typename S> std::vector<S> parse_spirit_linear(It b, It e);
template <typename S> std::vector<S> parse_spirit_selective(It b, It e);

#ifdef VERIFY_OUTPUT
    #include <boost/fusion/include/io.hpp>
    using boost::fusion::operator<<;
    #include <iostream>

    #define VERIFY()                                                                    \
        do {                                                                            \
            std::cout << "L:" << __LINE__ << " Parsed: " << sequences.size() << "\n";   \
            for (auto r : sequences) {                                                  \
                std::cout << r << "\n";                                                 \
            }                                                                           \
        } while (0)
#else
    #define VERIFY() do { } while (0)
#endif

#ifdef USE_NONIUS
    #include <nonius/benchmark.h++>
    #define NONIUS_RUNNER
    #include <nonius/main.h++>
#else
    // mock nonius
    namespace nonius {
        struct chronometer{
            template <typename F> static inline void measure(F&& f) { std::forward<F>(f)(); }
        };
        static std::vector<std::function<void(chronometer)>> s_benchmarks;
        #define TOKENPASTE(x, y) x ## y
        #define TOKENPASTE2(x, y) TOKENPASTE(x, y)
        #define NONIUS_BENCHMARK(name, f) static auto TOKENPASTE2(s_reg_, __LINE__) = []{ ::nonius::s_benchmarks.push_back(f); return 42; }();

        void run() { for (auto& b : s_benchmarks) b({}); }
    }

    int main() {
        nonius::run();
    }
#endif

template <typename R>
void do_test_kernel(nonius::chronometer& cm, std::vector<R> (*f)(It, It)) {
    std::vector<R> sequences;
    cm.measure([&sequences,f]{ sequences = f(input, input + len); });
    VERIFY();
}

#define TEST_CASE(name, string) NONIUS_BENCHMARK(#name"-"#string, [](nonius::chronometer cm) { do_test_kernel(cm, &name<Sequence<string> >); })
// Xpressive doesn't support string_view
TEST_CASE(parse_xpressive_linear,    std::string)
TEST_CASE(parse_xpressive_selective, std::string)

TEST_CASE(parse_spirit_linear,       std::string)
TEST_CASE(parse_spirit_linear,       boost::string_view)
TEST_CASE(parse_spirit_selective,    std::string)
TEST_CASE(parse_spirit_selective,    boost::string_view)

#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>

namespace xp = boost::xpressive;

namespace XpressiveDetail {
    using namespace xp;

    struct Scanner {
        cregex scan {-*~xp::_n >> " - SEQUENCE: " >> (+xp::_d) >> xp::_ln};
    };

    template <typename Seq> struct Parser : Scanner {
        mutable Seq seq; // non-thread-safe, but fairer to compare to Spirit

        cregex real    = (+_d >> '.' >> +_d);
        cregex keyword = " - SEQUENCE: " >> (+_d)[xp::ref(seq.sequence) = as<int>(_)];
        cregex date    = repeat<4>(_d) >> '-' 
            >> repeat<3>(alpha) >> '-' 
            >> repeat<2>(_d) 
            >> _s 
            >> repeat<2>(_d) >> ':' 
            >> repeat<2>(_d) >> ':' 
            >> repeat<2>(_d)
            >> '.' >> (+_d);

        cregex header = '[' >> date[xp::ref(seq.date) = _] >> "] - "
            >> real[xp::ref(seq.time) = as<double>(_)]
            >> " s => Driver: " >> (+_d)             [ xp ::ref(seq.driver) = as<int>(_) ]
            >> " - Speed: "     >> real              [ xp ::ref(seq.vel)    = as<double>(_) ]
            >> " - Road: "      >> (+set[alnum|'-']) [ xp ::ref(seq.road)   = _ ]
            >> " - Km: "        >> real              [ xp ::ref(seq.km)     = as<double>(_) ];

        cregex parser = (header >> keyword >> _ln);
    };
}

template <typename Seq>
std::vector<Seq> parse_xpressive_linear(It b, It e) {
    std::vector<Seq> sequences;
    using namespace xp;

    static const XpressiveDetail::Parser<Seq> precompiled{};

    for (xp::cregex_iterator cur(b, e, precompiled.parser), end; cur != end; ++cur)
        sequences.push_back(std::move(precompiled.seq));

    return sequences;
}

template <typename Seq>
std::vector<Seq> parse_xpressive_selective(It b, It e) {
    std::vector<Seq> sequences;
    using namespace xp;

    static const XpressiveDetail::Parser<Seq> precompiled{};
    xp::match_results<It> m;

    for (auto& match : boost::make_iterator_range(xp::cregex_iterator{b, e, precompiled.scan}, {})) {
        if (xp::regex_match(match[0].first, match[0].second, m, precompiled.parser))
            sequences.push_back(std::move(precompiled.seq));
    }

    return sequences;
}

//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

template <typename It, typename Attribute> struct QiParser : qi::grammar<It, Attribute()> {
    QiParser() : QiParser::base_type(line) {
        using namespace qi;
        auto date_time = copy(
            repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
            repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

        line = '[' >> eps(clear(_val)) >> raw[date_time] >> "] - "
            >> double_ >> " s"
            >> " => Driver: "  >> int_
            >> " - Speed: "    >> double_
            >> " - Road: "     >> raw[+graph]
            >> " - Km: "       >> double_
            >> " - SEQUENCE: " >> int_
            >> (eol|eoi);

        BOOST_SPIRIT_DEBUG_NODES((line))
    }
  private:
    struct clear_f {
        // only required for the linear approach with std::string-based attributes
        bool operator()(Sequence<std::string>& v)      const { v = {};      return true; }
        bool operator()(Sequence<boost::string_view>&) const { /*no_op();*/ return true; }
    };
    boost::phoenix::function<clear_f> clear;

    qi::rule<It, Attribute()> line;
};

template <typename Seq = Sequence<std::string> >
std::vector<Seq> parse_spirit_selective(It b, It e) {
    static QiParser<It, Seq> const qi_parser{};
    static XpressiveDetail::Scanner const precompiled{};

    std::vector<Seq> sequences;

    for (auto& match : boost::make_iterator_range(xp::cregex_iterator{b, e, precompiled.scan}, {})) {
        Seq r;
        if (parse(match[0].first, match[0].second, qi_parser, r))
            sequences.push_back(r);
    }

    return sequences;
}

#include <boost/spirit/repository/include/qi_seek.hpp>

template <typename Seq = Sequence<std::string> >
std::vector<Seq> parse_spirit_linear(It b, It e) {
    using boost::spirit::repository::qi::seek;

    static QiParser<It, Seq> const qi_parser{};

    std::vector<Seq> sequences;
    parse(b, e, *seek[qi_parser], sequences);
    return sequences;
}

Sample text report:

clock resolution: mean is 17.7534 ns (40960002 iterations)

benchmarking parse_xpressive_linear-std::string
collecting 100 samples, 1 iterations each, in estimated 15.7252 ms
mean: 156.418 μs, lb 155.863 μs, ub 158.24 μs, ci 0.95
std dev: 4.62848 μs, lb 1637.89 ns, ub 10.4043 μs, ci 0.95
found 4 outliers among 100 samples (4%)
variance is moderately inflated by outliers

benchmarking parse_xpressive_selective-std::string
collecting 100 samples, 1 iterations each, in estimated 31.5459 ms
mean: 313.992 μs, lb 313.39 μs, ub 315.599 μs, ci 0.95
std dev: 4.5415 μs, lb 1105.98 ns, ub 9.07809 μs, ci 0.95
found 11 outliers among 100 samples (11%)
variance is slightly inflated by outliers

benchmarking parse_spirit_linear-std::string
collecting 100 samples, 1 iterations each, in estimated 2.1556 ms
mean: 21.2533 μs, lb 21.1623 μs, ub 21.6854 μs, ci 0.95
std dev: 870.481 ns, lb 53.2809 ns, ub 2.0738 μs, ci 0.95
found 7 outliers among 100 samples (7%)
variance is moderately inflated by outliers

benchmarking parse_spirit_linear-boost::string_view
collecting 100 samples, 2 iterations each, in estimated 2.944 ms
mean: 14.6677 μs, lb 14.6342 μs, ub 14.8279 μs, ci 0.95
std dev: 318.252 ns, lb 22.5097 ns, ub 757.555 ns, ci 0.95
found 5 outliers among 100 samples (5%)
variance is moderately inflated by outliers

benchmarking parse_spirit_selective-std::string
collecting 100 samples, 1 iterations each, in estimated 27.5512 ms
mean: 273.052 μs, lb 272.77 μs, ub 273.952 μs, ci 0.95
std dev: 2.31473 μs, lb 835.184 ns, ub 5.1322 μs, ci 0.95
found 10 outliers among 100 samples (10%)
variance is unaffected by outliers

benchmarking parse_spirit_selective-boost::string_view
collecting 100 samples, 1 iterations each, in estimated 27.0766 ms
mean: 269.446 μs, lb 269.208 μs, ub 270.268 μs, ci 0.95
std dev: 2.01634 μs, lb 627.834 ns, ub 4.56949 μs, ci 0.95
found 10 outliers among 100 samples (10%)
variance is unaffected by outliers
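For anyone who wants to check the headline figures against this report, here is a small sketch of the arithmetic. The four constants are the mean values copied from the report above; the helper names speedup and time_saved are made up for illustration.

```cpp
// Mean run times in microseconds, copied from the sample text report.
constexpr double xpressive_linear    = 156.418;
constexpr double xpressive_selective = 313.992;
constexpr double spirit_linear_str   = 21.2533;
constexpr double spirit_linear_view  = 14.6677;

// How many times faster `fast` is than `slow`.
constexpr double speedup(double slow, double fast) { return slow / fast; }

// Fraction of run time saved when going from `before` to `after`.
constexpr double time_saved(double before, double after) {
    return 1.0 - after / before;
}

// Spirit (linear, string_view) versus the original linear Xpressive parser.
constexpr double total_gain = speedup(xpressive_linear, spirit_linear_view);

// Effect of dropping the std::string allocations in the linear Spirit parser.
constexpr double alloc_gain = time_saved(spirit_linear_str, spirit_linear_view);

// Cost of the selective pre-scan on the Xpressive side.
constexpr double selective_penalty = speedup(xpressive_selective, xpressive_linear);
```

With these means, total_gain comes out a little above 10, alloc_gain at roughly 0.31 and selective_penalty at roughly 2. Other machines and inputs will give different absolute numbers.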
