"Xpressive leak" fixed, but not understood


I know, Xpressive is (probably) not at fault here, but I've put a lot of effort into finding the memory leaks, and I had to change the code layout to stop the haemorrhage.

Can someone explain to me why the change in layout fixed it? I don't see why the (correct/improved) use of "static const" fixes the leaks.

BTW, the leaks were occurring on a MIPS core, using Boost version 1.49, cross-compiled with GCC 4.3.3.

Original "sieve" code:

// source.cpp
#include <boost/...

cregex token = keep(+set[ alnum|'!'|'%'|'_'|'*'|'+'|'.'|'\''|'`'|'~'|'-']);
cregex more  = ...

bool foo(const char * begin, const char * end, size_t& length, std::string& dest)
{
    mark_tag name_t(1);
    cregex regx = bos >>
        icase("name:") >>
        (name_t= token) >> eos;

    cmatch what;
    bool ok = regex_search( begin, end, what, regx );
    ...
    return ok;
}

Fixed "non-leaky" code:

// header.hpp
#include <boost/...

class Xpr {
public:
    static const cregex token;
    static const cregex more;
};

// source.cpp
#include "header.hpp"

const cregex Xpr::token = keep(+set[ alnum|'!'|'%'|'_'|'*'|'+'|'.'|'\''|'`'|'~'|'-']);
const cregex Xpr::more  = ...

bool foo(const char * begin, const char * end, size_t& length, std::string& dest)
{
    mark_tag name_t(1);
    static const cregex regx = bos >>
        icase("name:") >>
        (name_t= Xpr::token) >> eos;

    cmatch what;
    bool ok = regex_search( begin, end, what, regx );
    ...
    return ok;
}

The leaks seemed to be occurring upon every call of foo!


EDIT: After writing the below response, I tried to reproduce your problem and was unable to. Here is the code I'm using.

#include <boost/xpressive/xpressive.hpp>
using namespace boost;
using namespace xpressive;

cregex token = keep(+set[ alnum|'!'|'%'|'_'|'*'|'+'|'.'|'\''|'`'|'~'|'-']);
//cregex more  = ...

bool foo(const char * begin, const char * end)
{
    mark_tag name_t(1);
    cregex regx = bos >>
        icase("name:") >>
        (name_t= token) >> eos;

    cmatch what;
    bool ok = regex_search( begin, end, what, regx );
    //...
    return ok;
}

int main()
{
    char const buf[] = "name:value";
    while(true)
        foo(buf, buf + sizeof(buf) - 1);
}

This code doesn't leak. Is it possible you're using an earlier version of xpressive? Could you post a complete, self-contained example so I can investigate? Even better, file a bug and attach the code there. Thanks,

Eric

-----Begin Original Response-----

I suspect that you're running afoul of xpressive's cycle tracking code. See here for a warning against nesting global regex objects in function-local ones. I think what's happening is that, in order to prevent dangling references, the function-local regx must hold a reference to token, and token must hold a (weak) reference back to regx. What is growing is token's map of weak references. This isn't a leak in the strict technical sense, since the memory will get reclaimed when token gets destroyed. But it's obviously not ideal.

I fixed a similar "leak" in xpressive a while back by adding an opportunistic purge of the map to clear out weak references to expired regexes. I have to look into why it's not happening in this case. Please file a bug here. Thanks.

In the meantime, your fix is plenty good. Declaring the function-local regx static means it will only get constructed once, so token's weak-reference map will never grow beyond size 1.
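To make the mechanism above concrete, here is a minimal toy model of it, assuming C++11 and using only the standard library. It is not xpressive's actual implementation: ToyRegex, make_outer, purge_expired and the dependents_after_* helpers are hypothetical names invented for illustration. A long-lived "token" keeps a weak back-reference to every regex that nests it, so building a fresh function-local regex on each call grows that list, while building it once (as a static local would) keeps it at size 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a regex object. Hypothetical; NOT xpressive's internals.
struct ToyRegex {
    // Weak back-references to every regex that has nested this one.
    std::vector<std::weak_ptr<ToyRegex>> dependents;
    // Strong references to the regexes this one nests.
    std::vector<std::shared_ptr<ToyRegex>> nested;
};

// Build an outer regex that nests `inner`. The outer regex keeps `inner`
// alive; `inner` records a weak reference back to the outer regex.
std::shared_ptr<ToyRegex> make_outer(const std::shared_ptr<ToyRegex>& inner) {
    assert(inner && "inner regex must not be null");
    auto outer = std::make_shared<ToyRegex>();
    outer->nested.push_back(inner);
    inner->dependents.push_back(outer);
    return outer;
}

// Opportunistic purge: drop weak references whose target no longer exists.
void purge_expired(ToyRegex& r) {
    auto& d = r.dependents;
    d.erase(std::remove_if(d.begin(), d.end(),
                           [](const std::weak_ptr<ToyRegex>& w) { return w.expired(); }),
            d.end());
}

// Models the original code: a new function-local regex on every call.
std::size_t dependents_after_local(std::size_t calls) {
    auto token = std::make_shared<ToyRegex>();
    for (std::size_t i = 0; i < calls; ++i) {
        auto regx = make_outer(token);  // destroyed at the end of each iteration
    }
    return token->dependents.size();    // expired entries are still counted
}

// Same as above, but purging expired entries before each registration.
std::size_t dependents_after_local_with_purge(std::size_t calls) {
    auto token = std::make_shared<ToyRegex>();
    for (std::size_t i = 0; i < calls; ++i) {
        purge_expired(*token);
        auto regx = make_outer(token);
    }
    return token->dependents.size();
}

// Models the fix: the outer regex is built once and then reused.
std::size_t dependents_after_static(std::size_t calls) {
    auto token = std::make_shared<ToyRegex>();
    std::shared_ptr<ToyRegex> regx;     // stands in for the static local
    for (std::size_t i = 0; i < calls; ++i) {
        if (!regx) regx = make_outer(token);
    }
    return token->dependents.size();
}
```

In this model, dependents_after_local(n) grows linearly with n, whereas the purging and build-once variants stay bounded. Nothing is lost for good, since the entries go away when token is destroyed, which matches the "not a leak in the strict technical sense" point above.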