I am confused about how to fetch the result after running the function regex_search from std::tr1::regex. The following sample code demonstrates my issue.
string source = "abcd 16000 ";
string exp = "abcd ([^\\s]+)";
std::tr1::cmatch res;
std::tr1::regex rx(exp);
while(std::tr1::regex_search(source.c_str(), res, rx, std::tr1::regex_constants::match_continuous))
{
//HOW TO FETCH THE RESULT???????????
std::cout <<" "<< res.str()<<endl;
source = res.suffix().str();
}
The regular expression mentioned should ideally strip off the "abcd" from the string and return me 16000.
I see that the cmatch res has TWO objects. The second object contains the expected result (this object has three members, matched, first and second, and the values are {true, "16000", " "}).
My question is: what does the size of this object denote? Why is it showing 2 in this specific case (res[0] and res[1]) when I have run regex_search only once? And how do I know which object holds the expected result?
Thanks Sunil
As stated here:
This means match[0] should - in this case! - hold essentially your full source (abcd 16000, without the trailing space) as you match the whole thing, while match[1] contains the content of your capturing group.
If there was, for example, a second capturing group in your regex, you'd get a third object in the match collection, and so on.
I'm a guy who understands visualized problems/solutions better, so let's do this:
See the demo at regex101.
See the two colors in the textfield containing the teststring?
The green color is the background for your capturing group while the
blue color represents everything else generally matched by the expression, but not captured by any group.
In other words: blue+green is the equivalent of match[0], and green of match[1], in your case.
This way you can always know which of the objects in match refers to which capturing group:
Initialize a counter in your head, starting at 0. Now go through the regex from left to right and add 1 for each opening ( of a capturing group, until you reach the opening bracket of the capturing group you want to extract; closing ) brackets do not change the count. The number in your head is the array index.
EDIT
Regarding your comment on checking res[0].first:
The member first of the sub_match class only denotes the position of the start of the match, while second denotes the position of the end of the match (taken from the Boost documentation).
Both are a char* (VC++10) or an iterator (Boost), thus you get a substring of the source string as the output (which may be the full source in case the match starts at index zero!).
Consider the following program (VC++10):
Execute it and look at the output. The first (and only) match is - obviously - the first two chars ab, so this is actually the whole matched string and the reason why res[0] == "ab".
Now, knowing that .first/.second give us substrings from the start of the match and from the end of the match onwards, the output shouldn't be confusing anymore.