I'm trying to parse iCalendar (RFC2445) input using a regex.
Here's a [simplified] example of what the input looks like:
BEGIN:VEVENT
abc:123
def:456
END:VEVENT
BEGIN:VEVENT
ghi:789
END:VEVENT
I'd like to get an array of matches: the "outer" match is each VEVENT block and the inner matches are each of the field:value pairs.
I've tried variants of this:
BEGIN:VEVENT\n((?<field>(?<name>\S+):\s*(?<value>\S+)\n)+?)END:VEVENT
But given the input above, the result seems to have only ONE field for each matching VEVENT, despite the +? on the capture group:
**Match 1**
field   def:456
name    def
value   456
**Match 2**
field   ghi:789
name    ghi
value   789
In the first match, I would have expected TWO fields: the abc:123 and the def:456 matches...
I'm sure this is a newbie mistake (since I seem to perpetually be a newbie when it comes to regexes...), but maybe you can point me in the right direction?
Thanks!
                        
You need to split your regex into one that matches a `VEVENT` block and one that matches the name/value pairs. You can then use nested `scan` calls on your input string (`str`) to find all occurrences.
If you want to make the code more readable, I suggest you `require 'English'` and replace `$~` with `$LAST_MATCH_INFO`.
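The single regex in the question reports only one field per event because a repeated capture group keeps only its last iteration. Splitting the work across two patterns avoids that. Here is a minimal sketch of the nested `scan` approach, assuming Ruby and the sample input from the question. The names `event_re`, `field_re`, and `events`, and the choice to collect one hash per event, are illustrative assumptions rather than the only way to do it.

```ruby
# Sample input copied from the question.
str = <<~ICS
  BEGIN:VEVENT
  abc:123
  def:456
  END:VEVENT
  BEGIN:VEVENT
  ghi:789
  END:VEVENT
ICS

# Outer pattern: one VEVENT block. Group 1 is everything between the markers.
# The /m flag lets "." match newlines in Ruby.
event_re = /BEGIN:VEVENT\n(.*?)END:VEVENT/m

# Inner pattern: a single name:value line ("^" and "$" are per-line in Ruby).
field_re = /^(?<name>[^:\s]+):\s*(?<value>\S+)$/

events = []
str.scan(event_re) do
  # Copy the block body first: the inner scan overwrites $~.
  body = $~[1]
  fields = {}
  body.scan(field_re) do
    fields[$~[:name]] = $~[:value]
  end
  events << fields
end

p events
```

With the sample input, `events` ends up with two hashes: the first holds both `abc` and `def`, the second holds `ghi`. After `require 'English'`, each `$~` here can be written as `$LAST_MATCH_INFO`.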