Finding tokens in a Smalltalk String with PetitParser


I want to parse

'This,is,an,example,text'

like in findTokens

'This,is,an,example,text' findTokens: $, 
an OrderedCollection('This' 'is' 'an' 'example' 'text')

but I cannot figure out how to do it with PetitParser. delimitedBy: and separatedBy: didn't help me. I tried

( #any asParser delimitedBy: $, asParser ) plus flatten parse:  'This,is,an,example,text'

but obviously that did not work.


Best answer (1 vote)

I use this pattern all the time with PetitParser when I want to exclude something. Just define either "what I'm looking for" or "what I want to exclude" (whichever's easier to describe) as a parser, and then negate it, and process as necessary.

s := 'This,is,an,example,text'.
separator := $, asParser ==> [ :n | nil ].
token := separator negate plus flatten.
p := (token separatedBy: separator) ==> [ :nodes |
    nodes copyWithout: nil ].
p parse: s.
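
The same exclude-and-negate pattern extends to more than one separator. The following is only a sketch under an assumption that is not part of the question: that a semicolon should also end a token. The variable names and the input string are illustrative.

```smalltalk
| separator token parser result |
"Assumption: either a comma or a semicolon ends a token."
separator := ($, asParser / $; asParser) ==> [ :n | nil ].
"A token is one or more characters that are not a separator."
token := separator negate plus flatten.
"separatedBy: interleaves tokens and separators; drop the nil separators."
parser := (token separatedBy: separator) ==> [ :nodes |
    nodes copyWithout: nil ].
result := parser parse: 'This,is;an,example'.
```

Only the definition of separator changes; the token and list parsers are the same as above.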

Answer (0 votes)

a delimitedBy: b expands to roughly a , (b , a) star (plus an optional trailing b), so your parser as written says "give me single characters separated by commas".

It's not very readable, but this does what you want:

(($, asParser not , #any asParser) ==> [:nodes | nodes second])
  plus flatten delimitedBy: $, asParser

The first clause says "parse anything that isn't a comma". So given '12,24' you get #('12' $, '24').
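
Since the separators stay in the result, one more step is needed to get a plain list of tokens like findTokens: returns. A minimal sketch, assuming it is acceptable to filter them out after parsing (the variable names are illustrative):

```smalltalk
| parser result tokens |
parser := (($, asParser not , #any asParser) ==> [ :nodes | nodes second ])
    plus flatten delimitedBy: $, asParser.
result := parser parse: '12,24'.
"delimitedBy: keeps the separators, so filter out the comma characters."
tokens := result reject: [ :each | each = $, ].
```

The filter could also be attached to the parser itself with ==>, so that parse: returns the tokens directly.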

Answer (2 votes)

Try

(#word asParser plus flatten separatedBy: $, asParser) 
     ==> [:nodes| nodes copyWithout: $, ]

I hope I understood what you wanted.

Answer (0 votes)

You can use delimitedBy: in combination with withoutSeparators, like this:

|text parser|

text := 'This,is,an,example,text'.
parser := (#word asParser plus flatten delimitedBy: ($, asParser)) withoutSeparators.

parser parse: text

withoutSeparators seems to be a recent addition to PetitParser, so older versions may not have it.