I have the following text in my dataset:
[1] "q negociação c/v tipo mercado prazo especificação do título obs (*) quantidade preço / ajuste valor operação / ajuste d/c 1-bovespa c fracionario magaz luiza on eb nm # 1 25,76 25,76 d 1-bovespa c fracionario magaz luiza on eb nm # 9 25,76 231,84 d 1-bovespa c fracionario magaz luiza on eb nm 40 25,76 1030,40 d 1-bovespa c fracionario mrv on ed nm 40 18,14 725,60 d resumo dos negócios"
I would like to extract every piece of text that falls between two patterns, specifically between "1-bovespa" and "d". Currently I use str_extract together with the readtext package, but it returns only the first match. I would like the command to scan the whole text and, each time it finds the pattern again, add the match to a data frame.
I'm trying something like this:
str_extract_all(out, "\\(1-bovespa).+?\\d")
Here's a different approach that uses the repeated pattern as a delimiter. It's a bit hacky, but seems to work:
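A minimal sketch of the delimiter idea in base R, assuming the text is stored in a character vector named out (the name used in the question's attempt). It splits on the literal "1-bovespa" marker, drops the header that precedes the first marker, and cuts each piece at the standalone "d" that closes the row. It illustrates the idea and is not necessarily the exact code originally posted.

```r
# Sample text from the question, pasted back together with single spaces.
out <- paste(
  "q negociação c/v tipo mercado prazo especificação do título obs (*) quantidade preço / ajuste valor operação / ajuste d/c",
  "1-bovespa c fracionario magaz luiza on eb nm # 1 25,76 25,76 d",
  "1-bovespa c fracionario magaz luiza on eb nm # 9 25,76 231,84 d",
  "1-bovespa c fracionario magaz luiza on eb nm 40 25,76 1030,40 d",
  "1-bovespa c fracionario mrv on ed nm 40 18,14 725,60 d",
  "resumo dos negócios"
)

# Split on the literal marker; the first piece is the header, so drop it.
pieces <- strsplit(out, "1-bovespa", fixed = TRUE)[[1]][-1]

# Remove the closing standalone "d" and anything after it, then trim spaces.
rows <- trimws(sub("\\sd(\\s.*)?$", "", pieces, perl = TRUE))

print(rows)
```

Requiring whitespace (or the end of the string) after the "d" keeps words that merely contain or start with a "d" from being treated as the end of a row.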
With the result:
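To go from the matched text to a data frame, here is a sketch in base R. It assumes the text is stored in out, that every row ends with quantity, price and value, and that numbers use a decimal comma. The column names (cv, market, title, quantity, price, value) and the helper to_num are made up for illustration. A lookaround regex is used because the pattern in the question cannot match as written: \\( looks for a literal parenthesis, and \\d matches a digit rather than the letter d.

```r
# Sample text from the question, pasted back together with single spaces.
out <- paste(
  "q negociação c/v tipo mercado prazo especificação do título obs (*) quantidade preço / ajuste valor operação / ajuste d/c",
  "1-bovespa c fracionario magaz luiza on eb nm # 1 25,76 25,76 d",
  "1-bovespa c fracionario magaz luiza on eb nm # 9 25,76 231,84 d",
  "1-bovespa c fracionario magaz luiza on eb nm 40 25,76 1030,40 d",
  "1-bovespa c fracionario mrv on ed nm 40 18,14 725,60 d",
  "resumo dos negócios"
)

# Everything after "1-bovespa " up to (not including) a standalone "d".
m <- regmatches(
  out,
  gregexpr("(?<=1-bovespa\\s).+?(?=\\sd(\\s|$))", out, perl = TRUE)
)[[1]]

# Hypothetical helper: "1.030,40" or "1030,40" -> 1030.4
to_num <- function(x) {
  as.numeric(sub(",", ".", gsub(".", "", x, fixed = TRUE), fixed = TRUE))
}

# Capture: c/v flag, market type, title (including any "#" note),
# then the three trailing numeric fields.
fields <- regmatches(
  m,
  regexec("^(\\S+) (\\S+) (.+?) (\\d+) ([0-9.,]+) ([0-9.,]+)$", m, perl = TRUE)
)

# One data frame row per match, stacked with rbind.
df <- do.call(rbind, lapply(fields, function(f) {
  data.frame(
    cv = f[2], market = f[3], title = f[4],
    quantity = as.integer(f[5]),
    price = to_num(f[6]), value = to_num(f[7]),
    stringsAsFactors = FALSE
  )
}))

print(df)
```

For the sample text this gives one row per "1-bovespa" entry, four in total, with the "#" from the obs column left attached to the title.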