I have the following sample text:
zip 20193
New York
USA
What I would like to do is match only "New York", i.e., the line after the zip code.
I tried the following rule, but it does not work:
DECLARE heading; pin BREAK #{-> MARK(heading)} BREAK;
(I have declared pin before this).
Please let me know how to go about this.
Thanks!
The problem is probably the filtering setting. BREAK is not visible by default, so a rule that refers to it explicitly will never match: Ruta automatically skips the line breaks.
Try to add another rule changing the filtering setting in front of your rule:
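A minimal sketch of such a rule, assuming the RETAINTYPE action is used to make BREAK visible, and that `pin` and `heading` are declared as in the question:

```
// Make BREAK visible to the rules that follow (it is filtered by default).
Document{-> RETAINTYPE(BREAK)};
// Original rule from the question, unchanged.
pin BREAK #{-> MARK(heading)} BREAK;
// Optional: calling RETAINTYPE without arguments restores the default filtering.
Document{-> RETAINTYPE};
```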
There could be another problem, because BREAK represents both \n and \r. Thus, the rule would not work for Windows line endings (\r\n), where two BREAK tokens follow each line. You would need something like:
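One way to sketch this, assuming each of \r and \n yields its own BREAK token and BREAK has been made visible with RETAINTYPE, is a min/max quantifier on the first BREAK:

```
Document{-> RETAINTYPE(BREAK)};
// BREAK[1,2] accepts one line break token (\n) or two (\r\n) after the pin.
pin BREAK[1,2] #{-> MARK(heading)} BREAK;
```

The wildcard stops at the next BREAK, so the trailing element does not need a quantifier.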
Ruta also ships a utility analysis engine for annotating lines: PlainTextAnnotator. If you include it, you can write something like:
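A rough sketch, assuming the engine and its type system are imported as `utils.PlainTextAnnotator` and `utils.PlainTextTypeSystem`, and that `pin` and `heading` are declared as in the question:

```
ENGINE utils.PlainTextAnnotator;
TYPESYSTEM utils.PlainTextTypeSystem;

// Run the additional engine and keep only its Line annotations.
Document{-> EXEC(PlainTextAnnotator, {Line})};
// With default filtering the line break is skipped, so the Line following the pin matches.
pin Line{-> MARK(heading)};
```

This variant does not mention BREAK in the rule at all, so it should not depend on the kind of line ending.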
(You may need to trim the Line annotations, e.g., with the TRIM action, if the lines start or end with whitespace.)
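For instance, assuming the whitespace in question consists of SPACE tokens, the trimming could be sketched as:

```
// Shrink each Line so that it neither starts nor ends with a SPACE token.
Line{-> TRIM(SPACE)};
```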
DISCLAIMER: I am a developer of UIMA Ruta