Extract email from string using Template Toolkit

Asked by Dario Zadro On 22 March 2021 at 21:49

I'm guessing this is relatively simple, but I can't find the answer.

From a string such as '"John Doe" <[email protected]>', how can I extract the email portion using Template Toolkit?

An example string to parse is this:

$VAR1 = { 
    'date' => '2021-03-25',
    'time' => '03:58:18',
    'href' => 'https://example.com',
    'from' => '[email protected] on behalf of Caroline <[email protected]>',
    'bytes' => 13620,
    'pmail' => '[email protected]',
    'sender' => '[email protected]',
    'subject' => 'Some Email Subject'
};

My code, based on @dave-cross's help below, where $VAR1 is the output of dumper.dump(item.from):

[% text = item.from -%]
[% IF (matches = text.match('(.*?)(\s)?+<(.*?)>')) -%]
<td>[% matches.1 %]</td>
[% ELSE -%]
<td>[% text %]</td>
[% END %]

However, it's still not matching against $VAR1.

There are 3 best solutions below

choroba On 22 March 2021 at 22:13

I have no idea how Template Toolkit can help you. Use Email::Address or Email::Address::XS to parse an e-mail address.
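As a rough sketch of that suggestion, the snippet below parses a name-plus-address string with Email::Address from CPAN. The address john.doe@example.com is a placeholder (the real addresses in the question are obscured), and the variable names are only illustrative.

```perl
use strict;
use warnings;
use Email::Address;

# Placeholder input; the addresses in the question are obscured.
my $from = '"John Doe" <john.doe@example.com>';

# parse() returns a list of Email::Address objects, one per address found.
my @addresses = Email::Address->parse($from);

for my $addr (@addresses) {
    print "Name:  ", $addr->name,    "\n";
    print "Email: ", $addr->address, "\n";
}
```

The value from `address` can then be handed to the template as an ordinary variable, so the template itself never has to parse anything.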

brian d foy On 23 March 2021 at 14:19

There's a very old (and unmaintained) module, Template::Extract, that lets you define a template, then work backward from a string that might have been produced by that template:

use Template::Extract;
use Data::Dumper;

my $obj = Template::Extract->new;
my $template = qq("[% name %]" <[% email %]>);

my $string = '"John Doe" <[email protected]>';

my $extracted = $obj->extract($template, $string);

print Dumper( $extracted );

The output is:

$VAR1 = {
          'email' => '[email protected]',
          'name' => 'John Doe'
        };

However, there are modules that already do this job for you and will handle many more situations.

Dave Cross (accepted answer) On 24 March 2021 at 18:50

This does what you want, but it's pretty fragile and this really isn't the kind of thing that you should be doing in TT code. You should either get the data parsed outside of the template and passed into variables, or you should pass in a parsing subroutine that can be called from inside the template.
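A minimal sketch of the first option (parsing outside the template), assuming each record is a hash with a `from` key as in the question. The helper name `extract_email`, the `from_email` key, and the placeholder addresses are illustrative and not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside the first <...> pair,
# or the whole string unchanged when there are no angle brackets.
sub extract_email {
    my ($from) = @_;
    return $from =~ /<([^<>]+)>/ ? $1 : $from;
}

# Placeholder records shaped like the data in the question.
my @items = (
    { from => '"John Doe" <john.doe@example.com>' },
    { from => 'sender@example.com on behalf of Caroline <caroline@example.com>' },
    { from => 'plain@example.com' },
);

# Add a ready-to-print field before the data reaches the template.
$_->{from_email} = extract_email($_->{from}) for @items;

print "$_->{from_email}\n" for @items;
```

The template then only needs `<td>[% item.from_email %]</td>`. For the second option, adding `extract_email => \&extract_email` to the hash of variables passed to `process` should let the template call `[% extract_email(item.from) %]` instead.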

But, having given you the caveats, if you still insist this is what you want to do, then this is how you might do it:

In test.tt:

[% text = '"John Doe" <[email protected]>';
   matches = text.match('"(.*?)"\s+<(.*?)>');
   IF matches -%]
Name: [% matches.0 %]
Email: [% matches.1 %]
[% ELSE -%]
No match found
[% END -%]

Then, testing using tpage:

$ tpage test.tt
Name: John Doe
Email: [email protected]

But I cannot emphasise enough that you should not be doing it like this.

Update: I've used this test template to investigate your further problem.

[% item = { from => '"John Doe" <[email protected]>' };
   text = item.from -%]
[% IF (matches = text.match('(.*?)(\s)?+<(.*?)>')) -%]
<td>[% matches.1 %]</td>
[% ELSE -%]
<td>[% text %]</td>
[% END %]

And running it, I get this:

$ tpage test2.tt
<td> </td>

That's what I'd expect to see for a match. You're printing matches.1. That's the second item from the matches array. And the second match group is (\s). So I'm getting the space between the name and the opening angle bracket.

You probably don't want that whitespace match in your matches array, so I'd remove the parentheses around it, to make the regex (.*?)\s*<(.*?)> (note that \s* is a simpler way to say "zero or more whitespace characters").

You can now use matches.0 to get the name and matches.1 to get the email address.
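The effect of dropping the parentheses can be checked in plain Perl, on the assumption that the match vmethod returns the captured groups in the same order as a list-context regex match. The addresses are placeholders and the variable names are illustrative.

```perl
use strict;
use warnings;

# Placeholder input shaped like the 'from' field in the question.
my $from = 'sender@example.com on behalf of Caroline <caroline@example.com>';

# Three capture groups: name, optional whitespace, address.
my @three = $from =~ /(.*?)(\s)?<(.*?)>/;

# Two capture groups: the whitespace is matched but not captured.
my @two = $from =~ /(.*?)\s*<(.*?)>/;

printf "three groups: address is at index %d\n", $#three;
printf "two groups:   address is at index %d\n", $#two;
print  "address: $two[1]\n";
```

With the second pattern the address sits at index 1, which corresponds to matches.1 in the template.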

Oh, and there's no need to copy item.from into text. You can call the match vmethod on any scalar variable, so it's probably simpler to just use:

[% matches = item.from.match(...) -%]

Did I mention that this is all a really terrible idea? :-)

Update2:

This is all going to be far easier if you give me complete, runnable code examples in the same way that I am doing for you. Any time I have to edit something in order to get an example running, we run the risk that I'm guessing incorrectly how your code works.

But, bearing that in mind, here's my latest test template:

[% item = {
    'date' => '2021-03-25',
    'time' => '03:58:18',
    'href' => 'https://example.com',
    'from' => '[email protected] on behalf of Caroline <[email protected]>',
    'bytes' => 13620,
    'pmail' => '[email protected]',
    'sender' => '[email protected]',
    'subject' => 'Some Email Subject'
};
   text = item.from -%]
[% IF (matches = text.match('(.*?)(\s)?<(.*?)>')) -%]
<td>[% matches.2 %]</td>
[% ELSE -%]
<td>[% text %]</td>
[% END %]

I've changed the definition of item to have your full example. I've left the regex as it was before my suggestions. And (because I haven't changed the regex) I've changed the output to print matches.2 instead of matches.1.

And here's what happens:

$ tpage test3.tt
<td>[email protected]</td>

So it works.

If yours doesn't work, then you need to identify the differences between my (working) code and your (non-working) code. I'm happy to help you identify those differences, but you have to give me your non-working example in order for me to do that.

Update3:

Again I've tried to incorporate the changes that you're talking about. But again, I've had to guess at stuff because you're not sharing complete runnable examples. And again, my code works as expected.

[% USE dumper -%]
[% item = {
    'date' => '2021-03-25',
    'time' => '03:58:18',
    'href' => 'https://example.com',
    'from' => '[email protected] on behalf of Caroline <[email protected]>',
    'bytes' => 13620,
    'pmail' => '[email protected]',
    'sender' => '[email protected]',
    'subject' => 'Some Email Subject'
};
 -%]
[% matches = item.from.match('(.*?)(\s)?<(.*?)>') -%]
[% dumper.dump(matches) %]

And testing it:

$ tpage test4.tt
$VAR1 = [
          '[email protected] on behalf of Caroline',
          ' ',
          '[email protected]'
        ];

So that works. If you want any more help, then send a complete runnable example. If you don't do that, I won't be able to help you any more.
