How to ignore the trailing slash in a preg_replace

Say I have 2 variations of a link that get posted on my site...

(the difference is the trailing /)

Once posts are submitted on my site they turn into hyperlinks that look like the following:

<a href="https://vine.co/v/iF20jKHvnqg" target="_blank">https://vine.co/v/iF20jKHvnqg</a>

I have set up a preg_replace to capture Vine links and convert them into embeds (the post message would obviously contain more than this, but this is for example's sake):

$this->post['message'] = '<a href="https://vine.co/v/iF20jKHvnqg" target="_blank">https://vine.co/v/iF20jKHvnqg</a>';

$drc_embed_vine =  '<iframe src="https://vine.co/v/\2/embed/simple" width="480" height="480" frameborder="0"></iframe>';

$this->post['message'] = preg_replace('~(<a href="https?://vine.co)/v/(.*)" target="_blank">(https?://vine.co)/v/(.*)<\/a>~', $drc_embed_vine, $this->post['message']);

I use the wildcard (.*) which I thought meant 'ANYTHING' but for some reason if a link is posted with the trailing slash it doesn't get converted...

I have tried changing my regex (just a couple of examples I've tried) to

~(<a href="https?://vine.co)/v/(.*)/" target="_blank">(https?://vine.co)/v/(.*)/<\/a>~

which then converts the link with the trailing slash and ignores the one without.

~(<a href="https?://vine.co)/v/(.*)/?" target="_blank">(https?://vine.co)/v/(.*)/?<\/a>~

Here I figured the ? I already use for the https check might do the same job, but it changed nothing.

Then I realised the ? wasn't inside the capture group, so I tried it like

~(<a href="https?://vine.co)/v/(.*/?)" target="_blank">(https?://vine.co)/v/(.*/?)<\/a>~

But still no luck.

How can I make my replace not care whether there is a trailing slash or not?

There are 3 best solutions below

6
On

If you only need this very specific replacement, you can just concatenate strings.

$message = rtrim($post['message'], '/');
$message = sprintf('<iframe src="%s/embed/simple" width="480" height="480" frameborder="0"></iframe>', $message);

Or if you really want to use preg_replace:

$pattern = '~https?://vine\.co/v/([^/]+)~'; // dot escaped so it only matches a literal "."
$this->post['message'] = preg_replace($pattern, $drc_embed_vine, $this->post['message']);

Your pattern needs to match the input string ($this->post['message']). The captured group ($1) is then inserted into the replacement string, so with this single-group pattern the replacement template must reference \1 (or $1) rather than \2.

To not care about the trailing slash, just assume the video ID never contains a slash: it looks alphanumeric ([a-zA-Z0-9]). ([^/]+) takes every character up to, but not including, the trailing slash. You could also use ([a-z0-9]+) with the i modifier.

You built your pattern around the final string and are trying to match it against the input string.
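
To make the difference concrete, here is a minimal sketch comparing a greedy (.*) capture with ([^/]+) on both URL variants. The helper name captureVineId is hypothetical, and the sketch assumes the ID never contains a slash.

```php
<?php
// Hypothetical helper: returns the captured video ID, or null if nothing matched.
function captureVineId(string $pattern, string $url): ?string
{
    return preg_match($pattern, $url, $m) === 1 ? $m[1] : null;
}

$urls = [
    'https://vine.co/v/iF20jKHvnqg',
    'https://vine.co/v/iF20jKHvnqg/',
];

// (.*) is greedy, so it consumes the trailing slash before /? gets a chance.
$greedy = '~^https?://vine\.co/v/(.*)/?$~';
// [^/]+ cannot match a slash, so only /? can consume it.
$negated = '~^https?://vine\.co/v/([^/]+)/?$~';

foreach ($urls as $url) {
    printf(
        "%s\n  greedy:  %s\n  negated: %s\n",
        $url,
        captureVineId($greedy, $url),
        captureVineId($negated, $url)
    );
}
```

With the greedy pattern the second URL is still matched, but the slash ends up inside the capture, which would give //embed/simple in the iframe src. The negated class returns the bare ID for both variants.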

This script:

<?php
$message = 'https://vine.co/v/iF20jKHvnqg/';

$drc_embed_vine = '<iframe src="https://vine.co/v/\1/embed/simple" width="480" height="480" frameborder="0"></iframe>';

$pattern = '~https?://vine\.co/v/([^/]+)/?~';

echo preg_replace($pattern, $drc_embed_vine, $message);

produces this:

<iframe src="https://vine.co/v/iF20jKHvnqg/embed/simple" width="480" height="480" frameborder="0"></iframe>

EDIT

Based on your comment, here is a new pattern that matches the hyperlink generated from the submitted URL:

$pattern = '~^(<[^>]+>)https?://vine\.co/v/([^/]+)/?(</a>)$~';

This pattern can match <a href="https://vine.co/v/iF20jKHvnqg" target="_blank">https://vine.co/v/iF20jKHvnqg</a>.

The replace string changes slightly:

'<iframe src="https://vine.co/v/$2/embed/simple" width="480" height="480" frameborder="0"></iframe>'

So I have this test script, which replaces a link like the one you mention with the iframe:

<?php

$message = '<a href="https://vine.co/v/iF20jKHvnqg" target="_blank">https://vine.co/v/iF20jKHvnqg</a>';

$drc_embed_vine = '<iframe src="https://vine.co/v/$2/embed/simple" width="480" height="480" frameborder="0"></iframe>';

$pattern = '~^(<[^>]+>)https?://vine\.co/v/([^/]+)/?(</a>)$~';

echo preg_replace($pattern, $drc_embed_vine, $message);
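
Since the question says a real post message contains more than just the link, the ^ and $ anchors above would stop the pattern from matching there. Below is a rough sketch of an unanchored variant that handles links in the middle of a message, with or without the trailing slash. The surrounding text and the second ID (abc123) are made up for illustration, and the pattern assumes the href is the first attribute of the <a> tag, as in the question.

```php
<?php
// Sample message: two generated links embedded in other text (abc123 is a made-up ID).
$message = 'Check this: <a href="https://vine.co/v/iF20jKHvnqg/" target="_blank">https://vine.co/v/iF20jKHvnqg/</a>'
    . ' and <a href="https://vine.co/v/abc123" target="_blank">https://vine.co/v/abc123</a>!';

$drc_embed_vine = '<iframe src="https://vine.co/v/$1/embed/simple" width="480" height="480" frameborder="0"></iframe>';

// ([^/"]+) captures the ID and cannot run past a slash or the closing quote;
// /? then swallows an optional trailing slash; [^<]* skips the link text.
$pattern = '~<a href="https?://vine\.co/v/([^/"]+)/?"[^>]*>[^<]*</a>~';

$result = preg_replace($pattern, $drc_embed_vine, $message);

echo $result;
```

Both links should come out as iframes while the text around them is left alone.
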
0
On

Here's a parser example:

$string = '<a href="https://vine.co/v/iF20jKHvnqg" target="_blank">https://vine.co/v/iF20jKHvnqg</a>';
$doc = new DOMDocument();
$doc->loadHTML($string);
$links = $doc->getElementsByTagName('a');
foreach($links as $link) {
    if(preg_match('~^https?://vine\.co/v/([^/]+)~', $link->getAttribute('href'), $url)){
        echo '<iframe src="https://vine.co/v/' . $url[1] . '/embed/simple" width="480" height="480" frameborder="0"></iframe>';
    }
}

Output:

<iframe src="https://vine.co/v/iF20jKHvnqg/embed/simple" width="480" height="480" frameborder="0"></iframe>

Demo: https://eval.in/569642
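
The example above only echoes the iframe. If the link has to be swapped inside a longer message, one possible extension is to replace the node in the DOM and serialise the fragment back. This is only a sketch: the helper name embedVineLinks and the wrapper <div> trick are my own assumptions, and it assumes ASCII input (loadHTML needs extra care with other encodings).

```php
<?php
// Hypothetical helper: swaps each Vine <a> for an <iframe> and returns the HTML fragment.
function embedVineLinks(string $html): string
{
    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so it can be serialised back without <html>/<body>.
    // The @ hides libxml warnings about imperfect user-submitted markup.
    @$doc->loadHTML('<div>' . $html . '</div>');
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    // Copy the live node list first; replacing nodes while iterating it skips entries.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $link) {
        if (preg_match('~^https?://vine\.co/v/([^/]+)~', $link->getAttribute('href'), $url)) {
            $iframe = $doc->createElement('iframe');
            $iframe->setAttribute('src', 'https://vine.co/v/' . $url[1] . '/embed/simple');
            $iframe->setAttribute('width', '480');
            $iframe->setAttribute('height', '480');
            $iframe->setAttribute('frameborder', '0');
            $link->parentNode->replaceChild($iframe, $link);
        }
    }

    // Serialise only the children of the wrapper <div>.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo embedVineLinks('Look: <a href="https://vine.co/v/iF20jKHvnqg/" target="_blank">https://vine.co/v/iF20jKHvnqg/</a>');
```

Because the ID is read from the href attribute with ([^/]+), the trailing slash never reaches the iframe src.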

0
On

This was answered in another question I asked. It doesn't ignore the trailing slash but simply removes it altogether.

$this->post['message'] = preg_replace('+/(["<])+', '$1', $this->post['message']);

rtrim cannot work here, since the / is not the last character of the string.
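
For completeness, a small sketch of how this slash-stripping step could be combined with the pattern from the question. The sample message is made up, and the dots in vine.co are escaped here, which the original pattern did not do.

```php
<?php
$message = 'See <a href="https://vine.co/v/iF20jKHvnqg/" target="_blank">https://vine.co/v/iF20jKHvnqg/</a>';

// Step 1: drop any slash that sits directly before a double quote or a "<".
// "+" is the pattern delimiter here, not a quantifier.
$normalised = preg_replace('+/(["<])+', '$1', $message);

// Step 2: with the trailing slash gone, the question's pattern captures the bare ID in \2.
$drc_embed_vine = '<iframe src="https://vine.co/v/\2/embed/simple" width="480" height="480" frameborder="0"></iframe>';
$pattern = '~(<a href="https?://vine\.co)/v/(.*)" target="_blank">(https?://vine\.co)/v/(.*)</a>~';

$result = preg_replace($pattern, $drc_embed_vine, $normalised);

echo $result;
```

One thing to keep in mind: step 1 is not limited to Vine links. It removes every slash that directly precedes a " or a < anywhere in the message, so something like and/<b>or</b> would lose its slash too.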