I was using str_replace to rewrite PDF URLs from https://example.com/documents/en/whatever.pdf to https://example.com/documents/es/whatever_SPANISH.pdf
This is what I was using:
if($_COOKIE['googtrans'] == "/en/es") { //Check the google translate cookie
$text = str_replace('/documents/en/', '/documents/es/', $text);
$text = str_replace('.pdf', '_SPANISH.pdf', $text);
}
The problem is that if the page contains a link to a PDF on another site (not my own website), for example https://othersite.example.com/whatever.pdf, it becomes https://othersite.example.com/whatever_SPANISH.pdf, which isn't valid on other people's sites. I want to ignore offsite links and only change URLs on my site.
So what I would like to do is look for the string https://example.com/documents/en/whateverfilename.pdf, pull that file name out, and change it to https://example.com/documents/es/whateverfilename_SPANISH.pdf (switching the en to es and also appending _SPANISH to the end of the PDF filename).
How can I do this? I have tried various preg_replace calls but can't get the syntax right.
You could do the replacement in one go using a regex, reusing two capture groups in the replacement.
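As a minimal sketch, the pattern below is reassembled from the parts explained in this answer, and the replacement string follows from the URL change the question asks for. The function name rewritePdfLinks and the sample $text are made up for illustration.

```php
<?php
// Hypothetical helper: rewrites /documents/en/<name>.pdf to /documents/es/<name>_SPANISH.pdf.
function rewritePdfLinks(string $text): string
{
    // Group 1: protocol and everything up to and including /documents/
    // Group 2: the path after "en", without the .pdf extension
    $pattern = '~\b(https?://\S*?/documents/)en(/\S*)\.pdf\b~';

    // preg_replace() returns null on error; fall back to the unchanged text.
    return preg_replace($pattern, '$1es$2_SPANISH.pdf', $text) ?? $text;
}

// Sample page content, assumed for illustration.
$text = 'Read https://example.com/documents/en/whatever.pdf for details.';

// Same cookie check as in the question; ?? avoids a notice when the cookie is not set.
if (($_COOKIE['googtrans'] ?? '') === '/en/es') {
    $text = rewritePdfLinks($text);
}
```

Because the pattern requires /documents/en/ in the path, a link such as https://othersite.example.com/whatever.pdf is left alone. A URL on any host that happens to contain /documents/en/ would still be rewritten.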
Or match the domain name:
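A possible version that pins the match to one host is sketched here. The helper name rewriteOwnPdfLinks and its $host parameter are assumptions for illustration; preg_quote() escapes the dot in the host name so it is matched literally.

```php
<?php
// Hypothetical helper: only rewrites links whose host is exactly $host.
function rewriteOwnPdfLinks(string $text, string $host = 'example.com'): string
{
    // Escape the host for use inside a pattern delimited by ~
    $pattern = '~\b(https?://' . preg_quote($host, '~') . '/documents/)en(/\S*)\.pdf\b~';

    // preg_replace() returns null on error; fall back to the unchanged text.
    return preg_replace($pattern, '$1es$2_SPANISH.pdf', $text) ?? $text;
}
```

With the host spelled out, a link like https://othersite.example.com/documents/en/whatever.pdf no longer matches, since the pattern expects example.com directly after the protocol.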
The pattern matches:
- \b A word boundary
- (https?://\S*?/documents/) Capture group 1: match the protocol, then optional non-whitespace characters until the first occurrence of /documents/
- en Match literally
- (/\S*) Capture group 2: match / followed by optional non-whitespace characters
- \.pdf\b Match .pdf followed by a word boundary
In the replacement, use the two capture groups, denoted by $1 and $2, for example $1es$2_SPANISH.pdf. See the regex group captures.
Example:
Output
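A small self-contained run could look like the sketch below, assuming the pattern described above and a made-up two-line sample text. The printed result is shown in the trailing comments.

```php
<?php
// Sample text (assumed for illustration): one link on the own site, one offsite.
$text = <<<'TEXT'
Own: https://example.com/documents/en/whatever.pdf
Offsite: https://othersite.example.com/whatever.pdf
TEXT;

$pattern = '~\b(https?://\S*?/documents/)en(/\S*)\.pdf\b~';
$result = preg_replace($pattern, '$1es$2_SPANISH.pdf', $text);

echo $result, PHP_EOL;
// Own: https://example.com/documents/es/whatever_SPANISH.pdf
// Offsite: https://othersite.example.com/whatever.pdf
```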
If you want to match the same number of forward slashes as in your example, you can use a negated character class [^\s/] to exclude whitespace characters and forward slashes. See another regex demo.
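One way the stricter variant might be written is sketched below. The exact pattern is an assumption built from the negated character class mentioned above: the host and the file name may not contain a forward slash. The sample URLs are made up.

```php
<?php
// [^\s/]+ matches one or more characters that are neither whitespace nor a forward slash.
$pattern = '~\b(https?://[^\s/]+/documents/)en(/[^\s/]+)\.pdf\b~';

// Sample URLs, assumed for illustration.
$urls = [
    'https://example.com/documents/en/whatever.pdf',     // same shape as in the question
    'https://example.com/documents/en/sub/whatever.pdf', // extra path segment
];

$results = array_map(
    fn (string $url): string => preg_replace($pattern, '$1es$2_SPANISH.pdf', $url) ?? $url,
    $urls
);

print_r($results);
```

Only the first URL is rewritten. The second has an additional segment after /documents/en/, so the file name part would need to cross a forward slash and the pattern does not match.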