strtr() only partially works


I built a script that generates a sitemap for my project.

The script uses strtr() to remove unwanted characters and to convert German umlauts.

    $ers = array(
        '<' => '', '>' => '', ' ' => '-',
        'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue', 'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
        '&' => 'und', '*' => '', ' - ' => '-', ',' => '', '.' => '', '!' => '', '?' => ''
    );
    foreach ($rs_post as $row) {
        $kategorie = $row['category'];
        $kategorie = strtr($kategorie, $ers);
        $kategorie = strtolower($kategorie);
        $kategorie = trim($kategorie);
        $org_file .= "<url><loc>https://domain.org/kategorie/" . $kategorie . "/</loc><lastmod>2016-08-18T19:02:42+00:00</lastmod><changefreq>monthly</changefreq><priority>0.2</priority></url>" . PHP_EOL;
    }

Unwanted characters such as "<" are replaced correctly, but the German umlauts are not converted, and I have no idea why.

Does anyone have a tip for me?

Torsten

There are 2 best solutions below

Best answer

As others have noted, the most likely cause is a character encoding mismatch. Since the titles you're trying to convert are apparently in UTF-8, the problem is most likely that your PHP source code isn't. Try re-saving the file as UTF-8 text, and see if that fixes the problem.
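To see why a mismatch breaks only the umlauts, here is a minimal sketch that writes the bytes out explicitly, so it behaves the same however the file is saved. The variable names and the sample category are made up for illustration; the two tables stand in for what your `$ers` array contains when the source file is saved as ISO-8859-1 versus UTF-8.

```php
<?php
// "ä" is two bytes in UTF-8 (C3 A4) but a single byte in ISO-8859-1 (E4).
$utf8_ae   = "\xC3\xA4";
$latin1_ae = "\xE4";

// The table as PHP sees it when the source file is saved as ISO-8859-1.
$latin1_table = array('<' => '', '>' => '', $latin1_ae => 'ae');
// The same table when the source file is saved as UTF-8.
$utf8_table   = array('<' => '', '>' => '', $utf8_ae => 'ae');

// A UTF-8 category name as it might come from the database: "<Gerät>"
$category = "<Ger" . $utf8_ae . "t>";

// The ASCII keys match either way; the umlaut key matches only when
// the table and the data use the same encoding.
$with_latin1_table = strtr($category, $latin1_table);
$with_utf8_table   = strtr($category, $utf8_table);

echo bin2hex($with_latin1_table), PHP_EOL; // 476572c3a474 (umlaut bytes c3a4 still there)
echo $with_utf8_table, PHP_EOL;            // Geraet
```

This matches the symptom in the question: "<" and ">" disappear in both cases, while the umlaut survives when the table is in the wrong encoding.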

BTW, a simple way to debug this would be to print out both your data rows and your transliteration array into the same output file using e.g. print_r() or var_dump(), and look at the output to see if the non-ASCII characters in it look correct. If the characters look right in the data but wrong in the transliteration table (or vice versa), that's a sign that the encodings don't match.
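If the printed characters are hard to judge by eye, a hex dump makes the comparison unambiguous. The following is only a sketch: `dump_bytes()` and `is_valid_utf8()` are hypothetical helpers, and the two sample strings stand in for a key of your `$ers` array and a value from `$row['category']`.

```php
<?php
// Hypothetical helper: show a string together with its raw bytes in hex.
function dump_bytes($label, $text)
{
    return $label . ': ' . $text . ' [' . bin2hex($text) . ']';
}

// Hypothetical helper: the /u modifier makes preg_match() fail on
// byte sequences that are not valid UTF-8.
function is_valid_utf8($text)
{
    return preg_match('//u', $text) === 1;
}

// Stand-ins for a table key and a data value.
$table_key  = "\xE4";     // "ä" as stored in an ISO-8859-1 source file
$data_value = "\xC3\xA4"; // "ä" as delivered in UTF-8

echo dump_bytes('table', $table_key), PHP_EOL;
echo dump_bytes('data', $data_value), PHP_EOL;

// Different byte sequences for the same letter mean the encodings differ.
var_dump(is_valid_utf8($table_key));  // bool(false)
var_dump(is_valid_utf8($data_value)); // bool(true)
```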

P.S. If you have the PHP iconv extension installed (and you probably do), consider using it to convert your titles to ASCII automatically.
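A rough sketch of how that could look, assuming the input is UTF-8. `to_ascii()` is a hypothetical helper, not something from your script. What `//TRANSLIT` produces depends on the iconv implementation and the locale, so this version still applies the German replacements itself (written as byte escapes, so they do not depend on how the file is saved) and leaves only the remaining characters to iconv.

```php
<?php
// Hypothetical helper: reduce a UTF-8 string to ASCII.
function to_ascii($text)
{
    // German-specific replacements first, keyed by UTF-8 byte sequences.
    $german = array(
        "\xC3\x84" => 'Ae', "\xC3\x96" => 'Oe', "\xC3\x9C" => 'Ue',
        "\xC3\xA4" => 'ae', "\xC3\xB6" => 'oe', "\xC3\xBC" => 'ue',
        "\xC3\x9F" => 'ss',
    );
    $text = strtr($text, $german);

    // Let iconv approximate whatever is left. Errors are suppressed and
    // the text is kept unchanged if this iconv build rejects the request.
    if (function_exists('iconv')) {
        $converted = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
        if ($converted !== false) {
            $text = $converted;
        }
    }

    return $text;
}

echo to_ascii("Gr\xC3\xB6\xC3\x9Fe"), PHP_EOL; // Groesse
```

The remaining steps from the question (replacing spaces, strtolower(), trim()) would still run on the result.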

Second answer

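Check the character set. If the page that sends the form declares

<meta charset="utf-8">

the replacement will not work, presumably because the script file itself is not saved as UTF-8, so the umlaut keys in the array do not match the submitted bytes.

Try another encoding, for example

<meta charset="ISO-8859-1">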

Here's a small sample code to test your replacement array:

<!DOCTYPE html>
<html>
<?php
if(isset($_POST["txt"])) 
{   
    echo '<head><meta charset="ISO-8859-1"></head><body>';

    $posted = $_POST["txt"]; 
    echo 'Received raw: ' . $posted .'<br/>';
    echo 'Received: ' . htmlspecialchars($posted) . '<br/>';

    $ers = array( '<' => '', '>' => '', ' ' => '-',  'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue', 'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss', '&' => 'und', '*' => '', ' - ' => '-', ',' => '', '.' => '', '!' => '', '?' => '' );

    $replaced = strtr($posted,$ers);   
    echo 'Replaced: ' . $replaced .'<br/>';  
}
else {
    ?>  
<head>
    <!--<meta charset="utf-8">--> <!--THIS ENCODING WILL NOT WORK -->
     <meta charset="ISO-8859-1">  <!--THIS WORKS FINE -->
</head>
<body> 
  <p>the text you want to replace here</p>
  <form action="#" method="post">
  Text: <input type="text" name="txt" value="">
  <input type="submit" value="Submit">
</form>

<?php   
}
?>  
</body>
</html>