Removing characters from an array after sorting in PHP


I am sorting data from a text file alphabetically in PHP, which works perfectly. Unfortunately the text file, which is populated automatically, contains character sequences like #039; that I want removed from the end result. I have tried numerous ways to replace and remove these characters, but without success. This is what I have so far:

<?php
error_reporting(E_ALL);

$fileName = 'cache/_city.txt';
$data     = file_get_contents($fileName);

// Assuming that the file had the data in one line...

// Split the data using the delimiter
$split = explode("|", $data);

// Sort
sort($split);

// Put it all back together
$data = implode("&nbsp", $split);
$data = str_replace("&#039;" , "", $data);

echo $data;

?> 

How do I remove this piece of text from $data: #039;

Sample data:

<a href="?city=Leiden">Leiden</a>|
<a href="?city=Papendrecht">Papendrecht</a>|
<a href="?city=Helmond">Helmond</a>|
<a href="?city=%26%23039%3Bs-Hertogenbosch">&amp;#039;s-Hertogenbosch</a>|
<a href="?city=Hengelo">Hengelo</a>|
<a href="?city=Marknesse">Marknesse</a>|
<a href="?city=Wanssum">Wanssum</a>|
<a href="?city=Rijswijk">Rijswijk</a>|
<a href="?city=Leunen">Leunen</a>|
<a href="?city=Genemuiden">Genemuiden</a>|

There are 2 best solutions below

5
On BEST ANSWER

Have you tried something like this, where $wrongChar holds the text you want to remove?

$data = str_replace($wrongChar , "", $data);

Edit:

Can you test this? I suspect it will 'clean' more than you need:

$data = file_get_contents($fileName);
$data = preg_replace('/[^A-Za-z0-9\-]/', '', $data);
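To make the "cleans more than you need" concern concrete, here is a small sketch that applies the same pattern to one line of the sample data from the question. The variable names are only illustrative.

```php
<?php
// One line from the sample data in the question.
$line = '<a href="?city=Leiden">Leiden</a>|';

// Keep only ASCII letters, digits and hyphens; every other character is dropped,
// including the markup characters and the "|" delimiter.
$clean = preg_replace('/[^A-Za-z0-9\-]/', '', $line);

echo $clean, PHP_EOL; // ahrefcityLeidenLeidena
```

Because the "|" delimiter is removed as well, a later explode('|', ...) would no longer have anything to split on.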

Second edit:

Now that we know the *_replace functions work, I have improved my suggestion a bit.

<?php

error_reporting(E_ALL);

// Applies html_entity_decode several times to the string so that nested (double-encoded) HTML entities are converted too
$recursive_decode = function($str, $depth = 1) use (&$recursive_decode) {
    if (!$depth) {
        return $str;
    }

    return $recursive_decode(html_entity_decode($str, ENT_QUOTES, 'UTF-8'), --$depth);
};

$fileName = 'cache/_city.txt';

// In this test, try a depth equal to 2
$data     = $recursive_decode(file_get_contents($fileName), 2);

// Assuming that the file had the data in one line...

// Split the data using the delimiter
$split = explode('|', $data);

// Sort
sort($split);

// Put it all back together
$data = implode("&nbsp;", $split);

// Because recursive_decode converted all entities, your previous "&#039;" is now "'"
$data = str_replace("'" , "", $data);

echo $data;

?>
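
If the nesting depth is not known in advance, one possible variation is to keep decoding until the string stops changing. This is only a sketch: decode_fully and its $maxPasses limit are hypothetical names, not part of PHP, and it is shown on a single label from the sample data rather than on the whole file.

```php
<?php
// Hypothetical helper: decode HTML entities repeatedly until the string
// no longer changes, or until a safety limit of passes is reached.
function decode_fully(string $str, int $maxPasses = 5): string
{
    for ($i = 0; $i < $maxPasses; $i++) {
        $decoded = html_entity_decode($str, ENT_QUOTES, 'UTF-8');
        if ($decoded === $str) {
            break; // nothing left to decode
        }
        $str = $decoded;
    }
    return $str;
}

// The double-encoded link text from the sample data.
$label   = '&amp;#039;s-Hertogenbosch';
$plain   = decode_fully($label);          // 's-Hertogenbosch
$noQuote = str_replace("'", '', $plain);  // s-Hertogenbosch

echo $plain, PHP_EOL, $noQuote, PHP_EOL;
```

Keep in mind that decoding a whole HTML fragment also turns entities such as &lt; into real markup characters, so it may be safer to decode only the text you intend to display.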
0
On

The question doesn't give enough information about what you want to replace, and that largely determines the answer.

If you want to replace only a few specific characters, it is probably best to use str_replace or a variant of it. If it is a larger set of 'junk' characters (as your question implies), you can instead replace everything outside a Unicode range with preg_replace. Someone asked about that and got a simple answer here: How do I replace characters not in range [0x5E10, 0x7F35] with '*' in PHP?
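
As a rough illustration of the range approach, the sketch below strips everything outside printable ASCII. The range and the input string are assumptions chosen for the example; this particular range would also strip legitimate accented letters, so adjust it to your data.

```php
<?php
// Illustrative input: a name polluted with a zero-width space (U+200B)
// and a control character (bell, 0x07).
$dirty = "Lei\u{200B}den\x07";

// Remove every character outside printable ASCII (U+0020 to U+007E).
// The /u modifier makes PCRE treat the pattern and the subject as UTF-8.
$clean = preg_replace('/[^\x{0020}-\x{007E}]/u', '', $dirty);

echo $clean, PHP_EOL; // Leiden
```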

Function reference:

https://secure.php.net/manual/en/function.str-replace.php
https://secure.php.net/manual/en/function.preg-replace.php

Side note: You should use &nbsp;, not &nbsp.

Edit: With the new info you provided, it seems you're trying to remove a character that has been encoded. Try: str_replace('&#039;', '', urldecode($data))
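
Here is a quick sketch of what that call does to the 's-Hertogenbosch line from the sample data. In that sample the link text is stored as &amp;#039;, so this sketch assumes you would add that form as a second search string; the variable names are illustrative.

```php
<?php
// The problematic line from the sample data (without the trailing delimiter).
$line = '<a href="?city=%26%23039%3Bs-Hertogenbosch">&amp;#039;s-Hertogenbosch</a>';

// urldecode turns %26%23039%3B in the href into &#039;, which str_replace then removes.
$hrefOnly = str_replace('&#039;', '', urldecode($line));
// The link text is untouched because it holds "&amp;#039;", not "&#039;".

// Searching for both forms cleans the href and the link text.
$both = str_replace(['&amp;#039;', '&#039;'], '', urldecode($line));

echo $hrefOnly, PHP_EOL, $both, PHP_EOL;
```

Note that urldecode on the whole string also converts any "+" to a space and decodes every other percent sequence, and that removing the apostrophe from the href changes the city value the link sends.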