How to clear duplicate consecutive non-alphabetic characters in a string?


How can I remove consecutive duplicate characters from a string, matching only these characters: ,.:- ? For example, given:

"ab::::c ---------d,,,e ..........f ::a-b,,,c..d"

Expected output:

"ab:c -d,e .f :a-b,c.d" 

There are 3 best solutions below

On BEST ANSWER

Here we use preg_replace to achieve the desired output.

Regex: ([,.:-])\1+

Or

Regex: (,|\.|:|-)\1+

1. The group matches one of the listed characters and captures it.

2. The backreference \1+ then matches one or more further occurrences of that same captured character.

Replacement: $1


<?php
ini_set('display_errors', 1);

$string="ab::::c ---------d,,,e ..........f ::a-b,,,c..d";
echo preg_replace('/([,.:-])\1+/', '$1', $string);
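
If the set of symbols may change, the same pattern can be wrapped in a small helper. This is only a sketch: the function name collapse_repeats and its $symbols parameter are made up for illustration, and it assumes $symbols is a non-empty string of single-byte characters.

```php
<?php
// Hypothetical helper (not from the answer above): collapses runs of the
// same symbol into a single occurrence, for a configurable set of symbols.
function collapse_repeats(string $text, string $symbols = ',.:-'): string
{
    // preg_quote() escapes regex metacharacters (and the "/" delimiter),
    // so the symbols can be placed safely inside a character class.
    $class = preg_quote($symbols, '/');

    // ([...]) captures one symbol, \1+ matches its immediate repeats,
    // and the whole run is replaced by the single captured symbol.
    return preg_replace('/([' . $class . '])\1+/', '$1', $text);
}

echo collapse_repeats("ab::::c ---------d,,,e ..........f ::a-b,,,c..d");
// ab:c -d,e .f :a-b,c.d
```

Letters are left alone because they are not in the character class, so "aa::b" becomes "aa:b".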

Solution 2: using a foreach loop


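$string = "aab::::css ---------ddd,,,esddsff ..........f ::a-b,,,c..d";
$symbols = array(":", ",", ".", "-");
$result = array();
foreach (str_split($string) as $character)
{
    // Skip the character only when it is one of the symbols AND it repeats
    // the last character already kept; everything else is appended.
    if ($character !== end($result) || !in_array($character, $symbols, true))
    {
        $result[] = $character;
    }
}
echo implode("", $result);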
On

You can do this using preg_replace:

preg_replace — Perform a regular expression search and replace

$pattern = '/(\.|\,|\:|\-){2,}/';
$string = 'ab::::c ---------d,,,e ..........f ::a-b,,,c..d';
echo preg_replace($pattern, '$1', $string);

You can try your regular expressions here: https://regex101.com/
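
One thing worth checking with this pattern: {2,} repeats the group, not the captured character, so a run of different symbols also matches and is replaced by the last symbol captured. A small comparison against the backreference form, using a made-up input ('a.,:b') that is not from the question:

```php
<?php
// Made-up sample with a MIXED run of symbols (no symbol repeats itself).
$mixed = 'a.,:b';

// {2,} repeats the whole alternation, so ".,:" matches as one run and
// $1 holds the last symbol captured (":").
$quantified = preg_replace('/(\.|\,|\:|\-){2,}/', '$1', $mixed);

// \1+ requires the SAME captured symbol again, so nothing matches here.
$backreferenced = preg_replace('/([,.:-])\1+/', '$1', $mixed);

echo $quantified, "\n";      // a:b
echo $backreferenced, "\n";  // a.,:b
```

For the sample string in the question both patterns give the same result, because it contains no mixed runs.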

On

For future readers: for maximum efficiency, do not use piped (alternation) characters in your pattern; a character class such as [,.:-] does the same job. The loop-based methods also make too many iterated function calls and/or conditionals.

Input: $in="ab::::c ---------d,,,e ..........f ::a-b,,,c..d";

Method #1: one-liner using preg_replace() (note empty replacement string)

echo preg_replace('/([,.:-])\K\1+/','',$in);
//                          ^^ resets the start of the matched substring
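
To see what \K changes, it can help to look at the full match with and without it. A minimal sketch using preg_match_all() on a shortened sample ('ab::::c', chosen here for illustration):

```php
<?php
$sample = 'ab::::c';

// Without \K the full match is the entire run of colons.
preg_match_all('/([,.:-])\1+/', $sample, $plain);

// With \K the reported match starts after the first colon, so only the
// surplus colons are matched (and can be replaced with an empty string).
preg_match_all('/([,.:-])\K\1+/', $sample, $reset);

echo $plain[0][0], "\n";  // ::::
echo $reset[0][0], "\n";  // :::
```

That is why Method #1 can use an empty replacement: the first symbol is never part of the match.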

Method #2: one-liner using preg_split() & implode()

echo implode(preg_split('/([,.:-])\K\1+/',$in));  // empty glue doesn't need mentioning

Output using either method:

ab:c -d,e .f :a-b,c.d

I wonder which method is most efficient on this page. If anyone would be so kind as to run and post a benchmark test with Sahil's 2 methods and my two methods, that would be very enlightening.
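
A rough harness for such a comparison might look like the sketch below. The labels, the $iterations count and the use of microtime() are my own choices for illustration, timings will vary by machine and PHP version, and no results are implied here.

```php
<?php
$in = "ab::::c ---------d,,,e ..........f ::a-b,,,c..d";

// Each entry wraps one approach from this page in a closure.
$methods = [
    'backreference + $1' => function (string $s): string {
        return preg_replace('/([,.:-])\1+/', '$1', $s);
    },
    'foreach loop' => function (string $s): string {
        $result = [];
        foreach (str_split($s) as $c) {
            if ($c != end($result) || !in_array($c, [':', ',', '.', '-'])) {
                $result[] = $c;
            }
        }
        return implode('', $result);
    },
    '\K + empty replacement' => function (string $s): string {
        return preg_replace('/([,.:-])\K\1+/', '', $s);
    },
    'preg_split + implode' => function (string $s): string {
        return implode(preg_split('/([,.:-])\K\1+/', $s));
    },
];

$iterations = 10000; // arbitrary; raise it for steadier numbers

foreach ($methods as $label => $fn) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($in);
    }
    // Prints the elapsed wall-clock time for this method.
    printf("%-24s %.4f s\n", $label, microtime(true) - $start);
}
```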


Here's a late consideration... If the only problem in your string is symbols repeating before the next valid character, then you can use this pattern: [-.,:]\K[-.,:]+

It will perform 50% faster than all other patterns on this page, and it offers the same output for the sample input, but it does stretch the interpretation of your question: any run of the listed symbols, mixed or not, is reduced to its first symbol. Here are some examples that expose the difference:

ab:-,.c; will be reduced to ab:c
ab:-,.c -d.,.e--f will be reduced to ab:c -d.e-f

This may or may not be suitable for your project.
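
To compare the two interpretations side by side, something like the following sketch could be used. The function names strict_collapse and loose_collapse are hypothetical, introduced only for this comparison.

```php
<?php
// Strict: only collapses a symbol repeated by the SAME symbol.
function strict_collapse(string $s): string
{
    return preg_replace('/([,.:-])\K\1+/', '', $s);
}

// Loose: keeps the first symbol of any run of listed symbols, mixed or not.
function loose_collapse(string $s): string
{
    return preg_replace('/[-.,:]\K[-.,:]+/', '', $s);
}

$sample = 'ab:-,.c -d.,.e--f';
echo strict_collapse($sample), "\n";  // ab:-,.c -d.,.e-f
echo loose_collapse($sample), "\n";   // ab:c -d.e-f
```

On the question's sample input both functions return ab:c -d,e .f :a-b,c.d, since it has no mixed runs.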