C program to remove consecutive repeated characters from string


The code: https://pastebin.com/nW6A49ck

/* C program to remove consecutive repeated characters from string. */
 
#include <stdio.h>
 
int main() {
    char str[100];
    int i, j, len, len1;

    /* read string */
    printf("Enter any string: ");
    gets(str);
 
    /* calculating length */
    for (len = 0; str[len] != '\0'; len++);
 
    /* assign 0 to len1 - length of removed characters */
    len1 = 0;
 
    /* Removing consecutive repeated characters from string */
    for (i = 0; i < (len - len1);) {
        if (str[i] == str[i + 1]) {
            /* shift all characters */
            for (j = i; j < (len - len1); j++)
                str[j] = str[j + 1];
            len1++;
        } else {
            i++;
        }
    }
 
    printf("String after removing characters: %s\n", str);
    return 0;
}

The problem: Let's say I have the string 'Hello' as input. I want both ls to be removed (not just one). The same goes for 'Helllo': I want all three ls removed, not just two. How can I do that?

if (str[i] == str[i + 1]) {
    /* shift all characters */
    for (j = i; j < (len - len1); j++)
        str[j] = str[j + 1];
    len1++;
}

Maybe I can count how many times each character is repeated and then, in line 28, replace the 1 with that count? But how can I implement this in the code?


This snippet should remove every run of consecutive repeated characters from your string, using str and len from your program (note that compilers older than C99 won't let you declare a variable inside the for statement):

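for (int i = 0; i < len; i++) {
    int j = i, repeats = 1;
    /* count the length of the run starting at i */
    while (j < len - 1 && str[j] == str[j + 1]) {
        j++;
        repeats++;
    }
    if (repeats > 1) {
        /* shift the tail left over the whole run */
        for (j = i; j < len - repeats; j++) {
            str[j] = str[j + repeats];
        }
        len -= repeats;
        str[len] = '\0';
        i--;  /* re-examine position i, which now holds a new character */
    }
}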
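
The same counting idea can be wrapped in a function. This is only a sketch under a few assumptions: the buffer is writable and NUL-terminated, and the name remove_runs is made up for illustration (it is not from the question or the snippet above). memmove stands in for the hand-written shifting loop.

```c
#include <string.h>

/* Hypothetical helper (not from the question): deletes every run of two
 * or more identical adjacent characters from s, in place. */
void remove_runs(char *s) {
    if (s == NULL)
        return;

    size_t len = strlen(s);
    size_t i = 0;

    while (i < len) {
        /* Measure the run of identical characters starting at i. */
        size_t run = 1;
        while (i + run < len && s[i + run] == s[i])
            run++;

        if (run > 1) {
            /* Close the gap; the +1 also moves the terminating '\0'. */
            memmove(s + i, s + i + run, len - i - run + 1);
            len -= run;
            /* Stay at i: a different character now sits there. */
        } else {
            i++;
        }
    }
}
```

Called on a buffer holding "Helllo", this leaves "Heo". Like the snippet above, it only looks forward from the current position, so "abba" becomes "aa" rather than an empty string.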

Links are discouraged; post the code itself in the question instead. Also, for this kind of problem, I suggest first coming up with an appropriate algorithm and then implementing it. You will often find that much easier than taking someone else's code and modifying it to fit your needs.

Algorithm:

Step I: Record the position in the string where the next letter is to be written (call this position P). Initially, it is the start of the string.

Step II: If the current character is the same as its next character, then

  • Don't change P.
  • Set a flag to skip the next character (call this flag F).

Step III: If the current character and the next character are different, then

  • If flag F is set, skip this character, reset flag F, and don't change P.
  • If flag F is not set, write this character at position P in the string and advance P to the next position.

Step IV: Move to the next character in the string and go to Step II.

Implementation:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

void remove_all_consecutive_dup_chars (char * pstr) {
    if (pstr == NULL) {
        printf ("Invalid input..\n");
        return;
    }

    /* Pointer to the position where the next
     * character is to be written.
     */
    char * p = pstr;
    int skip_letter = 0;

    for (unsigned int i = 0; pstr[i] ; ++i) {
        /* Using tolower() to identify consecutive characters
         * that are the same or differ only in case (upper/lower).
         * The casts keep the arguments representable as unsigned char.
         */
        if (tolower ((unsigned char) pstr[i]) == tolower ((unsigned char) pstr[i + 1])) {
            skip_letter = 1;
            continue;
        }

        if (skip_letter) {
            skip_letter = 0;
        } else {
            *p++ = pstr[i];
        }
    }

    /* Add the null terminating character. 
     */
    *p = '\0';
}

int main (void) {
    char buf[256] = {'\0'};

    strcpy (buf, "WELL, well, welLlLl....");
    printf ("%s ----> ", buf);
    remove_all_consecutive_dup_chars (buf);
    printf ("%s\n", buf);

    strcpy (buf, "Hello");
    printf ("%s ----> ", buf);
    remove_all_consecutive_dup_chars (buf);
    printf ("%s\n", buf);

    strcpy (buf, "Helllo");
    printf ("%s ----> ", buf);
    remove_all_consecutive_dup_chars (buf);
    printf ("%s\n", buf);

    strcpy (buf, "aAaaaA    ZZz");
    printf ("%s ----> ", buf);
    remove_all_consecutive_dup_chars (buf);
    printf ("%s\n", buf);
    
    return 0;
}

Output:

# ./a.out
WELL, well, welLlLl.... ----> WE, we, we
Hello ----> Heo
Helllo ----> Heo
aAaaaA    ZZz ----> 

EDIT:

In the above program, I have used tolower() on the assumption that the string passed to remove_all_consecutive_dup_chars() contains only letters ([A-Z]/[a-z]) and the space character.
Note that tolower() can result in undefined behavior if pstr[i] < 0, which is possible when char is signed. If you use tolower(), make sure the argument you pass is representable as an unsigned char, for example by casting it to unsigned char.
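
One way to keep that guarantee in a single place is a small comparison helper. This is a minimal sketch; the name same_letter is hypothetical and not part of the program above.

```c
#include <ctype.h>

/* Hypothetical helper: returns nonzero if a and b are the same character
 * ignoring case. The casts convert a possibly negative char into the
 * unsigned char range that tolower() requires. */
int same_letter(char a, char b) {
    return tolower((unsigned char) a) == tolower((unsigned char) b);
}
```

The comparison in the loop could then be written as same_letter (pstr[i], pstr[i + 1]).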


You could write a function that removes runs of equal characters by copying the string character by character through a separate write pointer into the same string, and not stepping that pointer forward when repeated characters are found:

void foo(char *str) {
    for(char *wr = str; (*wr = *str) != '\0';) {  // copy until `\0` is copied
        ++str;                 // step to the next character
        if(*wr != *str) {      // if the next char is not equal to `*wr`
            ++wr;              // step `wr` forward to save the copied character
        } else do { 
            ++str;             // `*wr == *str`, so step `str` forward...
        } while(*wr == *str);  // ...until a different character is found
    }
}
  • *wr = *str copies the character str is currently pointing at to where wr is currently pointing. The != '\0' check ends the loop once \0 (the null terminator) has been copied.
  • After that, str is incremented to point at the next character.
  • If the next character is not equal to the one that was just copied, wr is incremented to keep the copied character.
  • If the next character is equal to the one just copied, wr is not incremented, so the copied character will be overwritten by the next copy, and str is stepped forward until a different character is found.


A dense version doing exactly the same thing:

void foo(char *str) {
    for(char *wr = str; (*wr = *str) != '\0';) {
        if(*wr != *++str) ++wr;
        else while(*wr == *++str);
    }
}