I need to write a function in C that finds out whether two strings are made of the same words. As can be seen in the current code, I load each string into a separate array. The idea is that each array holds the words in lower-case letters, with exactly one space between words and all non-alpha characters removed.

I thought I could just sort the strings and call strcmp on them, but that does not work, because there can be strings such as "dog dog dog cat" and "dog cat". These strings are made of the same words, so the function should return 1, but it wouldn't if I just sorted and used strcmp. So I thought I could merge all duplicated words into one and then sort and strcmp, but there is still one problem: with words such as "dog" and "god", which are two different words, the function would still treat them as the same after sorting.

"dog dog dog cat" "dog cat" - same words
"HI HeLLO!!'" "hi,,,hello hi" - same words

I would be very thankful for any help. I really don't know how to write it. I have sat at it for quite some time and still can't figure it out.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int sameWords( const char * a, const char * b)
{
char * array1=NULL;
char * array2=NULL;
int length1=0, length2=0, i=0, j=0;
while(a[i])
{
if(i>=length1)
{
length1+=250;
array1=(char*)malloc(length1*sizeof(char));
}
if(isspace(a[i]) && !isspace(a[i-1]))
{
array1[i]=a[i];
}
if(isalpha(a[i]))
{
array1[i]=tolower(a[i]);
}
i++;
}
while(b[j])
{
if(j>=length2)
{
length2+=250;
array2=(char*)malloc(length2*sizeof(char));
}
if(isspace(b[j]) && !isspace(b[j-1]))
{
array2[j]=b[j];
}
if(isalpha(b[j]))
{
array2[j]=tolower(b[j]);
}
j++;
}
}
int main()
{
sameWords("This' is string !!! ", "THIS stRing is !! string ");
return 0;
}
You have already learned two ways to go about your problem. The complicated one is to split each of the strings into words, sort them and then weed out duplicates, which is easy in a sorted array. The easier one is to split the first string into words, search for each word in the second. Then do the same the other way round: split the second and check for words in the first.
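A rough sketch of the first (sort and weed out duplicates) approach might look like the following. The helper names splitWords, compareWords and sortUnique are made up for this illustration, every non-letter is assumed to be a word separator, and allocation failures are not handled. Because whole words are sorted rather than single characters, "dog" and "god" stay distinct.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits s into lower-case words. Returns a malloc'ed
   array of malloc'ed strings and stores the number of words in *count.
   Allocation failures are not checked in this sketch. */
static char **splitWords(const char *s, size_t *count)
{
    char **words = NULL;
    size_t n = 0;

    while (*s) {
        while (*s && !isalpha((unsigned char)*s)) s++;   /* skip separators */
        const char *start = s;
        while (isalpha((unsigned char)*s)) s++;          /* consume one word */
        size_t len = (size_t)(s - start);
        if (len == 0) break;                             /* no word left */

        words = realloc(words, (n + 1) * sizeof *words);
        words[n] = malloc(len + 1);
        for (size_t i = 0; i < len; i++)
            words[n][i] = (char)tolower((unsigned char)start[i]);
        words[n][len] = '\0';
        n++;
    }
    *count = n;
    return words;
}

/* qsort callback: compares two array elements of type char *. */
static int compareWords(const void *x, const void *y)
{
    return strcmp(*(char *const *)x, *(char *const *)y);
}

/* Sorts the words and frees adjacent duplicates; returns the new count. */
static size_t sortUnique(char **words, size_t n)
{
    size_t kept = 0;
    if (n == 0) return 0;
    qsort(words, n, sizeof *words, compareWords);
    for (size_t i = 0; i < n; i++) {
        if (kept > 0 && strcmp(words[kept - 1], words[i]) == 0)
            free(words[i]);              /* duplicate of the previous word */
        else
            words[kept++] = words[i];
    }
    return kept;
}

int sameWords(const char *a, const char *b)
{
    size_t na, nb;
    char **wa = splitWords(a, &na);
    char **wb = splitWords(b, &nb);
    na = sortUnique(wa, na);
    nb = sortUnique(wb, nb);

    /* Two sorted, duplicate-free lists are equal iff they match pairwise. */
    int same = (na == nb);
    for (size_t i = 0; same && i < na; i++)
        same = (strcmp(wa[i], wb[i]) == 0);

    for (size_t i = 0; i < na; i++) free(wa[i]);
    for (size_t i = 0; i < nb; i++) free(wb[i]);
    free(wa);
    free(wb);
    return same;
}
```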
Both approaches require that you split the strings. That's also where you seem to have problems in your code. (You've got the basic idea to look at word boundaries, but you don't seem to know how to store the words.)
The basic question is: How are you going to represent the words, i.e. the substrings of a C string? There are various ways. You could use pointers into the string together with a string length or you could copy them into another buffer.
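To illustrate the pointer-plus-length representation, here is a minimal sketch. The struct WordRef and the helpers nextWord and countWords are invented names for this example, and it assumes that a word is any run of letters, with everything else acting as a separator.

```c
#include <ctype.h>
#include <stddef.h>

/* A word is a view into the original string: a pointer to its first
   letter plus its length. Nothing is copied. */
struct WordRef {
    const char *start;
    size_t len;
};

/* Hypothetical helper: finds the next run of letters at or after *pos.
   Returns 1 and fills *out if a word was found, 0 at the end of the string.
   *pos is advanced past the word so the function can be called in a loop. */
static int nextWord(const char **pos, struct WordRef *out)
{
    const char *p = *pos;

    while (*p && !isalpha((unsigned char)*p)) p++;   /* skip separators */
    if (*p == '\0') {
        *pos = p;
        return 0;
    }

    out->start = p;
    while (isalpha((unsigned char)*p)) p++;          /* consume the word */
    out->len = (size_t)(p - out->start);
    *pos = p;
    return 1;
}

/* Counts the words in s, just to show the iteration pattern. */
static int countWords(const char *s)
{
    struct WordRef w;
    int n = 0;

    while (nextWord(&s, &w)) n++;
    return n;
}
```

The words found this way are not zero-terminated, so they cannot be passed to functions such as strcmp or strstr directly. Copying each word into a separate buffer and terminating it avoids that, at the cost of the copy.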
Here is a solution that splits the string `a` into words and then checks whether each word can be found in `b`.

The current word is stored in the local char buffer `word` and has the length `len`. Note how the zero end marker `'\0'` is added to `word` manually before searching `b` for `word`: The library function `strstr` looks for a string in another one. Both strings must be zero-terminated.

This is only one half of the solution. You must also check the strings the other way round.
This is not yet the exact solution to your problem, because the string matching is done case-sensitively. I've skipped this part, because it was easier that way: `strstr` is case-sensitive and I don't know of any variants that ignore the case.
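Some C libraries (glibc and the BSDs, for example) provide a non-standard strcasestr that ignores case, but it is not part of standard C. Any strstr-style search also matches inside longer words, so "cat" would be found in "concatenate". One way around both issues, sketched here under the assumption that words are runs of letters, is to compare whole words letter by letter while ignoring case. The helper names nextWord, equalIgnoreCase, containsWord and allWordsIn are invented for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Advances *pos past the next run of letters and returns its length,
   or 0 when no word is left. *start receives the word's first letter. */
static size_t nextWord(const char **pos, const char **start)
{
    const char *p = *pos;

    while (*p && !isalpha((unsigned char)*p)) p++;   /* skip separators */
    *start = p;
    while (isalpha((unsigned char)*p)) p++;          /* consume the word */
    *pos = p;
    return (size_t)(p - *start);
}

/* Compares two words of the same length, ignoring case. */
static int equalIgnoreCase(const char *x, const char *y, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (tolower((unsigned char)x[i]) != tolower((unsigned char)y[i]))
            return 0;
    return 1;
}

/* Returns 1 if the word (w, len) occurs as a whole word in text. */
static int containsWord(const char *text, const char *w, size_t len)
{
    const char *start;
    size_t n;

    while ((n = nextWord(&text, &start)) > 0)
        if (n == len && equalIgnoreCase(start, w, len))
            return 1;
    return 0;
}

/* Returns 1 if every word of a occurs as a whole word in b. */
static int allWordsIn(const char *a, const char *b)
{
    const char *start;
    size_t n;

    while ((n = nextWord(&a, &start)) > 0)
        if (!containsWord(b, start, n))
            return 0;
    return 1;
}

int sameWords(const char *a, const char *b)
{
    return allWordsIn(a, b) && allWordsIn(b, a);
}
```

This rescans the other string for every word, which is quadratic in the worst case but needs no extra memory. For long inputs, sorting the words first would scale better.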