So, I'm trying to build a string_split function to split a C-style string based on a delimiter.
Here is the code for the function:
char** string_split(char* input, char delim)
{
    char** split_strings = malloc(sizeof(char*));
    char* charPtr;
    size_t split_idx = 0;
    int extend = 0;
    for(charPtr = input; *charPtr != '\0'; ++charPtr)
    {
        if(*charPtr == delim || *(charPtr+1) == '\0')
        {
            if(*(charPtr+1) == '\0') extend = 1; //extend the range by one for the null byte at the end
            char* string_element = calloc(1, sizeof(char));
            for(size_t i = 0; input != charPtr+extend; ++input, ++i)
            {
                if(string_element[i] == '\0')
                {
                    //allocate another char and add a null byte to the end
                    string_element = realloc(string_element, sizeof(char) * (sizeof(string_element)/sizeof(char) + 1));
                    string_element[i+1] = '\0';
                }
                string_element[i] = *input;
            }
            printf("string elem: %s\n", string_element);
            split_strings[split_idx++] = string_element;
            //allocate another c-string if we're not at the end of the input
            split_strings = realloc(split_strings, sizeof(char*) *(sizeof(split_strings)/sizeof(char*) + 1));
            //skip over the delimiter
            input++;
            extend = 0;
        }
    }
    free(charPtr);
    free(input);
    return split_strings;
}
Essentially, the way it works is that there are two char* pointers, input and charPtr. charPtr counts up from the start of the input string to the next instance of the delimiter, then input counts up from the previous instance of the delimiter (or the start of the input string) and copies each char into a new char*. Once the string is built, it is added to a char** array.
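The two-pointer scan described above can be illustrated without any allocation. This is only a sketch: count_tokens_sketch is a hypothetical helper that is not part of the code in this post, and it uses separate start/end pointers instead of advancing input itself. Each token is treated as a start pointer plus a length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): walks the string with
 * two pointers, prints each token, and returns how many tokens were seen. */
size_t count_tokens_sketch(const char *input, char delim)
{
    assert(input != NULL);

    size_t count = 0;
    const char *start = input;          /* first char of the current token */

    for (const char *end = input; ; ++end) {
        if (*end == delim || *end == '\0') {
            /* The token is the half-open range [start, end). */
            printf("token %zu: %.*s\n", count, (int)(end - start), start);
            ++count;
            if (*end == '\0')
                break;                  /* reached the end of the input */
            start = end + 1;            /* skip over the delimiter */
        }
    }
    return count;
}
```

Because the end of the string is handled by the same branch as a delimiter, no special "extend by one" case is needed in this variant.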
There are also some twiddly bits for skipping over delimiters and dealing with the end points of the input string. The function is used like so:
int main()
{
    char* str = "mon,tue,wed,thur,fri";
    char delim = ',';
    char** split = string_split(str, delim);
    return 1;
}
Anyway, it works for the most part, except that the first char* in the returned char** array is corrupted and is just occupied by random junk.
For example, printing the elements of split from main yields:
split: α↨▓
split: tue
split: wed
split: thur
split: fri
What's odd is that the contents of split_strings[0] (split_strings being the array of char* that returns the desired tokens) are mon, as they should be for this example, right up until the final iteration of the main for-loop in the string_split function. Specifically, it's the line:
split_strings[split_idx++] = string_element;
which turns its contents from mon to junk. Any help is appreciated, thanks.
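One likely cause, offered as a sketch rather than a definitive diagnosis: sizeof(split_strings) is the size of the pointer itself, not of the block it points to, so sizeof(split_strings)/sizeof(char*) is always 1 and every realloc asks for room for only two pointers. Storing the third, fourth, and fifth tokens then writes outside the allocation, which is undefined behavior and can corrupt neighbouring heap data such as the first token. The string_element resize has the same problem, and free(charPtr) and free(input) release pointers that malloc never returned. The version below tracks the token count explicitly. The names string_split_sketch and string_split_free_sketch, and the NULL-terminated result, are my own assumptions and not the original design.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: frees a NULL-terminated token array (NULL is accepted). */
void string_split_free_sketch(char **tokens)
{
    if (tokens == NULL)
        return;
    for (char **p = tokens; *p != NULL; ++p)
        free(*p);
    free(tokens);
}

/* Sketch only: returns a NULL-terminated array of newly allocated tokens,
 * or NULL on allocation failure. The input string is never modified or freed. */
char **string_split_sketch(const char *input, char delim)
{
    assert(input != NULL);

    char **tokens = NULL;
    size_t count = 0;                   /* tokens stored so far */
    const char *start = input;          /* first char of the current token */

    for (const char *end = input; ; ++end) {
        if (*end != delim && *end != '\0')
            continue;

        /* Room for the tokens so far, the new one, and the NULL terminator. */
        char **grown = realloc(tokens, (count + 2) * sizeof *tokens);
        if (grown == NULL) {
            string_split_free_sketch(tokens);
            return NULL;
        }
        tokens = grown;
        tokens[count] = NULL;           /* keep the array terminated */

        size_t len = (size_t)(end - start);
        char *token = malloc(len + 1);
        if (token == NULL) {
            string_split_free_sketch(tokens);
            return NULL;
        }
        memcpy(token, start, len);
        token[len] = '\0';
        tokens[count++] = token;
        tokens[count] = NULL;

        if (*end == '\0')
            break;                      /* last token has been stored */
        start = end + 1;                /* skip over the delimiter */
    }
    return tokens;
}
```

With the input from this post, string_split_sketch("mon,tue,wed,thur,fri", ',') should give five tokens followed by a NULL entry, and the caller releases everything with string_split_free_sketch.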
The final function, for anyone wondering, should be pretty foolproof.