I'm really new to C programming and I'm trying to write this as an exercise in reading a file and saving its contents to a dynamic array of structs. The contents of the txt file are:
Movie id:1448
title:The movie
surname of director: lorez
name of director: john
date: 3
month: september
year: 1997
The structs should be like this:
typedef struct date
{
    int day, month, year;
} date;
typedef struct director_info
{
    char* director_surname, director_name;
} director_info;
typedef struct movie
{
    int id;
    char* title;
    director_info* director;
    date* release_date;
} movie;
All I know is that I should read it with fgets, and I think it goes something like this, but I can't figure out how to create the structs and save the data into them:
FILE *readingText;
readingText = fopen("movies.txt", "r");
if (readingText == NULL)
{
    printf("Can't open file\n");
    return 1;
}
while (fgets(line, sizeof(line), readingText) != NULL)
{
    .....
}
fclose(readingText);
Reading multi-line input can be a bit challenging and couple that with allocating for nested structures and you have a good learning experience for file I/O and dynamic memory allocation. But before looking at your task, there are some misconceptions to clean up:
In your struct director_info, the declaration char* director_surname, director_name; does NOT declare two pointers-to char. It declares a single pointer (director_surname) and then a single character (director_name). Lesson: the unary '*' that indicates the level of pointer indirection goes with the variable, NOT the type. Why? Just as you experienced, a declaration such as char* a, b, c; does NOT declare three pointers-to char, it declares one pointer and two char variables. Writing it as char *a, b, c; makes that clear.
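A quick way to convince yourself is to look at the size of the second variable in each style of declaration. This is only a small sketch; the two function names are made up for the illustration and are not part of your program.

```c
#include <stddef.h>

/* Hypothetical helper: only the first variable carries a '*'. */
size_t second_size_one_star(void)
{
    char* a = NULL, b = 'x';    /* a is a pointer-to char, b is a plain char */
    (void)a;
    return sizeof b;            /* sizeof(char), which is always 1 */
}

/* Hypothetical helper: the '*' is repeated for each variable. */
size_t second_size_two_stars(void)
{
    char *a = NULL, *b = NULL;  /* both a and b are pointers-to char */
    (void)a;
    return sizeof b;            /* sizeof(char *), the size of a pointer */
}
```

The first function reports the size of a single char, the second the size of a pointer, which is why director_name in your struct could never hold a name.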
The Multi-Line Read
When you have to coordinate data from multiple lines, you must validate that you obtain the needed information for each line in a group BEFORE you consider input for that group valid. There are a number of approaches, but perhaps one of the more straight-forward is simply to use temporary variables to hold each input, and keep a counter that you increment each time a successful input is received. If you fill all your temporary variables, and your counter reflects the correct number of inputs, you can then allocate memory for each of the structs and copy the temporary variables to their permanent storage. You then reset your counter to zero, and repeat until you run out of lines in your file.
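As a rough sketch of that counter scheme, the function below takes one line at a time and either advances the counter or resets it. The names struct tmp_movie, feed_line(), MAXC and NFIELD are assumptions made for this illustration, and the sscanf() format strings are guessed from the sample lines in your question. The month is kept as text here; converting it to a number is a separate step.

```c
#include <stdio.h>

#define MAXC   128   /* assumed size of each temporary string buffer */
#define NFIELD   7   /* lines per movie; callers compare the counter to this */

/* Temporary storage for one seven-line group (hypothetical layout). */
struct tmp_movie {
    int id, day, year;
    char title[MAXC], surname[MAXC], name[MAXC], month[MAXC];
};

/* Match `line` against the field expected at position `good`.
 * Returns good + 1 on success, or 0 so the caller restarts the group. */
int feed_line(struct tmp_movie *t, int good, const char *line)
{
    if (good == 0 && sscanf(line, "Movie id: %d", &t->id) == 1)
        good++;
    else if (good == 1 && sscanf(line, "title: %127[^\n]", t->title) == 1)
        good++;
    else if (good == 2 &&
             sscanf(line, "surname of director: %127[^\n]", t->surname) == 1)
        good++;
    else if (good == 3 &&
             sscanf(line, "name of director: %127[^\n]", t->name) == 1)
        good++;
    else if (good == 4 && sscanf(line, "date: %d", &t->day) == 1)
        good++;
    else if (good == 5 && sscanf(line, "month: %127s", t->month) == 1)
        good++;
    else if (good == 6 && sscanf(line, "year: %d", &t->year) == 1)
        good++;
    else
        good = 0;    /* any mismatch invalidates the whole group */

    return good;
}
```

A caller would pass each line from fgets() to feed_line(), and only when the returned counter equals NFIELD copy the temporaries to permanent storage and set the counter back to zero.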
Most of your reads are straight-forward, with the exception being the month, which is read as a lower-case string for the given month and which you must then convert to int for storage in your struct date. Probably the easiest way to handle that is to create a lookup-table (e.g. a constant array of pointers to string-literals, one for each of the twelve months). Then after reading your month string you can loop over the array using strcmp() to map the index for that month to your struct (adding +1 to make, e.g., january month 1, february month 2, etc.). For example, you can use a constant array named months, where the macro NMONTHS is 12 for the number of elements in months.

Then for reading your file, your basic approach will be to read each line with fgets() and then parse the needed information from the line with sscanf(), validating every input, conversion and allocation along the way. Validation is key to any successful piece of code and especially crucial for multi-line reads with conversions.

For instance, given your structs, you could declare your additional needed constants, declare and initialize your temporary variables, open the file given as the first argument and validate that it is open for reading.
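Returning to the month lookup-table described above, a minimal sketch could look like the following. The names months and NMONTHS come from the description; the function month_to_int() and its return convention (0 for "not found") are assumptions made for the illustration.

```c
#include <string.h>

#define NMONTHS 12

/* Lookup-table: index + 1 is the month number. */
static const char *months[NMONTHS] = {
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december"
};

/* Hypothetical helper: return 1..12 for a lower-case month name,
 * or 0 if the name is not in the table. */
int month_to_int(const char *name)
{
    for (int i = 0; i < NMONTHS; i++)
        if (strcmp(name, months[i]) == 0)
            return i + 1;

    return 0;
}
```

Treating a return of 0 as a failed conversion lets the month line be validated the same way as every other line in the group.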
Your good variable will be your counter that you increment for each good read and conversion of data from each of the seven lines of data that make up your input blocks. When good == 7 you will have confirmed you have all the data associated with one movie and you can allocate and fill final storage with all the temporary values.

The used and avail counters track how many allocated struct movie you have available and, out of that, how many are used. When used == avail, you know it is time to realloc() your block of movies to add more. That's how dynamic allocation schemes work. You allocate some anticipated number of objects you need. You loop reading and filling objects until you fill up what you have allocated, then you reallocate more and keep going.

You can add as much additional memory each time as you want, but the general scheme is to double your allocated memory each time a reallocation is needed. That provides a good balance between the number of allocations required and the growth of the number of objects available.
(memory operations are relatively expensive, you want to avoid allocating for each new outer struct -- though allocation has gotten a bit better in extending rather than creating new and copying each time, using a scheme that allocates larger blocks will still be a more efficient approach in the end)
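The used/avail bookkeeping with doubling could be sketched as below. This is an illustration under assumptions: ensure_room() is a made-up helper name, AVAIL is taken to be 2 as the starting count, and the struct is trimmed to two members to keep the example short. The realloc() result goes to a temporary pointer first so a failure does not lose the existing block.

```c
#include <stdlib.h>

#define AVAIL 2   /* initial number of struct movie to allocate */

typedef struct movie {
    int id;
    char *title;
    /* director and release_date pointers omitted in this sketch */
} movie;

/* Hypothetical helper: make sure (*movies)[used] can be filled.
 * Returns 1 on success, 0 on allocation failure, in which case the
 * existing block and *avail are left untouched. */
int ensure_room(movie **movies, size_t used, size_t *avail)
{
    if (*movies == NULL) {                      /* first allocation */
        *movies = malloc(AVAIL * sizeof **movies);
        if (*movies == NULL)
            return 0;
        *avail = AVAIL;
    }
    else if (used == *avail) {                  /* block is full, double it */
        void *tmp = realloc(*movies, 2 * *avail * sizeof **movies);
        if (tmp == NULL)                        /* original block still valid */
            return 0;
        *movies = tmp;
        *avail *= 2;
    }
    return 1;
}
```

Starting from two elements, storing five movies triggers only two reallocations (to 4, then to 8), which is the point of doubling rather than growing by one each time.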
Now with your temporary variables and counter declared, you can start your multi-line read. Let's take the first id line as an example: you read the line and check if good == 0 to coordinate the read with the id line. You attempt a conversion to int and validate both. If you successfully store an integer in your temporary id, you increment your good counter.

Your read of the title line will be similar, except this time it will be an else if instead of a plain if, so the id line and the read of title from the next line form the first two branches of one if ... else if chain.

(note: any time you read a character string into any array with any of the scanf() family of functions, you must use the field-width modifier (e.g. 127 for a 128-character array) to limit the read to what your array can hold (+1 for '\0') to protect your array bounds from being overwritten. If you fail to include the field-width modifier, then use of the scanf() functions is no safer than using gets(). See: Why gets() is so dangerous it should never be used!)

With each line read and successfully converted and stored, good will be incremented to set up the read of the next line's values into the proper temporary variable.

Note I said you have a bit more work to do with the month read and conversion due to reading, e.g., "september", but needing to store the integer 9 in your struct. Using your lookup-table from the beginning, you would read and obtain the string for the month name and then loop to find the index in your lookup-table (you will want to add +1 to the index so that january == 1, and so on).

After your last else if for the year, you include an else so that any failure of any one line in the block will reset good = 0; so it will attempt to read and match the next id line in the file.

Dynamic Allocation
Dynamic allocation for your nested structs isn't hard, but you must keep clear in your mind how you will approach it. Your outer struct, struct movie, is the one you will allocate and reallocate using used == avail, etc. You will have to allocate for struct date and struct director_info each time you have all seven of your temporary variables filled, validated and ready to be put in final storage. You would start your allocation block by checking if your struct movie block has been allocated yet, and if not, allocate it. If it has, and used == avail, you reallocate it.

Now every time you realloc() you use a temporary pointer, so when (not if) realloc() fails returning NULL, you don't lose your pointer to the currently allocated storage by overwriting it with the NULL returned, which would create a memory-leak. That is the initial handling of allocating or reallocating for your struct movie.

Now you have a valid block of struct movie where you can directly store an id, allocate for the title, and assign the allocated block holding the title to your title pointer in each struct movie worth of storage. We allocate two struct movie to begin with, so when you start, used == 0 and avail == 2 (the 2 comes from the AVAIL constant). Handling id and allocating for title comes next.

(note: when you declare multiple structs in a block of memory and use [..] to index each struct, the [..] acts as a dereference of the pointer, so you use the '.' operator to access the member following the [..], not the '->' operator as you normally would to dereference a struct pointer to access the member (the dereference is already done by [..]))

Also, since you already know the length (len) from strlen(), there is no reason to use strcpy() to copy tmptitle to movies[used].title and have strcpy() scan the string looking for the nul-terminating character at the end. You already know the number of characters, so just use memcpy() to copy len + 1 bytes. (note: if you have strdup() you can allocate and copy in a single call, but note strdup() isn't part of the standard C library as of C11.)

The allocation for your struct director_info for each struct movie element is straight-forward. You allocate the struct director_info, then use strlen() to get the length of the names, and then allocate storage for each and memcpy() as was done for the title.

Handling allocation and filling the new struct date is even easier. You just allocate and assign 3 integer values and then assign the address for the allocated struct date to the pointer in your struct movie.

That's it, you increment used++ when you assign the last pointer in your struct movie so you are set up to fill the next element in that block with the next seven lines from the file. You reset good = 0; to ready the read loop to read the next id line from the file.

Putting It All Together
If you fill in the pieces, putting the code all together, you would end up with something similar to:
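As one piece of that, the allocation and clean-up steps described above could be sketched as follows. The structs are the ones from your question with the '*' repeated for both director names. freemovies() is named in this answer, but its signature here is an assumption, and copystr(), freemembers() and fillmovie() are hypothetical helpers introduced only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct date { int day, month, year; } date;

typedef struct director_info {
    char *director_surname, *director_name;    /* '*' on each name */
} director_info;

typedef struct movie {
    int id;
    char *title;
    director_info *director;
    date *release_date;
} movie;

/* Allocate len + 1 bytes and memcpy() the string, nul-terminator included. */
static char *copystr(const char *s)
{
    size_t len = strlen(s);
    char *p = malloc(len + 1);
    if (p)
        memcpy(p, s, len + 1);
    return p;
}

/* Free everything one element owns; safe on a partially filled element. */
static void freemembers(movie *m)
{
    if (m->director) {
        free(m->director->director_surname);
        free(m->director->director_name);
        free(m->director);
    }
    free(m->release_date);
    free(m->title);
}

/* Fill one element from validated temporaries.
 * Returns 1 on success, 0 if any allocation failed (nothing is leaked,
 * and the element must not be counted as used). */
int fillmovie(movie *m, int id, const char *title, const char *surname,
              const char *name, int day, int month, int year)
{
    m->id = id;
    m->title = copystr(title);
    m->director = malloc(sizeof *m->director);
    m->release_date = malloc(sizeof *m->release_date);
    if (m->director) {
        m->director->director_surname = copystr(surname);
        m->director->director_name = copystr(name);
    }
    if (!m->title || !m->release_date || !m->director ||
        !m->director->director_surname || !m->director->director_name) {
        freemembers(m);
        return 0;
    }
    m->release_date->day = day;
    m->release_date->month = month;
    m->release_date->year = year;
    return 1;
}

/* Free the first `used` elements and then the block itself. */
void freemovies(movie *movies, size_t used)
{
    for (size_t i = 0; i < used; i++)
        freemembers(&movies[i]);
    free(movies);
}
```

In the read loop you would call something like fillmovie(&movies[used], ...) once good == 7, increment used only when it returns 1, and call freemovies(movies, used) before the program exits.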
(note: the addition of prnmovies() to output all stored movies and freemovies() to free all allocated memory)

Example Input File
Rather than just one block of seven lines for one movie, let's add another to make sure the code will loop through a file, e.g.
Example Use/Output
Processing the input file with two movies worth of data in the filename dat/moviegroups.txt you would have:

Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.

Always confirm that you have freed all memory you have allocated and that there are no memory errors.
There is a LOT of information in this answer (and it always turns out longer than I anticipated), but to give a fair explanation of what is going on takes a little time. Go through it slowly, understand what each bit of code is doing and understand how the allocations are handled (it will take time to digest). If you get stuck, drop a comment and I'm happy to explain further.