/* The following codes are compiled into library(test.a) */
typedef struct {
short x[5];
} weight_t;
typedef struct {
weight_t wgt;
char *ptr;
} myarray_t;
/* huge static array and 70% of entries are empty/zero */
static myarray_t array1[] ={
{ { 0x0102, 0x0102, 0x0102, 0x0102, 0x0102, }, 0 }, /* 0000 */
{ { 0x0103, 0x0103, 0x0103, 0x0103, 0x0103, }, 0 }, /* 0001 */
{ { 0x0104, 0x0104, 0x0104, 0x0104, 0x0104, }, 0 }, /* 0002 */
{ { 0x0105, 0x0105, 0x0105, 0x0105, 0x0105, }, 0 }, /* 0003 */
{ { 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, }, 0 }, /* 3134c */
{ { 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, }, 0 }, /* 3134d */
......
......
.....
} /* more than one millon entries */
Since array1 contains more than one million entries(even though 70% of them are empty/zero entries), the size of library file(test.a) is quite big.
I know using hash table to only include non-empty entries can reduce the size of array1 but is that any way to compress or do size optimization for library file(test.a) without changing array1
?
Since array1
has been referred by other libraries, revamping array1 to use other implementation might not be a solution for my current situation.
This is a followup to compress or size optimization for large sparse array in C which is basically asking the same thing, for file-size saving without changing the in-memory data structure. This one is tagged compiler-optimization, the other one isn't. (Editor's note: I don't think that makes them different enough to be non-duplicates; closing this unless we get clarification of what the difference in this question was supposed to be.)
One obvious size optimization is to get rid of struct padding. You could have padding between the 2x5 byte array and the pointer, and/or padding at the end of the struct. Likely you are either using 2x5 + 4 = 14 bytes or 2x5 + 8 = 18 bytes of actual data. Neither sits well with 4 or 8 byte alignent.
Therefore drop the struct:
This will remove all padding present in your current, inefficient struct. You'll save some 2-6 bytes per item, meaning some 2-6 Mb waste of space for a million items.
Example for gcc x86_64 Linux:
Output:
The change saved 6M bytes on this specific system. If you are stuck with the current struct for some backwards compatibility reasons, consider using some wrapper macros to emulate struct-like access while using raw arrays.