For some background, I am implementing a compiler using the llvmpy library, which is a Python wrapper around LLVM's IR generation API.
I have created a character type which represents one or more UTF-8 code points. These code points are stored in an array so a character can be one of the following arrays:
[1 x i32], [2 x i32], ..., [6 x i32]
Now, I would like to implement a string type. This would be an array of pointers to arrays:
[n x [1-6 x i32]*] where n is the string length
However, as far as I know, LLVM requires me to declare the length of the inner array. So, while I can store this:
[[1 x i32], [1 x i32], [1 x i32]]
I cannot store this:
[[1 x i32], [2 x i32]]
Is there a way to store an array of array pointers if the array pointers lead to arrays of different length?
Much like in C, LLVM IR requires all the elements of an array to be of the same type.
I guess the simplest way to work around this is to store some arbitrary pointer type (e.g. i32*) and perform bitcasts whenever you want to access the array, though that of course assumes that you know in advance the size of the internal array at each index.
If it's only known at run-time, you can make each array element point to a { i32, i32* } struct which holds the size of the internal array as well as a pointer to it, and then switch on that size and bitcast accordingly in each branch target. Alternatively, just calculate the size at run-time from the i32* pointer, which is easy as this is UTF-8: the first byte of a sequence encodes its length.
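Since the same restriction exists in C, the size-plus-pointer idea can be sketched there before translating it to IR. This is only an illustration, not llvmpy code: the names CharRef, utf8_len_from_lead, char_ref and total_units are made up for this example, and it assumes each i32 element holds one byte of the (original, up to 6-byte) UTF-8 encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C analogue of the LLVM struct { i32, i32* }: a length plus a pointer to the
   first element, so every slot of the string array has the same type. */
typedef struct {
    int32_t len;          /* number of i32 elements in this character (1..6) */
    const int32_t *data;  /* points at the first element of the inner array */
} CharRef;

/* Sequence length derived from the lead byte alone.
   Assumption: each i32 element stores a single UTF-8 byte.
   Returns 0 for a continuation byte (10xxxxxx) and for 0xFE/0xFF.
   Overlong or otherwise ill-formed sequences are not detected here. */
int32_t utf8_len_from_lead(int32_t lead)
{
    uint32_t b = (uint32_t)lead & 0xFFu;
    if (b < 0x80u) return 1;            /* 0xxxxxxx */
    if ((b & 0xE0u) == 0xC0u) return 2; /* 110xxxxx */
    if ((b & 0xF0u) == 0xE0u) return 3; /* 1110xxxx */
    if ((b & 0xF8u) == 0xF0u) return 4; /* 11110xxx */
    if ((b & 0xFCu) == 0xF8u) return 5; /* 111110xx */
    if ((b & 0xFEu) == 0xFCu) return 6; /* 1111110x */
    return 0;
}

/* Build a slot from a bare pointer, recovering the length at run time. */
CharRef char_ref(const int32_t *data)
{
    CharRef c = { utf8_len_from_lead(data[0]), data };
    return c;
}

/* Total number of i32 elements across all characters of a string. */
int32_t total_units(const CharRef *str, size_t n)
{
    int32_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        assert(str[i].len >= 1 && str[i].len <= 6);
        total += str[i].len;
    }
    return total;
}
```

A string is then a plain array of CharRef, whatever the individual character lengths are. If the length is always recomputed from the lead byte, the len field can be dropped and the string becomes an array of bare i32* pointers, which is the first workaround described above.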