More non-standard ways of allocating arbitrary space on the stack


Currently, I am using gcc, which allows the usage of a C variable length array in a C++ constexpr function (do not ask me why). This is, of course, not a standard behaviour, but results in the desired effect.

What I am searching for are more ways to emulate the desired behaviour in other compilers. As a standard fallback, I have included an option using unique_ptr, thus emulating gcc's behaviour on platforms that do not support variable-length stack allocation in constexpr functions.

Here is what I got so far:


class Integer {
private:
    int i;

public:
    constexpr
    Integer(const int& _t = 10)
    : i(_t)
    { }

    constexpr int
    get(void) const {
        return i;
    }
};

#if defined(__GNUC__) && !defined(__clang__)
#define XALLOC(type, name, count) type name[count]
#else
#include <memory>
#define XALLOC(type, name, count) std::unique_ptr<type[]> __##name(new type[count]); type* name = &__##name[0]
#endif

constexpr
int t(int num) {
    XALLOC(Integer, i, num);
    return i[0].get();
}

constinit int nu = t(10);

int main(int argc, const char *argv[]) {
    return t(0);
}

There are 2 best solutions below

13
On

I don't know if this will work in a constexpr function (haven't tried), but on most *nix platforms, you can allocate an arbitrary amount of space on the stack (well, up to a point...) with alloca, e.g.

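#include <alloca.h>

void foo ()
{
    // alloca takes a size in bytes, so this reserves room for 42 ints
    int *my_stuff = (int *) alloca (42 * sizeof (int));
    // Do things with my_stuff, it will go away automatically when `foo` returns
}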

A C-style cast is used for brevity. The MSVC equivalent is _alloca, which you can handle with a simple, conditionally defined macro.
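The conditionally defined macro could look something like the sketch below. `STACK_ALLOC` and `sum_first_odds` are made-up names for illustration, and the header choice is an assumption: `<alloca.h>` is where Linux and macOS declare `alloca`, while some other platforms declare it elsewhere (for example in `<stdlib.h>`).

```cpp
#include <cstddef>

#if defined(_MSC_VER)
#include <malloc.h>                          // MSVC declares _alloca here
#define STACK_ALLOC(bytes) _alloca(bytes)
#else
#include <alloca.h>                          // assumption: Linux/macOS style header
#define STACK_ALLOC(bytes) alloca(bytes)
#endif

// Hypothetical example: sums the first `count` odd numbers using a
// scratch buffer that lives on the stack.
int sum_first_odds(std::size_t count) {
    // The size is given in bytes, so scale by the element size.
    int* scratch = static_cast<int*>(STACK_ALLOC(count * sizeof(int)));
    for (std::size_t k = 0; k < count; ++k) {
        scratch[k] = static_cast<int>(2 * k + 1);
    }
    int total = 0;
    for (std::size_t k = 0; k < count; ++k) {
        total += scratch[k];
    }
    return total;  // scratch is released when this function returns
}
```

Note that the macro has to expand inside the function that should own the memory; wrapping the call in a helper function would release the block as soon as that helper returns.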

Please note that the default stack size on Windows is only 1 MB (although the /STACK linker option can increase it), so mind how you go. On Linux the default is typically 8 MB, and that can be changed as well (for example with ulimit -s); in 64-bit builds, there's no real reason not to raise it.

VLAs are non-standard in C++ and a little bit evil; do not use them.


An early Christmas present for @Red.Wave, this time not a poisoned chalice (sorry about that):

template <class T> class StackAllocWrapper
{
public:
    StackAllocWrapper (int n, T *data) : n (n), data (data) { }
    // ... access methods ...

private:
    int n = 0;
    T *data = nullptr;
};

// alloca takes a size in bytes, so scale the element count by sizeof (T)
#define ALLOCA_SANITISER(T, n) StackAllocWrapper <T> (n, (T *) alloca ((n) * sizeof (T)))

void foo ()
{
    auto w = ALLOCA_SANITISER (int, 42);
    // Do things with w, the allocation will go away automatically when `foo` returns
    (void) w;
}
0
On

You can have different behaviour during constant evaluation with if consteval. For example, you can use new/unique_ptr during constant evaluation and a VLA/alloca during non-constant evaluation.

constexpr
int t(int num) {
    if consteval {  // (If not C++23, change to `if (std::is_constant_evaluated())`)
        std::unique_ptr<Integer[]> i = std::make_unique_for_overwrite<Integer[]>(num);
        return i[0].get();
    } else {
        __extension__ Integer i[num];
        // To avoid duplication, if the body of the function was longer
        // you could extract it to a lambda or a second function that takes a pointer
        return i[0].get();
    }
}

Since you say in your comments that this is for a library, you might be able to make the user of your library allocate the memory (which they could do as they wish, based on their constraints and use case):

constexpr int t(std::span<Integer> i) {
    int num = i.size();

    return i[0].get();
}

// Possible uses:
static Integer large_array[MAX_SIZE];
t(std::span(large_array).subspan(0, num));

Integer vla[num];
t(std::span(vla+0, vla+num));

t(std::span(std::make_unique<Integer[]>(num).get(), num));
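If std::span (C++20) is not available on one of the target toolchains, the same caller-allocates idea can be approximated with a pointer and a length. This is only a sketch of that variation, not the code above: `first_value`, `kStaticBuffer` and the simplified `Integer` are stand-ins, and returning 0 for an empty buffer is an assumption made here for illustration.

```cpp
#include <cstddef>

// Simplified stand-in for the Integer class from the question.
class Integer {
    int i;

public:
    constexpr Integer(int value = 10) : i(value) {}
    constexpr int get() const { return i; }
};

// The caller owns the storage; the library only sees a pointer and a count.
// Returning 0 for an empty buffer is an assumption for this sketch.
constexpr int first_value(const Integer* data, std::size_t count) {
    return count > 0 ? data[0].get() : 0;
}

// Works in constant evaluation with a buffer that has static storage...
constexpr Integer kStaticBuffer[3] = {Integer(7), Integer(8), Integer(9)};
static_assert(first_value(kStaticBuffer, 3) == 7,
              "usable during constant evaluation");
```

At run time the caller can pass a static array, a VLA, an alloca'd block or heap memory in exactly the same way, e.g. `Integer buf[4]; first_value(buf, 4);`.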

Here's a class that takes care of conditionally delete-ing during constant evaluation (and a macro that conditionally news or allocas):

#include <cstddef>
#include <type_traits>
#include <algorithm>
#include <new>

template<typename T>
struct alloca_ptr {
    static_assert(std::is_trivially_destructible_v<T>, "Destructor will not be called when not in a constant evaluation");

    // Moving a stack allocation is probably a bug
    constexpr alloca_ptr(alloca_ptr&&) = delete;
    alloca_ptr& operator=(alloca_ptr&&) = delete;

    struct use_XALLOC_macro {  // (An internal type prevents accidentally constructing this type without the macro)
        T* ptr;
        constexpr explicit use_XALLOC_macro(T* ptr) : ptr(ptr) {}
        use_XALLOC_macro(const use_XALLOC_macro&) = delete;
    };

    [[nodiscard]] explicit constexpr alloca_ptr(use_XALLOC_macro allocation) noexcept : ptr(allocation.ptr) {}

    using element_type = T;
    using pointer = T*;
    constexpr pointer get() const noexcept { return ptr; }
    constexpr explicit operator bool() const noexcept { return ptr != nullptr; }
    constexpr element_type& operator[](std::size_t index) const noexcept { return ptr[index]; }

    constexpr ~alloca_ptr() {
        if consteval {
            ::delete[] ptr;
        }
    }

private:
    T* ptr;
};

// Unfortunately, `count` is evaluated twice so make sure it has no side effects
#define XALLOC(type, count) \
    alloca_ptr<type>(alloca_ptr<type>::use_XALLOC_macro( \
        std::is_constant_evaluated() ? ::new std::type_identity_t<type>[count] : \
        ::new ( \
            alignof(type) <= __BIGGEST_ALIGNMENT__ ? \
            __builtin_alloca(sizeof(type) * static_cast<std::size_t>(count)) : \
            /* __builtin_alloca_with_align takes its alignment in bits, not bytes */ \
            __builtin_alloca_with_align(sizeof(type) * static_cast<std::size_t>(count), __CHAR_BIT__ * std::max(std::size_t{__BIGGEST_ALIGNMENT__}, alignof(type))) \
        ) std::type_identity_t<type>[count]))
constexpr
int t(int num) {
    auto i = XALLOC(Integer, num);
    return i[0].get();
}

This works on clang and GCC at least. You can extend this to compilers that do not have __builtin_alloca by checking __has_builtin(__builtin_alloca) and changing XALLOC to result in a unique_ptr<T[]> always, or replacing it with that compiler's spelling of alloca.
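The detection part could be sketched roughly as follows. This is a simplified illustration for plain int buffers rather than a drop-in replacement for the XALLOC above; `SCRATCH_INTS` and `sum_of_squares` are hypothetical names, and defining `__has_builtin` to 0 when the compiler lacks it is the usual idiom but means such compilers always take the heap path.

```cpp
#include <cstddef>
#include <memory>

#ifndef __has_builtin
#define __has_builtin(x) 0   // compiler without __has_builtin: assume no builtins
#endif

#if __has_builtin(__builtin_alloca)
// Stack path: the memory is released when the enclosing function returns.
#define SCRATCH_INTS(name, count) \
    int* name = static_cast<int*>(__builtin_alloca(sizeof(int) * (count)))
#else
// Fallback path: heap memory owned by a unique_ptr with the same lifetime.
#define SCRATCH_INTS(name, count) \
    std::unique_ptr<int[]> name##_owner(new int[count]); \
    int* name = name##_owner.get()
#endif

// Hypothetical user of the macro: the body is identical on both paths.
int sum_of_squares(std::size_t count) {
    SCRATCH_INTS(buf, count);
    for (std::size_t k = 0; k < count; ++k) {
        buf[k] = static_cast<int>(k * k);
    }
    int total = 0;
    for (std::size_t k = 0; k < count; ++k) {
        total += buf[k];
    }
    return total;
}
```

As with the macro above, `count` is evaluated more than once on some paths, so it should not have side effects.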

Note that alloca has a slightly different lifetime than a VLA. You don't want to use it in a loop: while a VLA dies at the end of each iteration and its space is reused, memory from alloca isn't freed until the current stack frame is popped (when the function that called alloca returns).
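One way to live with that, sketched here under the assumption of a platform that declares alloca in `<alloca.h>`, is to move the per-iteration work into its own function so that each allocation is popped when that call returns. `checksum_one_row` and `checksum_all_rows` are hypothetical names for illustration.

```cpp
#include <alloca.h>   // assumption: POSIX-like platform that provides this header
#include <cstddef>

// Each call gets its own stack frame, so the alloca'd row is released on
// return instead of piling up across loop iterations in the caller.
static int checksum_one_row(std::size_t width, int seed) {
    int* row = static_cast<int*>(alloca(width * sizeof(int)));
    for (std::size_t k = 0; k < width; ++k) {
        row[k] = seed + static_cast<int>(k);
    }
    int sum = 0;
    for (std::size_t k = 0; k < width; ++k) {
        sum += row[k];
    }
    return sum;
}

int checksum_all_rows(std::size_t rows, std::size_t width) {
    int total = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        // Calling alloca directly in this loop would keep every row alive
        // until checksum_all_rows itself returns.
        total += checksum_one_row(width, static_cast<int>(r));
    }
    return total;
}
```

If the buffer size does not change between iterations, allocating once before the loop and reusing the block is the other obvious option.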