How to control the size of a class to be a multiple of the size of a member?


I have the following class,

struct employee {
    std::string name;
    short salary;
    std::size_t age;
};

Just as an example, in Linux amd64, the size of the struct is 48 bytes, and the size of std::string is 32, that is, not a multiple.

Now, I need, in a cross-platform way, for employee to have a size that is a multiple of the size of std::string (first member).

(Cross-platform could mean, for example, both Linux amd64 and Apple ARM.)

That is, sizeof(employee) % sizeof(std::string) == 0.

I tried controlling the padding using alignas for the whole class or the members, but the requirement to be a power of 2 is too restrictive, it seems.

Then I tried adding a char array at the end. That leaves two problems: first, computing the exact size of the array at compile time on each platform; second, avoiding an extra member that spoils the nice aggregate initialization of the class.

For the first, I do this:

struct employee_dummy {
    std::string name;
    short salary;
    std::size_t age;
};

struct employee {
    std::string name;
    short salary;
    std::size_t age;
    char padding[(sizeof(employee_dummy)/sizeof(std::string)+1)*sizeof(std::string) - sizeof(employee_dummy)];
};

Note the ugly dummy class, and I don't even know if the logic is correct.
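One way to check the logic is to pull the arithmetic out into a constexpr helper and test it with static_assert. This is only a sketch, and the names op_padding and round_up_padding are hypothetical. It suggests that the formula above always produces a multiple, but adds a whole extra sizeof(std::string) when the size is already one.

```cpp
#include <cstddef>

// The formula from the question: always adds between 1 and `unit` bytes.
constexpr std::size_t op_padding(std::size_t size, std::size_t unit) {
    return (size / unit + 1) * unit - size;
}

// Round-up variant: adds 0 bytes when `size` is already a multiple of `unit`.
constexpr std::size_t round_up_padding(std::size_t size, std::size_t unit) {
    return (unit - size % unit) % unit;
}

static_assert(op_padding(48, 32) == 16);        // 48 + 16 == 64
static_assert(round_up_padding(48, 32) == 16);  // same answer here
static_assert(op_padding(64, 32) == 32);        // already a multiple, still grows to 96
static_assert(round_up_padding(64, 32) == 0);   // would need a zero-padding special case
```

The round-up variant can return 0, and a zero-length array member is not valid standard C++, so it cannot be dropped into the array bound as is.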

For the second problem, I don't have a solution. I could do this, but then I would need to add a constructor, the class would not be an aggregate, etc.

struct employee {
    std::string name;
    short salary;
    std::size_t age;
 private:
    char padding[(sizeof(employee_dummy)/sizeof(std::string)+1)*sizeof(std::string) - sizeof(employee_dummy)];
};

How can I control the size of the struct with standard or non-standard mechanisms and keep the class as an aggregate?

Here is a link to play with this problem empirically: https://cppinsights.io/s/f2fb5239


NOTE ADDED:

I realized that, if the technique to add padding is correct, the calculation is even more difficult because the dummy class might be already adding padding, so I have to take into account the offset of the last element instead.

In this example I want the size of data to be a multiple of the size of its first member (std::complex<double>):

struct dummy {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
};

struct data {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
    char padding[ ((offsetof(dummy, c) + sizeof(c)) / sizeof(std::complex<double>) + 1)* sizeof(std::complex<double>) - (offsetof(dummy, c) + sizeof(c)) ];
};

Note the formula is even worse now.
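For what it is worth, the same quantity can be written a little more compactly. The sketch below assumes the usual layout where the char array starts right after the last member; payload_end, unit and pad are hypothetical names. Note that unit - payload_end % unit is algebraically the same as the longer expression above, so it still adds a full unit when the payload already ends on a boundary.

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>

struct dummy {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
};

// End of the last member, before any tail padding the compiler adds.
constexpr std::size_t payload_end = offsetof(dummy, c) + sizeof(int);
constexpr std::size_t unit = sizeof(std::complex<double>);
// Between 1 and `unit` bytes, so the array bound is never zero.
constexpr std::size_t pad = unit - payload_end % unit;

struct data {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
    char padding[pad];
};

static_assert(payload_end <= sizeof(dummy));  // the difference is dummy's tail padding
static_assert(sizeof(data) % unit == 0);
```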

There are 4 best solutions below

n. m. could be an AI On BEST ANSWER

Here is a standard-compliant, no ifs or buts, version.

template <template<std::size_t> class tmpl, std::size_t need_multiple_of>
struct adjust_padding
{
    template <std::size_t n>
    static constexpr std::size_t padding_size()
    {
        if constexpr (sizeof(tmpl<n>) % need_multiple_of == 0) return n;
        else return padding_size<n+1>();
    }

    using type = tmpl<padding_size<0>()>;
};

Use it like this:

template <std::size_t K>
struct need_strided
{
    double x;
    const char pad[K];
};

template <>
struct need_strided<0>
{
    double x;
};

using strided = adjust_padding<need_strided, 47>::type;

Now strided has a size that is a multiple of 47 (and of course is aligned correctly). On my computer it is 376.

You can make employee a template in this fashion:

template <std::size_t K>
struct employee { ...

or make it a member of a template (instead of double x):

template <std::size_t K>
struct employee_wrapper { 
   employee e;
   

and then use employee_wrapper as a vector element. But provide a specialization for 0 either way.
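A filled-in version of the wrapper variant might look like the sketch below. The body of employee_wrapper and the alias strided_employee are my own guesses at what is intended, not part of the answer; adjust_padding is repeated only so that the snippet compiles on its own.

```cpp
#include <cstddef>
#include <string>

struct employee {
    std::string name;
    short salary;
    std::size_t age;
};

// adjust_padding as shown above, repeated so the sketch is self-contained.
template <template <std::size_t> class tmpl, std::size_t need_multiple_of>
struct adjust_padding {
    template <std::size_t n>
    static constexpr std::size_t padding_size() {
        if constexpr (sizeof(tmpl<n>) % need_multiple_of == 0) return n;
        else return padding_size<n + 1>();
    }
    using type = tmpl<padding_size<0>()>;
};

// Hypothetical wrapper: K bytes of padding after the payload.
template <std::size_t K>
struct employee_wrapper {
    employee e;
    char pad[K];
};

// K == 0 needs its own definition because char pad[0] is not standard C++.
template <>
struct employee_wrapper<0> {
    employee e;
};

using strided_employee = adjust_padding<employee_wrapper, sizeof(std::string)>::type;

static_assert(sizeof(strided_employee) % sizeof(std::string) == 0);

// The wrapper is still an aggregate; the inner braces initialize `e`.
inline strided_employee make_example() {
    return strided_employee{{"John Doe", 100, 30}};
}
```

The price of the wrapper is one extra level of braces and an extra .e when accessing members.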

You can try using std::array instead of a C-style array and avoid providing a specialization for 0, but it may or may not get optimized out when the size is 0. [[no_unique_address]] (C++20) may help here.

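Note that something like adjust_padding<need_strided, 117>::type may exceed your compiler's default template instantiation or constexpr evaluation depth, because padding_size recurses once per byte of padding (GCC and Clang let you raise the template limit with -ftemplate-depth).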
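If the recursion depth is a concern, the padding can probably be computed in closed form instead. The sketch below is an alternative, not the method above. It assumes the padding array is laid out directly after the payload and that the struct size is rounded up to alignof(T), which holds on the mainstream ABIs but is not promised by the standard, so the static_assert is what actually guards the result. The names pad_bytes, with_pad and padded are hypothetical.

```cpp
#include <cstddef>
#include <numeric>  // std::lcm (C++17)

// Bytes to add so that sizeof(T) + bytes is a multiple of both M and alignof(T).
template <class T, std::size_t M>
constexpr std::size_t pad_bytes() {
    constexpr std::size_t unit = std::lcm(alignof(T), M);
    return (unit - sizeof(T) % unit) % unit;
}

template <class T, std::size_t K>
struct with_pad {
    T value;
    char pad[K];
};

// No array at all when nothing needs to be added.
template <class T>
struct with_pad<T, 0> {
    T value;
};

template <class T, std::size_t M>
using padded = with_pad<T, pad_bytes<T, M>()>;

static_assert(sizeof(padded<double, 117>) % 117 == 0);
static_assert(sizeof(padded<double, 8>) == sizeof(double));
```

No recursion is involved, so the size of the multiple no longer matters for compile-time limits.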

VonC On

Given the context provided by the comments, the OP's primary goal is to create a struct where the size is a multiple of the size of std::string to facilitate treating an array of such structs as an array of std::string with a certain stride for legacy compatibility reasons.

The C++ standard does not define the layout of memory in such a way that you could safely stride through an array of objects as if it were an array of one of its members.
That is due to padding, alignment requirements, and potential undefined behavior when accessing out-of-bounds memory or creating pointers that do not point to the beginning of an object.

I tried this JDoodle SizeAdjuster project as a simpler approach:

#include <iostream>
#include <string>
#include <type_traits>

// Original employee struct with `salary` as an int
struct employee {
    std::string name;
    int salary;  // Changed from short to int
    std::size_t age;
};


// SizeAdjuster to make the size of a struct a multiple of the size of a member
template <typename T, typename MemberT>
struct SizeAdjuster {
    T data;
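    // Padding needed to round sizeof(T) up to a multiple of sizeof(MemberT).
    // (Unsigned negation, (-sizeof(T)) % sizeof(MemberT), is only correct when
    // sizeof(MemberT) is a power of two.) The length is 0 when sizeof(T) is already
    // a multiple; a zero-length array is a GCC/Clang extension, not standard C++.
    char padding[(sizeof(MemberT) - sizeof(T) % sizeof(MemberT)) % sizeof(MemberT)];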
};

// Adjusted employee struct with padding
using employee_adjusted = SizeAdjuster<employee, std::string>;

int main() {
    // Create an employee object with adjusted size, salary is now an int, so 50000 is valid.
    employee_adjusted emp_adj = {{"John Doe", 50000, 30}};

    // Access and display employee data
    std::cout << "Name: " << emp_adj.data.name << std::endl;
    std::cout << "Salary: " << emp_adj.data.salary << std::endl;
    std::cout << "Age: " << emp_adj.data.age << std::endl;

    // Display the size of the adjusted employee object
    std::cout << "Size of employee: " << sizeof(employee) << std::endl;
    std::cout << "Size of std::string: " << sizeof(std::string) << std::endl;
    std::cout << "Size of employee_adjusted: " << sizeof(employee_adjusted) << std::endl;

    // Check if the size of employee_adjusted is a multiple of the size of std::string
    std::cout << "Is size of employee_adjusted a multiple of the size of std::string? "
              << (sizeof(employee_adjusted) % sizeof(std::string) == 0 ? "Yes" : "No") << std::endl;

    return 0;
}

This computes the padding at compile time, but it does not handle the case where sizeof(T) is already a multiple of sizeof(MemberT): the padding array would then have length zero, which standard C++ does not allow (GCC and Clang accept it as an extension). Even when the padding is non-zero, striding through an array of employee_adjusted objects as if it were an array of std::string objects may still invoke undefined behavior.

The goal of the SizeAdjuster is to make sure the size of the employee_adjusted struct is a multiple of the size of std::string. In the given output:

Size of employee: 48
Size of std::string: 32
Size of employee_adjusted: 64
Is size of employee_adjusted a multiple of the size of std::string? Yes

That indicates that the original employee struct has a size of 48 bytes, and the size of std::string is 32 bytes.
The SizeAdjuster has done its job by adding 16 bytes of padding to the employee struct, bringing the total to 64 bytes. Since 64 is a multiple of 32, the condition sizeof(employee_adjusted) % sizeof(std::string) == 0 is true, which is what we wanted to achieve.
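The arithmetic behind those numbers can be checked in isolation. The helper names below are hypothetical, and std::uint64_t stands in for a 64-bit std::size_t. The second helper shows a caveat with shortcuts that rely on unsigned negation: they agree with the round-up formula only when the unit divides 2^64, for example 32, and not for a 24-byte std::string.

```cpp
#include <cstdint>

// Bytes needed to round `size` up to the next multiple of `unit`.
constexpr std::uint64_t round_up_padding(std::uint64_t size, std::uint64_t unit) {
    return (unit - size % unit) % unit;
}

// Unsigned negation computes (2^64 - size) % unit instead.
constexpr std::uint64_t negate_padding(std::uint64_t size, std::uint64_t unit) {
    return (0 - size) % unit;
}

static_assert(round_up_padding(48, 32) == 16);  // 48 + 16 == 64, as in the output above
static_assert(negate_padding(48, 32) == 16);    // agrees: 32 divides 2^64
static_assert(round_up_padding(48, 24) == 0);   // 48 is already a multiple of 24
static_assert(negate_padding(48, 24) == 16);    // disagrees: 24 does not divide 2^64
```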

The purpose of making the size of employee_adjusted a multiple of the size of std::string is to align the memory layout in such a way that when you have an array of employee_adjusted objects, you could theoretically access the name member directly at consistent strides. In the OP's context, this is intended for compatibility with legacy systems or frameworks that expect such memory alignment.

However, it is important to reiterate that while we can make sure the size of the struct is a multiple of the size of std::string, accessing the memory as if it were an array of std::string is not guaranteed to be safe or standard-compliant. The standard does not support treating an array of employee_adjusted as an array of std::string due to potential strict aliasing violations and alignment issues.

In practice, the SizeAdjuster project demonstrates how to calculate and apply padding to achieve a size requirement, but it does not endorse or implement unsafe memory access patterns.
The proper way to access the name member of each employee_adjusted in an array would still be through the employee object itself, not by treating the memory as an array of std::string.
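As a minimal sketch of that access pattern (names_of is a hypothetical helper, and a std::vector stands in for whatever container the legacy code uses), each name is reached through its owning object:

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct employee {
    std::string name;
    int salary;
    std::size_t age;
};

// Collect every name by going through the owning object, never by
// reinterpreting the array's storage as std::string objects.
inline std::vector<std::string> names_of(const std::vector<employee>& staff) {
    std::vector<std::string> names;
    names.reserve(staff.size());
    for (const employee& e : staff) {
        names.push_back(e.name);
    }
    return names;
}
```

This copies the strings; code that only needs to read them could collect pointers or references to e.name instead, still without any pointer arithmetic across objects.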

The output illustrates that the size adjustment has been achieved, but it does not illustrate safe or standard-compliant striding access, which is beyond the capabilities of size adjustment alone.

Sedenion On

There is no need to make employee a template or to list its members twice. TL;DR: You can get down to succinct code such as:

#pragma pack(1) // Prevent padding at the end.
struct employee_base {
    std::complex<double> a;
};

using employee = PaddingHelper<employee_base>;

Explanation follows.


The core idea is that (since C++17) you can simply derive from the type that contains the actual payload, and keep the aggregate initialization syntax (live on godbolt):

#pragma pack(1) // Prevent padding at the end of 'dummy'.
struct dummy {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    bool c;
};

struct employee : dummy {
    char padding[
        (sizeof(dummy)/sizeof(decltype(dummy::a))+1)*sizeof(dummy::a) 
        - sizeof(dummy)]
        = {0};
};

static_assert(sizeof(employee) % sizeof(decltype(dummy::a)) == 0);

int main(){
    employee e{{41, 42, 43, false}};
}

The #pragma pack is of course non-standard, but gcc, clang and MSVC support it. We need it to ensure that we have control of the padding ourselves.
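One thing to keep in mind is that a plain #pragma pack(1) stays in effect for every later definition in the translation unit. A possible variation, sketched below with hypothetical struct names, scopes it with the push and pop forms, which the same three compilers accept:

```cpp
struct unpacked_pair {
    double x;
    bool flag;
};

#pragma pack(push, 1)  // non-standard, but accepted by GCC, Clang and MSVC
struct packed_pair {
    double x;
    bool flag;
};
#pragma pack(pop)      // restore the previous packing for everything below

struct after_pop {
    double x;
    bool flag;
};

// Packed: no padding after `flag`.
static_assert(sizeof(packed_pair) == sizeof(double) + sizeof(bool));
// After the pop, layout is back to the default.
static_assert(sizeof(after_pop) == sizeof(unpacked_pair));
static_assert(sizeof(unpacked_pair) >= sizeof(packed_pair));
```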

Note that if the base class has only a single member (or its size is a multiple of the first member), this adds additional padding. To fix this, we can exploit the empty base class optimization (live on godbolt):

#pragma pack(1)
template <std::size_t size>
struct padding{
    char pad[size] = {};
};

template <>
struct padding<0>{};

template <class BaseClass, class FirstMember>
constexpr auto GetPaddingSize()
{
    constexpr auto Size = 
        (sizeof(BaseClass) % sizeof(FirstMember) == 0
            ? 0 
            : (sizeof(BaseClass)/sizeof(FirstMember)+1)*sizeof(FirstMember) 
                - sizeof(BaseClass));
    return Size;
}

#pragma pack(1) // Prevent padding at the end of 'dummy'.
struct dummy {
    std::complex<double> a;
};

struct employee : dummy, padding<GetPaddingSize<dummy, decltype(dummy::a)>()> {
};

static_assert(sizeof(employee) % sizeof(decltype(dummy::a)) == 0);
static_assert(sizeof(employee) == sizeof(decltype(dummy::a)));

int main(){
    [[maybe_unused]] employee e{{41}};
}

Since you are dealing with aggregate types, you could use boost::pfr to get rid of the explicit specification of the first member via boost::pfr::tuple_element_t<0, BaseClass> (godbolt):

#include <boost/pfr.hpp>

#pragma pack(1)
template <std::size_t size>
struct padding{
    char pad[size] = {};
};

template <>
struct padding<0>{};

template <class BaseClass>
constexpr auto GetPaddingSize()
{
    using FirstMember = boost::pfr::tuple_element_t<0, BaseClass>;
    constexpr auto Size = 
        (sizeof(BaseClass) % sizeof(FirstMember) == 0
            ? 0 
            : (sizeof(BaseClass)/sizeof(FirstMember)+1)*sizeof(FirstMember) 
                - sizeof(BaseClass));
    return Size;
}

//--------------------

#pragma pack(1) // Prevent padding at the end.
struct employee_base {
    std::complex<double> a;
};

struct employee : employee_base, padding<GetPaddingSize<employee_base>()> {
};

static_assert(sizeof(employee) % sizeof(decltype(employee::a)) == 0);
static_assert(sizeof(employee) == sizeof(decltype(employee::a)));

//---------------------

#pragma pack(1) // Prevent padding at the end.
struct foo_base {
    std::size_t i;
    bool b;
};

struct foo : foo_base, padding<GetPaddingSize<foo_base>()> {
};

static_assert(sizeof(foo) % sizeof(decltype(foo::i)) == 0);

//---------------------

int main(){
    [[maybe_unused]] employee e{{41}};
    [[maybe_unused]] foo f{{41, false}};
}

And then you could introduce

template <class BaseClass>
struct PaddingHelper : BaseClass, padding<GetPaddingSize<BaseClass>()> {};

which allows you to write, for example:

#pragma pack(1) // Prevent padding at the end.
struct employee_base {
    std::complex<double> a;
};

using employee = PaddingHelper<employee_base>;

See on godbolt. Of course, you can do the analogous thing without boost::pfr, you just have to add a second template argument to PaddingHelper that receives the first member type. I don't think it can become more succinct than this.
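A sketch of that boost-free variant might look as follows. The two-argument PaddingHelper, the padding_size helper and record_base are illustrative names of mine, and the padding is computed with a round-up expression rather than the conditional used above.

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>

template <std::size_t Size>
struct padding {
    char pad[Size] = {};
};

template <>
struct padding<0> {};

// Bytes needed to round sizeof(Base) up to a multiple of sizeof(FirstMember).
template <class Base, class FirstMember>
constexpr std::size_t padding_size() {
    return (sizeof(FirstMember) - sizeof(Base) % sizeof(FirstMember)) % sizeof(FirstMember);
}

// The first member type is passed explicitly instead of being deduced by boost::pfr.
template <class Base, class FirstMember>
struct PaddingHelper : Base, padding<padding_size<Base, FirstMember>()> {};

#pragma pack(push, 1)  // prevent padding at the end of the base
struct record_base {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    bool c;
};
#pragma pack(pop)

using record = PaddingHelper<record_base, std::complex<double>>;

static_assert(sizeof(record) % sizeof(std::complex<double>) == 0);

// Aggregate initialization still works; the inner braces fill the base.
inline record make_record() {
    return record{{{41.0, 0.0}, 42.0, 43, false}};
}
```

The cost compared with the boost::pfr version is that the first member type has to be repeated by hand and kept in sync with the base.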

stackoverblown On

Try this

   struct employee {
       std::string name;
       union {  
          struct {
              short salary;
              std::size_t age;
          };
          std::string dummy;
       };
   };

This should be 2 * sizeof(std::string) on any platform or compiler. But just in case, someone should tell me why this is not the case and on which platform this will fail! Thanks!

This can easily be generalized to a collection of data fields of any size, as long as there are enough copies of std::string to cover the total size of the remaining fields in the original struct. The altered struct then has a size of N * sizeof(std::string), where N is an integer.
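Two caveats apply to the struct as written: an anonymous struct inside a union is a compiler extension in C++, and a union member of type std::string deletes the union's default constructor and destructor, so employee objects cannot be created without writing those by hand. A possible variation, sketched below, uses named members and a raw byte filler instead; rest, tail and filler are hypothetical names, and the static_assert relies on the tail fitting into sizeof(std::string) and on sizeof(std::string) being a multiple of the union's alignment.

```cpp
#include <cstddef>
#include <string>

struct employee {
    std::string name;
    union rest_t {
        struct tail_t {
            short salary;
            std::size_t age;
        } tail;
        // Filler sized like std::string; raw bytes keep the union trivial.
        unsigned char filler[sizeof(std::string)];
    } rest;
};

static_assert(sizeof(employee) % sizeof(std::string) == 0);

// Aggregate initialization fills the union's first member, `tail`.
inline employee make_employee() {
    return employee{"John Doe", {{100, 30}}};
}
```

The members are then reached as e.rest.tail.salary and e.rest.tail.age, which is more verbose than the flat struct.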