DEV Community

Anna Voronina

Silent foe or quiet ally: Brief guide to alignment in C++. Part 2

It seems we've already uncovered the secret of alignment and defeated an invisible enemy: unaligned access. Memory is under control, but performance still whispers, "Don't forget about the nuances." What nuances? Let's see what happens when structures begin to inherit from each other and the rules of the game change.

Introduction

So, the path is clear: we'll look closer at the innards of inheritance to see what it does to memory. Let's get straight to the point, but be ready: the rules are about to get more complicated.

How inheritance impacts alignment

With inheritance, data alignment in memory becomes more intriguing than in ordinary structures. The compiler must lay out both the data members of the current class and the data members of all its base classes, while preserving their declaration order.

Inheritance of POD structures

Classes/structures in C++ can have two key properties: they can be trivial, and they can have a standard layout. In the past, classes/structures that satisfied both properties were called POD (Plain Old Data). An important feature of POD classes/structures is that they lie in memory exactly as we declared them.
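Both properties can be checked at compile time. The sketch below is an illustration, not part of the original examples: the types PlainPoint, WithCtor, and MixedAccess and the helper is_pod_like are made-up names, and only the standard <type_traits> header is assumed.

```cpp
#include <type_traits>

// Trivial and standard-layout: a POD in the classic sense.
struct PlainPoint { int x; int y; };

// Standard-layout, but the user-provided constructor makes it non-trivial.
struct WithCtor { WithCtor() {} int x; };

// Trivial, but mixing access levels for data members breaks standard layout.
struct MixedAccess { int x; private: int y; };

// Hypothetical helper: true when T satisfies both properties.
template <class T>
constexpr bool is_pod_like()
{
  return std::is_trivial_v<T> && std::is_standard_layout_v<T>;
}

static_assert(is_pod_like<PlainPoint>(), "PlainPoint should be POD-like");
static_assert(!is_pod_like<WithCtor>(), "WithCtor is not trivial");
static_assert(!is_pod_like<MixedAccess>(), "MixedAccess is not standard-layout");
```

The static_assert lines act as the check: the file compiles only if each type has the properties described in its comment.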

Let's start with inheriting POD structures as the simplest option:

struct Base
{
  char* ptr;
  int a;
  char b;
};

struct Example : Base
{
  char array[3];
};

Full code fragment.
#include <iostream>
#include <cstdint>
#include <format>

struct Base
{
  char* ptr;
  int a;
  char b;
};

struct Example : Base
{
  char array[3];
};

int main()
{
  Example obj;
  std::cout << "=== Print size and alignment of "
               "struct derived from POD ==="
            << std::endl
            << std::format("Sizeof of Base: {} byte(s)",
                           sizeof(Base))
            << std::endl
            << std::format("Alignment of Base: {} byte(s)",
                           alignof(Base))
            << std::endl
            << std::format("Sizeof of Example: {} bytes",
                           sizeof(struct Example))
            << std::endl
            << std::format("Alignment of Example: {} byte(s)",
                           alignof(Example))
            << std::endl << std::endl
            << "=== Addresses ===" << std::endl
            << std::format("Base address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<Base *>(&obj)
                           ))
            << std::endl
            << std::format("Base first "
                           "data member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.ptr))
            << std::endl
            << std::format("Example first "
                           "data member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.array))
            << std::endl;
}

Output from a program compiled with MSVC.
Compiler Explorer: https://godbolt.org/z/zMEfvGqnT
=== Print size and alignment of struct derived from POD ===
Sizeof of Base: 16 byte(s)
Alignment of Base: 8 byte(s)
Sizeof of Example: 24 bytes
Alignment of Example: 8 byte(s)

=== Addresses ===
Base address: 0x9d5ecffc70
Base first data member address: 0x9d5ecffc70
Example first data member address: 0x9d5ecffc80

Output from a program compiled with Clang.
Compiler Explorer: https://godbolt.org/z/PnWsTY8sc
=== Print size and alignment of struct derived from POD ===
Sizeof of Base: 16 byte(s)
Alignment of Base: 8 byte(s)
Sizeof of Example: 24 bytes
Alignment of Example: 8 byte(s)

=== Addresses ===
Base address: 0x7ffdbb57caa0
Base first data member address: 0x7ffdbb57caa0
Example first data member address: 0x7ffdbb57cab0

Inheriting from a POD structure results in implicit composition: the complete base structure is placed at the beginning of the derived structure, as if it were the first data member. Here, the compilers optimize nothing: the POD structure keeps its standard layout, tail padding included, so array starts at offset 16, right after the 16 bytes of Base.
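The "as if it's the first data member" claim can be tested directly by comparing the derived structure with an explicit composition. This is a minimal sketch: Composed and array_offset are hypothetical names introduced for the comparison, and the check is expected to hold on the MSVC and Itanium ABIs discussed here, not guaranteed by the language.

```cpp
#include <cstddef>

struct Base
{
  char* ptr;
  int a;
  char b;
};

struct Example : Base
{
  char array[3];
};

// Hypothetical counterpart: the base is held as an ordinary first data member.
struct Composed
{
  Base base;
  char array[3];
};

// Byte offset of Example::array from the start of the object.
inline std::size_t array_offset()
{
  Example obj{};
  const char* start = reinterpret_cast<const char*>(&obj);
  const char* member = reinterpret_cast<const char*>(&obj.array);
  return static_cast<std::size_t>(member - start);
}

// With a POD base, inheritance and composition produce the same size.
static_assert(sizeof(Example) == sizeof(Composed),
              "POD base is expected to be laid out like a first member");
```

Calling array_offset() should return sizeof(Base), which matches the 0x10 difference between the addresses in both outputs above.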

Inheritance of non-POD structure/class

When the class or structure inherits from a non-POD class or structure, the rules for data alignment slightly change.

Look at the example:

struct Base
{
  Base();

  char* ptr;
  int a;
  char b;
};

struct Example : Base
{
  char array[3];
};

Full code fragment.
#include <iostream>
#include <cstdint>
#include <format>

struct Base
{
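  // User-provided constructor: Base is no longer a POD.
  // It is defined here so that the program links.
  Base() {}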

  char* ptr;
  int a;
  char b;
};

struct Example : Base
{
  char array[3];
};

int main()
{
  Example obj;
  std::cout << "=== Print size and alignment of "
               "struct derived from non-POD ==="
            << std::endl
            << std::format("Sizeof of Base: {} byte(s)",
                           sizeof(Base))
            << std::endl
            << std::format("Alignment of Base: {} byte(s)",
                           alignof(Base))
            << std::endl
            << std::format("Sizeof of Example: {} bytes",
                           sizeof(struct Example))
            << std::endl
            << std::format("Alignment of Example: {} byte(s)",
                           alignof(Example))
            << std::endl << std::endl
            << "=== Addresses ===" << std::endl
            << std::format("Base address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<Base *>(&obj)
                           ))
            << std::endl
            << std::format("Base first "
                           "data member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.ptr))
            << std::endl
            << std::format("Example first "
                           "data member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.array))
            << std::endl;
}

Output from a program compiled with MSVC.
Compiler Explorer: https://godbolt.org/z/3o8dxK3Pf
=== Print size and alignment of struct derived from non-POD ===
Sizeof of Base: 16 byte(s)
Alignment of Base: 8 byte(s)
Sizeof of Example: 24 bytes
Alignment of Example: 8 byte(s)

=== Addresses ===
Base address: 0x84c88ff970
Base first data member address: 0x84c88ff970
Example first data member address: 0x84c88ff980

Output from a program compiled with Clang.
Compiler Explorer: https://godbolt.org/z/EePET54bY
=== Print size and alignment of struct derived from non-POD ===
Sizeof of Base: 16 byte(s)
Alignment of Base: 8 byte(s)
Sizeof of Example: 16 bytes
Alignment of Example: 8 byte(s)

=== Addresses ===
Base address: 0x7ffc8f937638
Base first data member address: 0x7ffc8f937638
Example first data member address: 0x7ffc8f937645

The MSVC compiler hasn't changed anything: we still have implicit composition. But with Clang, something magical happened: the derived structure is 8 bytes smaller, and array starts at offset 13 instead of 16. Now let's recall dsize, which we introduced when discussing the data member alignment rules. The value of dsize is the size of the class/structure without its tail padding; for Base, that's 8 + 4 + 1 = 13 bytes. When a class/structure inherits from a non-POD class/structure, Clang follows the Itanium ABI and reuses the base's tail padding to pack the following base classes or data members more compactly.
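Whether a given compiler reuses the tail padding can be detected at run time by measuring where array actually lands. The sketch below assumes the same Base and Example as above; array_offset and tail_padding_reused are hypothetical helper names, and the constructor initializes the members only to keep the example free of uninitialized data.

```cpp
#include <cstddef>

struct Base
{
  // User-provided constructor: Base is not a POD.
  Base() : ptr(nullptr), a(0), b(0) {}

  char* ptr;
  int a;
  char b;
};

struct Example : Base
{
  char array[3];
};

// Byte offset of Example::array from the start of the object.
inline std::size_t array_offset()
{
  Example obj;
  const char* start = reinterpret_cast<const char*>(&obj);
  const char* member = reinterpret_cast<const char*>(&obj.array);
  return static_cast<std::size_t>(member - start);
}

// True when array was placed inside the tail padding of Base,
// that is, before the end of the full sizeof(Base) bytes.
inline bool tail_padding_reused()
{
  return array_offset() < sizeof(Base);
}
```

Judging by the outputs above, tail_padding_reused() should return true for the Clang build (offset 13) and false for the MSVC build (offset 16).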

All your databases belong to us (empty base optimization)

What if we told you that under certain conditions, the base class may take no space at all in the derived class? Let's create an empty base class without a single data member, but with member functions, and then create a derived class. Remember that each object must have a unique address, so even an empty class has a size of one byte. But here's what happens during inheritance:

struct EmptyBase
{
  void show() { std::cout << "Empty" << std::endl; }
};

struct Example : EmptyBase
{
  char* ptr;
  int value;
  short number;
  char pair[2];
};

Full code fragment.
#include <iostream>
#include <cstdint>
#include <format>

struct EmptyBase
{
  void show() { std::cout << "Empty" << std::endl; }
};

struct Example : EmptyBase
{
  char* ptr;
  int value;
  short number;
  char pair[2];
};

int main()
{
  Example obj;
  std::cout << "=== Print Empty Base Optimization ==="
            << std::endl
            << std::format("EmptyBase size: {} byte(s)",
                           sizeof(EmptyBase))
            << std::endl
            << std::format("EmptyBase alignment: {} byte(s)",
                           alignof(EmptyBase))
            << std::endl
            << std::format("Example size: {} byte(s)",
                           sizeof(Example))
            << std::endl
            << std::format("Example alignment: {} byte(s)",
                           alignof(Example))
            << std::endl << std::endl
            << "=== Addresses ===" << std::endl
            << std::format("Object address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj))
            << std::endl
            << std::format("EmptyBase address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<EmptyBase *>(&obj)
                           ))
            << std::endl
            << std::format("Example first "
                           "data member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.ptr))
            << std::endl;
}

Output from a program compiled with MSVC.
Compiler Explorer: https://godbolt.org/z/o6TGcoKcq
=== Print Empty Base Optimization ===
EmptyBase size: 1 byte(s)
EmptyBase alignment: 1 byte(s)
Example size: 16 byte(s)
Example alignment: 8 byte(s)

=== Addresses ===
Object address: 0x77b3aff7d0
EmptyBase address: 0x77b3aff7d0
Example first data member address: 0x77b3aff7d0

Output from a program compiled with Clang.
Compiler Explorer: https://godbolt.org/z/GWch15ha4
=== Print Empty Base Optimization ===
EmptyBase size: 1 byte(s)
EmptyBase alignment: 1 byte(s)
Example size: 16 byte(s)
Example alignment: 8 byte(s)

=== Addresses ===
Object address: 0x7ffcc30b6c98
EmptyBase address: 0x7ffcc30b6c98
Example first data member address: 0x7ffcc30b6c98

The object address, the base subobject address, and the address of the derived structure's first data member are all the same. The EmptyBase base class has literally "dissolved" into the derived object without adding a single byte to its size. The magic lies in the empty base optimization: the compiler takes into account that an empty base class contains no data and therefore needs no storage of its own. Since the base has no data members, its existence in memory is just a formality to satisfy the language rules.

Thus, when a class derives from an empty base class, the compiler can place the base subobject at the same address as the first data member of the derived class. This saves memory without violating alignment rules.
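The saving becomes visible when the same empty type is used as a data member instead of a base. The following sketch is only an illustration: MembersOnly and AsMember are hypothetical structures, and EmptyBase is left without member functions because they would not affect the layout.

```cpp
// An empty class: no non-static data members.
struct EmptyBase {};

// Empty type used as a base class.
struct Example : EmptyBase
{
  char* ptr;
  int value;
  short number;
  char pair[2];
};

// Hypothetical reference: the same members with no base at all.
struct MembersOnly
{
  char* ptr;
  int value;
  short number;
  char pair[2];
};

// Hypothetical contrast: the empty type stored as a data member.
// It needs its own address, so it costs one byte plus padding.
struct AsMember
{
  EmptyBase base;
  char* ptr;
  int value;
  short number;
  char pair[2];
};

static_assert(sizeof(Example) == sizeof(MembersOnly),
              "the empty base should add nothing to the size");
static_assert(sizeof(AsMember) > sizeof(Example),
              "an empty data member is expected to take space");
```

On a typical 64-bit target, the padding after the one-byte member pushes ptr to offset 8, so AsMember grows to 24 bytes while Example stays at 16.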

Multiple inheritance

No surprises here. We can apply what we've already learned about inheriting from POD and non-POD structures and classes, and lay out the derived class base by base, following the inheritance hierarchy.

struct Base1
{
  Base1();

  long long a;
  int b;
};

struct Base2
{
  char c;
  char d;
};

struct Example : Base1, Base2
{
  char e;
};

Full code fragment.
#include <iostream>
#include <cstdint>
#include <format>

struct Base1
{
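  // User-provided constructor: Base1 is not a POD.
  // It is defined here so that the program links.
  Base1() {}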

  long long a;
  int b;
};

struct Base2
{
  char c;
  char d;
};

struct Example : Base1, Base2
{
  char e;
};

int main()
{
  Example obj;
  std::cout << "=== Multiple Inheritance ==="
            << std::endl
            << std::format("Base1 size: {} byte(s)",
                           sizeof(Base1))
            << std::endl
            << std::format("Base1 alignment: {} byte(s)",
                           alignof(Base1))
            << std::endl
            << std::format("Base2 size: {} byte(s)",
                           sizeof(Base2))
            << std::endl
            << std::format("Base2 alignment: {} byte(s)",
                           alignof(Base2))
            << std::endl
            << std::format("Example size: {} byte(s)",
                           sizeof(Example))
            << std::endl
            << std::format("Example alignment: {} byte(s)",
                           alignof(Example))
            << std::endl << std::endl
            << "=== Addresses ===" << std::endl
            << std::format("Base1 address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                            static_cast<Base1 *>(&obj)
                           ))
            << std::endl
            << std::format("Base2 address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                            static_cast<Base2 *>(&obj)
                           ))
            << std::endl
            << std::format("Base1 first member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.a))
            << std::endl
            << std::format("Base2 first member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.c))
            << std::endl
            << std::format("Example first member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&obj.e))
            << std::endl;
}

Output from a program compiled with MSVC.
Compiler Explorer: https://godbolt.org/z/13f4fbdna
=== Multiple Inheritance ===
Base1 size: 16 byte(s)
Base1 alignment: 8 byte(s)
Base2 size: 2 byte(s)
Base2 alignment: 1 byte(s)
Example size: 24 byte(s)
Example alignment: 8 byte(s)

=== Addresses ===
Base1 address: 0xbaf12ffaa8
Base2 address: 0xbaf12ffab8
Base1 first member address: 0xbaf12ffaa8
Base2 first member address: 0xbaf12ffab8
Example first member address: 0xbaf12ffaba

Output from a program compiled with Clang.
Compiler Explorer: https://godbolt.org/z/G5j1adsM3
=== Multiple Inheritance ===
Base1 size: 16 byte(s)
Base1 alignment: 8 byte(s)
Base2 size: 2 byte(s)
Base2 alignment: 1 byte(s)
Example size: 16 byte(s)
Example alignment: 8 byte(s)

=== Addresses ===
Base1 address: 0x7ffc4073bbe8
Base2 address: 0x7ffc4073bbf4
Base1 first member address: 0x7ffc4073bbe8
Base2 first member address: 0x7ffc4073bbf4
Example first member address: 0x7ffc4073bbf6

MSVC places the two base structures using composition. The Example structure is laid out as follows: 16 bytes of Base1 + 2 bytes of Base2 + 1 byte of the e data member + 5 bytes of tail padding = 24 bytes.

With Clang, the layout is more interesting. When laying out the Example structure, the compiler handles Base1 first. Here's what we can say about that structure:

  • it's not a POD;
  • its size without tail padding is dsize == 12;
  • total size is 16 bytes;
  • alignment is 8 bytes.

Next comes Base2:

  • it's a POD;
  • total size is 2 bytes;
  • alignment is 1 byte.

The alignment of the previous base structure is greater than that of the current one, and offset 12 satisfies the 1-byte alignment of Base2, so there's no need to add padding. The Example structure is laid out as follows: 12 bytes of Base1 + 2 bytes of Base2 + 1 byte of the e data member + 1 byte of tail padding = 16 bytes.

As we can see in this example, Clang produces a more compact layout than MSVC, thanks to the Itanium ABI rules.
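The two sums above can be reproduced with a few lines of offset arithmetic. This is a simplified model, not either compiler's actual algorithm: align_up, place, and example_size are hypothetical helpers, and the sizes and alignments are taken from the program outputs above.

```cpp
#include <cstddef>

// Rounds offset up to the nearest multiple of alignment.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment)
{
  return (offset + alignment - 1) / alignment * alignment;
}

// Places a subobject at the first suitably aligned offset at or after
// `offset` and returns the offset just past its end.
constexpr std::size_t place(std::size_t offset,
                            std::size_t size,
                            std::size_t alignment)
{
  return align_up(offset, alignment) + size;
}

// Size of Example, given how many bytes Base1 occupies at offset 0:
// its full size (16) or its dsize (12).
constexpr std::size_t example_size(std::size_t base1_occupied)
{
  std::size_t end = base1_occupied;  // Base1 starts at offset 0
  end = place(end, 2, 1);            // Base2: size 2, alignment 1
  end = place(end, 1, 1);            // char e: size 1, alignment 1
  return align_up(end, 8);           // tail padding up to alignof(Example)
}

static_assert(example_size(16) == 24, "composition, as in the MSVC output");
static_assert(example_size(12) == 16, "dsize reuse, as in the Clang output");
```

The only input that differs between the two results is how many bytes Base1 is considered to occupy; everything else in the calculation is identical.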

Multiple inheritance of empty classes

Things get even more head-scratching with multiple inheritance of empty classes. Look at the example:

struct Empty1 {};
struct Empty2 {};
struct NonEmpty { short a; };
struct Empty3 {};
struct Empty4 {};

struct Example : Empty1, Empty2, NonEmpty, Empty3, Empty4
{
  char* ptr;
  long value; 
  short number;
  char symbol;
};

Full code fragment.
#include <iostream>
#include <cstdint>
#include <format>

struct Empty1 {};
struct Empty2 {};
struct NonEmpty { short a; };
struct Empty3 {};
struct Empty4 {};

struct Example : Empty1, Empty2, NonEmpty, Empty3, Empty4
{
  char* ptr;
  long value; 
  short number;
  char symbol;
};

int main()
{
  Example x;
  std::cout << "=== Multiple Empty Base Inheritance ==="
            << std::endl
            << std::format("Empty1 size: {} byte(s)",
                           sizeof(Empty1))
            << std::endl
            << std::format("Empty1 alignment: {} byte(s)",
                           alignof(Empty1))
            << std::endl
            << std::format("Empty2 size: {} byte(s)",
                           sizeof(Empty2))
            << std::endl
            << std::format("Empty2 alignment: {} byte(s)",
                           alignof(Empty2))
            << std::endl
            << std::format("NonEmpty size: {} byte(s)",
                           sizeof(NonEmpty))
            << std::endl
            << std::format("NonEmpty alignment: {} byte(s)",
                           alignof(NonEmpty))
            << std::endl
            << std::format("Empty3 size: {} byte(s)",
                           sizeof(Empty3))
            << std::endl
            << std::format("Empty3 alignment: {} byte(s)",
                           alignof(Empty3))
            << std::endl
            << std::format("Empty4 size: {} byte(s)",
                           sizeof(Empty4))
            << std::endl
            << std::format("Empty4 alignment: {} byte(s)",
                           alignof(Empty4))
            << std::endl
            << std::format("Example size: {} byte(s)",
                           sizeof(Example))
            << std::endl
            << std::format("Example alignment: {} byte(s)",
                           alignof(Example))
            << std::endl << std::endl
            << "=== Addresses ===" << std::endl
            << std::format("Object address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&x))
            << std::endl
            << std::format("Empty1 address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<Empty1*>(&x)
                           ))
            << std::endl
            << std::format("Empty2 address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<Empty2*>(&x)
                           ))
            << std::endl
            << std::format("NonEmpty address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<NonEmpty*>(&x)
                           ))
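            << std::endl
            << std::format("NonEmpty first member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&x.a))
            << std::endl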
            << std::format("Empty3 address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<Empty3*>(&x)
                           ))
            << std::endl
            << std::format("Empty4 address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(
                             static_cast<Empty4*>(&x)
                           ))
            << std::endl
            << std::format("Example first member address: 0x{:x}",
                           reinterpret_cast<uintptr_t>(&x.ptr))
            << std::endl << std::endl;
}

Output from a program compiled with MSVC.
Compiler Explorer: https://godbolt.org/z/xPT7qq33o
=== Multiple Empty Base Inheritance ===
Empty1 size: 1 byte(s)
Empty1 alignment: 1 byte(s)
Empty2 size: 1 byte(s)
Empty2 alignment: 1 byte(s)
NonEmpty size: 2 byte(s)
NonEmpty alignment: 2 byte(s)
Empty3 size: 1 byte(s)
Empty3 alignment: 1 byte(s)
Empty4 size: 1 byte(s)
Empty4 alignment: 1 byte(s)
Example size: 24 byte(s)
Example alignment: 8 byte(s)

=== Addresses ===
Object address: 0xa550cffb70
Empty1 address: 0xa550cffb70
Empty2 address: 0xa550cffb71
NonEmpty address: 0xa550cffb72
NonEmpty first member address: 0xa550cffb72
Empty3 address: 0xa550cffb74
Empty4 address: 0xa550cffb75
Example first member address: 0xa550cffb78

Output from a program compiled with Clang.
Compiler Explorer: https://godbolt.org/z/nEqd4bqEY
=== Multiple Empty Base Inheritance ===
Empty1 size: 1 byte(s)
Empty1 alignment: 1 byte(s)
Empty2 size: 1 byte(s)
Empty2 alignment: 1 byte(s)
NonEmpty size: 2 byte(s)
NonEmpty alignment: 2 byte(s)
Empty3 size: 1 byte(s)
Empty3 alignment: 1 byte(s)
Empty4 size: 1 byte(s)
Empty4 alignment: 1 byte(s)
Example size: 32 byte(s)
Example alignment: 8 byte(s)

=== Addresses ===
Object address: 0x7ffdf04c06d0
Empty1 address: 0x7ffdf04c06d0
Empty2 address: 0x7ffdf04c06d0
NonEmpty address: 0x7ffdf04c06d0
NonEmpty first member address: 0x7ffdf04c06d0
Empty3 address: 0x7ffdf04c06d0
Empty4 address: 0x7ffdf04c06d0
Example first member address: 0x7ffdf04c06d8

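Note that MSVC optimizes only the first empty base class: Empty2, Empty3, and Empty4 each get an address of their own. Clang, in contrast, placed all four empty base classes at the very beginning of the object, regardless of their position in the inheritance hierarchy. Note also that Example is larger in the Clang build (32 bytes versus 24) only because long takes 8 bytes on that 64-bit Linux target and 4 bytes under MSVC; the empty bases themselves add nothing. Why does MSVC behave this way? It's an odd tradition :)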

Fortunately, the same behavior can be achieved in MSVC, starting with Visual Studio 2015 Update 3, by using the [__declspec(empty_bases)](https://learn.microsoft.com/en-us/cpp/cpp/empty-bases?view=msvc-170) attribute.
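A portable way to apply the attribute is to hide it behind a macro that expands to nothing on other compilers. The sketch below is an assumption-laden illustration: EXAMPLE_EMPTY_BASES and all_bases_at_object_start are made-up names, and the result on MSVC is what the attribute is expected to produce, so it's worth checking on the toolchain in use.

```cpp
// Hypothetical macro: the attribute is MSVC-specific, so it expands
// to nothing for compilers that follow the Itanium ABI.
#if defined(_MSC_VER)
#  define EXAMPLE_EMPTY_BASES __declspec(empty_bases)
#else
#  define EXAMPLE_EMPTY_BASES
#endif

struct Empty1 {};
struct Empty2 {};
struct NonEmpty { short a; };
struct Empty3 {};
struct Empty4 {};

struct EXAMPLE_EMPTY_BASES Example
  : Empty1, Empty2, NonEmpty, Empty3, Empty4
{
  char* ptr;
  long value;
  short number;
  char symbol;
};

// Returns true when every base subobject starts at the address
// of the complete object.
inline bool all_bases_at_object_start()
{
  Example x{};
  const void* start = &x;
  const void* e1 = static_cast<Empty1*>(&x);
  const void* e2 = static_cast<Empty2*>(&x);
  const void* ne = static_cast<NonEmpty*>(&x);
  const void* e3 = static_cast<Empty3*>(&x);
  const void* e4 = static_cast<Empty4*>(&x);
  return e1 == start && e2 == start && ne == start
      && e3 == start && e4 == start;
}
```

For the Clang build shown above, all_bases_at_object_start() should return true, since every base address in that output equals the object address.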

Conclusion

So, the mystery grows deeper and darker, but it no longer frightens us. Alignment under inheritance isn't chaos; it's a precise mechanism, and now we know its rules and requirements. We understand how a class hierarchy turns into memory, and we can predict where the next "invisible enemy" will hide. However, the world of C++ is inexhaustible: virtual base classes and the intricacies of vtable optimization await their moment. To be continued... :)

To verify in practice that transitions between classes in the hierarchy don't create hidden memory issues, you can use the PVS-Studio static analyzer. It helps you check object alignment correctness and becomes a reliable friend in building efficient hierarchies.
