Why isn't sizeof for a struct equal to the sum of sizeof of each member?


Question

Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members?

1
629
9/23/2018 9:09:20 AM

Accepted Answer

This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:

  • Mis-aligned access might be a hard error (often SIGBUS).
  • Mis-aligned access might be a soft error.
    • Either corrected in hardware, for a modest performance-degradation.
    • Or corrected by emulation in software, for a severe performance-degradation.
    • In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.

Here's an example using typical settings for an x86 processor (all used 32 and 64 bit modes):

struct X
{
    short s; /* 2 bytes */
             /* 2 padding bytes */
    int   i; /* 4 bytes */
    char  c; /* 1 byte */
             /* 3 padding bytes */
};

struct Y
{
    int   i; /* 4 bytes */
    char  c; /* 1 byte */
             /* 1 padding byte */
    short s; /* 2 bytes */
};

struct Z
{
    int   i; /* 4 bytes */
    short s; /* 2 bytes */
    char  c; /* 1 byte */
             /* 1 padding byte */
};

const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */

One can minimize the size of structures by sorting members by alignment (sorting by size suffices for that in basic types) (like structure Z in the example above).

IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.

606
7/5/2017 8:25:02 PM

Packing and byte alignment, as described in the C FAQ here:

It's for alignment. Many processors can't access 2- and 4-byte quantities (e.g. ints and long ints) if they're crammed in every-which-way.

Suppose you have this structure:

struct {
    char a[3];
    short int b;
    long int c;
    char d[3];
};

Now, you might think that it ought to be possible to pack this structure into memory like this:

+-------+-------+-------+-------+
|           a           |   b   |
+-------+-------+-------+-------+
|   b   |           c           |
+-------+-------+-------+-------+
|   c   |           d           |
+-------+-------+-------+-------+

But it's much, much easier on the processor if the compiler arranges it like this:

+-------+-------+-------+
|           a           |
+-------+-------+-------+
|       b       |
+-------+-------+-------+-------+
|               c               |
+-------+-------+-------+-------+
|           d           |
+-------+-------+-------+

In the packed version, notice how it's at least a little bit hard for you and me to see how the b and c fields wrap around? In a nutshell, it's hard for the processor, too. Therefore, most compilers will pad the structure (as if with extra, invisible fields) like this:

+-------+-------+-------+-------+
|           a           | pad1  |
+-------+-------+-------+-------+
|       b       |     pad2      |
+-------+-------+-------+-------+
|               c               |
+-------+-------+-------+-------+
|           d           | pad3  |
+-------+-------+-------+-------+

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon