[ale] linux byte alignment
Joe Steele
joe at madewell.com
Sun Aug 4 16:46:36 EDT 2002
What you have seen is gcc maintaining the stack alignment at a
16-byte boundary. This is to prevent potential problems when
using the Streaming SIMD (single instruction, multiple data)
Extensions (SSE) first introduced on Pentium III CPUs. These
extensions added some new instructions and a set of eight 128-bit
registers (XMM0 to XMM7) to the CPU. Some of the new instructions
(MOVAPS, for example) will cause a general protection fault when
moving 128-bit data to or from memory which is not on a 16-byte
boundary. While there are alternative instructions (such as MOVUPS)
which can access 128-bit data on arbitrary boundaries, they are less
efficient.
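To make "on a 16-byte boundary" concrete, here is a minimal sketch in C. The helper names is_aligned16 and align_up16 are made up for illustration, and the code assumes the usual flat address space where converting a pointer to uintptr_t yields its byte address (true on x86 Linux, but not guaranteed by the C standard).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p sits on a 16-byte boundary.
   An address is 16-byte aligned when its low four bits are zero. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: returns the first 16-byte-aligned address at or
   after p. The caller must own at least 15 spare bytes past p. */
unsigned char *align_up16(unsigned char *p)
{
    return (unsigned char *)(((uintptr_t)p + 15u) & ~(uintptr_t)15);
}
```

An address returned by align_up16 would be safe for the aligned SSE moves; the same address plus 4 would not be, and would need the unaligned variants.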
In the example you gave, the stack had the following items on it:

     4 bytes: return address
     4 bytes: old frame pointer
    12 bytes: x[10] (10 bytes rounded up to a multiple of 4)
     8 bytes: y[5] (5 bytes rounded up to a multiple of 4)

That's a total of 28 bytes, requiring an additional 4 bytes of
padding to reach the next 16-byte boundary at 32 bytes. The return
address and old frame pointer are already on the stack, so to make
room for x[10], y[5], plus padding, the stack pointer must be
decremented by 12 + 8 + 4 = 24 bytes.
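The arithmetic above can be sketched in C. This is only an illustration of the rounding, not how gcc computes frame sizes internally; round_up and frame_decrement are names invented here.

```c
#include <assert.h>

/* Round n up to the next multiple of boundary.
   boundary must be a power of two (4, 16, ...). */
unsigned round_up(unsigned n, unsigned boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (n + boundary - 1) & ~(boundary - 1);
}

/* Bytes to subtract from %esp for the locals, given the bytes already
   pushed (return address + old frame pointer) and the locals' size. */
unsigned frame_decrement(unsigned already_pushed, unsigned locals,
                         unsigned boundary)
{
    return round_up(already_pushed + locals, boundary) - already_pushed;
}
```

With these, round_up(10, 4) is 12 and round_up(5, 4) is 8 for the two arrays, and frame_decrement(8, 12 + 8, 16) is 24, matching the figure worked out by hand.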
In your example, however, the stack pointer is decremented by 40
rather than 24. The result is still properly aligned, but an
additional 16 bytes of space are wasted. This is just a peculiarity
of the compiler version you are using. I think you will find this
waste is eliminated in the latest versions of gcc.
Incidentally, you can also observe stack alignment being performed
when setting up function calls. Some example code:

    extern int foo (char bar);

    int foobar (void)
    {
        return foo (0);
    }

In the resulting assembly code, you see %esp reduced by 8, which
maintains stack alignment on function entry (4 bytes of return
address + 4 bytes of old frame pointer + 8 = 16). Then you see %esp
reduced by an additional 12 before 4 more bytes are pushed on the
stack and foo() is called (12 + 4 = 16), again maintaining alignment.
On return from the function call, those 16 bytes are removed from the
stack:

    foobar:
            pushl %ebp
            movl %esp,%ebp
            subl $8,%esp
            addl $-12,%esp
            pushl $0
            call foo
            addl $16,%esp
            movl %eax,%edx
            movl %edx,%eax
            jmp .L2
            .p2align 4,,7
    .L2:
            leave
            ret

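As a rough cross-check, the %esp adjustments in the listing can be tallied in C. The sketch assumes the caller's %esp was on a 16-byte boundary just before its call instruction, which is the convention the arithmetic above relies on; esp_delta and esp_offset_after are names invented for this illustration.

```c
#include <assert.h>

/* Change to %esp at each step up to "call foo", in bytes.
   Negative values grow the stack. */
static const int esp_delta[] = {
    -4,   /* return address, pushed by the caller's call instruction */
    -4,   /* pushl %ebp */
    -8,   /* subl $8,%esp */
    -12,  /* addl $-12,%esp */
    -4    /* pushl $0 */
};

enum { ESP_STEPS = sizeof esp_delta / sizeof esp_delta[0] };

/* Offset of %esp after the first n steps, relative to the caller's
   (assumed 16-byte-aligned) %esp before its call instruction. */
int esp_offset_after(int n)
{
    int i, off = 0;

    assert(n >= 0 && n <= ESP_STEPS);
    for (i = 0; i < n; i++)
        off += esp_delta[i];
    return off;
}
```

After the first three steps the offset is -16, and after all five (the point where foo is called) it is -32, so both are multiples of 16. The later "addl $16,%esp" brings it back to -16.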
All of this alignment behavior is controlled by the
"-mpreferred-stack-boundary=" compiler option. Its argument is the
power of two to align to, so the default of 4 gives 16-byte
alignment (2^4 = 16), but the default can be overridden.
--Joe
-----Original Message-----
From: Benjamin Dixon [SMTP:beatle at arches.uga.edu]
Sent: Monday, July 29, 2002 3:26 PM
To: ale at ale.org
Subject: [ale] linux byte alignment
Hi all,
I'm trying to pry into linux byte alignment issues and assembly and I ran
across something I haven't figured out. My understanding is that alignment
is at one word (4 bytes) so I have the following function:

    int main()
    {
        char x[10];
        char y[5];
    }

By my calculation, if the memory has to be aligned, x[10] will take up
12 bytes (ceiling of 2.5 words = 3, 3x4-bytes = 12). And likewise, the
y[5] will take up 8 bytes. So there's 20 bytes of excess memory lying
around? But when I run the program through gcc with the -S option, I get
the following:
..
    main:
            pushl %ebp
            movl %esp,%ebp
            subl $40,%esp
    .L2:
            movl %ebp,%esp
            popl %ebp
            ret
    .Lfe1:
..
The question is, what's that 40? If I use different numbers for the array
sizes, I get a different number there, always divisible by 4 but always
greater than the number I expect. Anyone know why?
Ben