eric_oldre wrote: Would you expect much of a difference for these two?
Eric
int funcA(int i){
    char myvar[10];
    //do stuff
}
int funcB(int i){
    char myvar[2000];
    //do stuff
}
On modern CPUs there should hardly be any difference at all. The address of the active stack frame is kept in a register (the 'base pointer', %ebp on Intel Architecture), and all local variables are addressed through an offset relative to that register. If the offset fits in a signed byte (-128 to +127), it is encoded as a single byte in the instruction stream; otherwise 4 bytes are used. The execution times are exactly the same: the CPU has 2 or 3 specialized adders (Address Generation Units) that add the offset to the base pointer to calculate the actual address. They always do this in a single clock cycle, without taxing other resources (execution units) in the CPU, no matter how many bits there are in the offset. The code is more compact with the small offsets, though, which could result in a higher hit rate of the instruction cache in cases where the entire code fits in the I-cache with the small offsets but not with the long ones.
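To make the one-byte versus four-byte rule concrete, here is a minimal sketch in C. The helper names `disp_bytes` and `local_disp` are made up for illustration, and the sketch assumes the usual 32-bit x86 encoding in which a %ebp-relative displacement is stored either as a signed 8-bit value or as a full 32-bit value. It only models instruction size, not timing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes needed to encode a displacement
 * relative to the base pointer. Assumes a signed 8-bit form for values
 * in -128..127 and a 32-bit form for everything else. */
static int disp_bytes(int32_t disp)
{
    return (disp >= INT8_MIN && disp <= INT8_MAX) ? 1 : 4;
}

/* Hypothetical helper: displacement of a local variable that starts
 * `bytes_below_bp` bytes under the saved base pointer. Locals live at
 * negative offsets because the stack grows downward. */
static int32_t local_disp(int32_t bytes_below_bp)
{
    assert(bytes_below_bp > 0);
    return -bytes_below_bp;
}
```

Under these assumptions, `disp_bytes(local_disp(10))` gives 1, while `disp_bytes(local_disp(2000))` gives 4: the instruction is three bytes longer, but the address calculation itself takes the same time.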
In the case you give, where the local variable is an array, even that distinction disappears: the larger offset in that case comes purely from the index being large, and the index is not encoded in the instruction but is held in some other register (which can always hold 32 bits). So obtaining the address amounts to adding a small (1-byte) offset and two registers (one of which is the base pointer). The AGUs can also do that in a single CPU clock, in parallel with other stuff.
So in the end the only thing that has a strong impact is the total amount of variable space that you use, and whether it fits in the level-1 data cache. If you have a D-cache of 16KB and each instance of the recursively called routine uses 2KB, you run out of cache after 8 ply, while with only 200 bytes of local variables you could go to about 80 ply. (Typically there is an overhead of 8 bytes on top of the variables you declare explicitly, to store the return address and the previous base pointer, while parameters simply count as local variables that happen to be initialized by the caller. The compiler might add a few temporary storage locations as well.)
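The ply estimate can be redone as a small calculation. This is only a rough sketch: `plies_in_cache` is a hypothetical helper, it assumes every recursion level uses the same frame size, and it ignores anything else competing for the D-cache (globals, hash tables, compiler temporaries).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many recursion levels (ply) fit in a data
 * cache of `cache_bytes`, when each frame holds `local_bytes` of
 * declared locals plus `overhead_bytes` for the return address and
 * the saved base pointer. */
static size_t plies_in_cache(size_t cache_bytes,
                             size_t local_bytes,
                             size_t overhead_bytes)
{
    size_t frame_bytes = local_bytes + overhead_bytes;
    assert(frame_bytes > 0);            /* avoid dividing by zero */
    return cache_bytes / frame_bytes;   /* whole frames only */
}
```

With a 16KB cache and 8 bytes of overhead, `plies_in_cache(16384, 2000, 8)` gives 8 and `plies_in_cache(16384, 200, 8)` gives 78, in line with the rough figures of 8 and 80 ply.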