Gale Paeper wrote:
[snip]
Frank Heckenbach wrote:
Adriaan van Os wrote:
The failures in nonloc2goto.pas and nonloc4goto.pas go away when increasing the stacksize limit. We might reduce the recursion depth somewhat in these tests, in order to run them in the default Mac OS X stacksize limit of 512 KB.
I'm a bit surprised. Depth 10000 with only one integer parameter shouldn't take much stack -- 16 bytes per recursion on my system. Is the procedure call overhead (WRT stack usage) so big on the Mac?
[snip]
... I'll have to take a look at the disassembly for the two programs to know for sure, but the minimum stack frame possible for a recursive routine with one integer parameter is 32 bytes - the standard required 24 bytes for the linkage area plus 4 bytes for the parameter area, rounded up to the next 16 byte boundary.
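The 32-byte figure can be checked with a small C sketch. The helper names `round_up` and `min_frame_size` are invented for this illustration and are not part of any runtime or compiler API; the 24-byte linkage area, 4-byte parameter area, and 16-byte alignment are the values quoted above.

```c
#include <stddef.h>

/* Round n up to the next multiple of align.
   Assumption: align is a power of two (16 in the case discussed here). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: smallest frame that holds the linkage area plus
   the parameter area, padded to a 16-byte boundary. */
static size_t min_frame_size(size_t linkage_bytes, size_t param_bytes)
{
    return round_up(linkage_bytes + param_bytes, 16);
}
```

With the numbers from the text, `min_frame_size(24, 4)` rounds 28 up to 32.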
In looking at the actual generated PPC assembly code for the trash procedure in nonloc2goto.pas and the trash function in nonloc4goto.pas, both end up using an 80 byte stack frame.
I'm at a loss to explain why the stack frame is so large. There is nothing in the assembly code for those two routines which would require that large of a stack frame but that's what the compiler generates.
To see whether it is a Pascal front end issue or a generic PPC backend issue, I wrote C equivalents for those two routines. For the C analogue to the trash procedure, I got a tail call optimization which used no stack frame at all. For the C analogue to the trash function, the generated code is fairly similar to the code generated for the Pascal function and the C analogue also ended up using an 80 byte stack frame (for no discernible reason either).
So, contrary to the documented requirements for the Mac OS X Mach-O runtime PowerPC stack structure, the back end seems to be forcing an 80 byte minimum stack frame size for some unknown reason when a routine uses a stack frame.
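To put the frame sizes in context, here is a rough C sketch of the total stack needed for the recursion. It assumes the depth of 10000 mentioned earlier in the thread and the 512 KB default limit; the function names are made up for illustration, and the estimate ignores whatever stack the callers and program startup already use.

```c
#include <stddef.h>

/* Stack bytes consumed by `depth` nested calls, each using a frame of
   frame_bytes. Simplification: every frame is assumed to be the same size. */
static size_t stack_needed(size_t depth, size_t frame_bytes)
{
    return depth * frame_bytes;
}

/* Nonzero if the recursion alone fits within limit_bytes. */
static int fits_in_stack(size_t depth, size_t frame_bytes, size_t limit_bytes)
{
    return stack_needed(depth, frame_bytes) <= limit_bytes;
}
```

With 80-byte frames, 10000 levels need 800000 bytes, which is more than 512 KB (524288 bytes). With the 32-byte minimum frame the same depth would need 320000 bytes and fit.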
Note: The assembly code was generated using the compiler in Adriaan's currently distributed, pre-built Mac OS X GPC package. The compiler specs are:
Reading specs from /Developer/Pascal/gpc332d1/lib/gcc-lib/powerpc-apple-darwin/3.3.2/specs
Configured with: ../gpc-332d1/configure --enable-languages=pascal,c --prefix=/Developer/Pascal/gpc332d1 --enable-threads=posix --target=powerpc-apple-darwin
Thread model: posix
gpc version 20030830, based on gcc-3.3.2
The command line options used for all assembly code generation were:
gpc -S -O3 --automake
P.S. Frank and/or Waldek, it would be interesting to know the reason why the Pascal code from nonloc2goto.pas:
procedure trash(n: integer);
begin
  if n>0 then trash(n-1)
end;
doesn't get tail call optimized; whereas, the C analogue:
void trash(int n){
  if (n>0) { trash(n-1); }
}
does get tail call optimized. (Note: I'm not trying to imply that you folks should expend any effort in getting GPC to tail call optimize what is in essence artificial test code. I'm just curious as to the reason behind the difference in optimizations applied between the C and Pascal versions.)
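For readers unfamiliar with the term, the following C sketch shows roughly what tail call optimization amounts to for this routine. `trash_loop` is a hand-written illustration of the idea, not the compiler's actual output: because the recursive call is the last thing `trash` does, it can be replaced by a jump back to the top, so no new stack frame is needed per level.

```c
/* Recursive form, as in the C analogue above. */
void trash(int n)
{
    if (n > 0) {
        trash(n - 1);
    }
}

/* Hand-written equivalent of the tail call optimized form:
   the self-call becomes a loop, so stack use no longer grows with n. */
void trash_loop(int n)
{
    while (n > 0) {
        n = n - 1;
    }
}
```

Both versions do the same (empty) work; only the recursive one needs stack space proportional to `n` when the optimization is not applied.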
Gale Paeper gpaeper@empirenet.com