On 08 Sep 2006, at 18:30, Waldek Hebisch wrote:
Jonas Maebe wrote:
On 8 sep 2006, at 16:57, Waldek Hebisch wrote:
Is there a particular reason for this? This means that e.g. a byte and a "packed array[0..7] of boolean" have a different size (except on 16 bit systems).
This is just an artifact of the implementation. Namely, all accesses must be properly aligned (otherwise we get wrong code or crashes on some platforms, such as ARM and Sparc).
Yes, that's why FPC also always aligns packed arrays to the access size.
gpc chose half of the wordsize as the access size and always fetches two consecutive halfwords. I guess that a fixed size is used for simplicity of implementation, and half of the wordsize to avoid double shifts.
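To make the "fetch two consecutive halfwords" scheme concrete, here is a rough sketch in C rather than Pascal, so that the shifts and masks are visible. It is not gpc's actual code. The function name `packed_get_halfword`, the LSB-first bit order, the 16-bit halfword, and the spare halfword at the end of the storage are all assumptions made for the illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper (not taken from gpc): read element `index` from a
   bit-packed array whose elements are `width` bits wide (1 <= width <= 16).
   Assumptions: bits are numbered LSB-first, the access size is a 16-bit
   halfword, and the storage has one spare halfword at the end so that
   storage[half + 1] is always readable. */
static uint32_t packed_get_halfword(const uint16_t *storage,
                                    size_t index, unsigned width)
{
    size_t   bitpos = index * width;           /* absolute bit position   */
    size_t   half   = bitpos / 16;             /* first halfword to fetch */
    unsigned shift  = (unsigned)(bitpos % 16); /* offset inside it        */

    /* Always fetch two consecutive halfwords and glue them into one
       32-bit window; shift + width <= 31, so a single shift suffices. */
    uint32_t window = (uint32_t)storage[half]
                    | ((uint32_t)storage[half + 1] << 16);
    uint32_t mask   = (1u << width) - 1u;

    return (window >> shift) & mask;
}
```

With a 32-bit window built from two halfwords, the extraction needs only one single-precision shift, which is presumably the point of picking half the wordsize.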
It currently already uses double shifts along with several conditional jumps when using a variable index, even if you're accessing a packed array of boolean (or equivalent). The code generated for this:
  type
    ta = 0..1;
    tb = packed array[0..999] of ta;
    tc = array[0..124] of byte;
  const
    results: array[0..9] of ta = (1,0,1,1,1,0,1,1,1,0);
  var
    a: ta;
    b: tb;
    i,j: integer;
  begin
    ...
    for i := low(results) to high(results) do
      if b[i] <> results[i] then
        error(7);
  end;
is this on x86 (with -O2, note that it moved around several blocks, but only the code under L48 is part of the loop control; the rest is all for the if-test and calling the error procedure):
.L85:
        movl    -156(%ebp), %ecx
        andl    $15, %ecx
.L81:
        shrdl   %edi, %esi
        shrl    %cl, %edi
        testb   $32, %cl
        je      .L52
        movl    %edi, %esi
.L52:
        movl    -156(%ebp), %ebx
        movl    %esi, %eax
        andl    $1, %eax
        cmpl    static_Results_0(,%ebx,4), %eax
        je      .L48
        movl    $7, (%esp)
        call    _p__M0_S0_Error
.L48:
        cmpl    $8, -156(%ebp)
        jg      .L47
        incl    -156(%ebp)
.L46:
        movl    -156(%ebp), %esi
        xorl    %edx, %edx
        xorl    %ebx, %ebx
        movl    %ebx, %edi
        sarl    $4, %esi
        movzwl  -150(%ebp,%esi,2), %eax
        movzwl  -152(%ebp,%esi,2), %ecx
        shldl   $16, %eax, %edx
        movl    %ecx, %esi
        sall    $16, %eax
        orl     %eax, %esi
        movl    -156(%ebp), %eax
        orl     %edx, %edi
        testl   %eax, %eax
        jns     .L85
        movl    -156(%ebp), %eax
        negl    %eax
        andl    $15, %eax
        je      .L52
        movl    $16, %ecx
        subl    %eax, %ecx
        jmp     .L81
On ppc it's even a lot worse, because there a double precision shift requires a call to a helper routine.
This implementation has a number of problems. My plan was to change it. I would like to keep a fixed access size -- IMHO pure byte access is the most natural one. Access to the last byte must be conditional.
So you mean always alignment 1 and rounding up to byte size?
There is always a compromise between space (forcing bigger alignment) and the number of accesses needed to fetch a given value -- I do not think that we must limit the number of accesses to two.
I thought exactly that option was a nice compromise between speed and space, though. Especially since, when having bigger elements, the rounding up of the size can easily be made more coarse-grained without significantly impacting the total size of the array.
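As a rough illustration of the byte-access plan (fixed access size of one byte, the trailing byte fetched only when the element actually reaches into it, and no hard limit of two accesses), here is a sketch in C. It is not code from gpc or FPC. The name `packed_get_bytewise`, the LSB-first bit order, and the limit of 32 bits per element are assumptions made for the example.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper (not taken from gpc or FPC): read element `index`
   from a bit-packed array with `width`-bit elements (1 <= width <= 32),
   using byte accesses only. Assumption: bits are numbered LSB-first.
   Only bytes that hold part of the element are touched, so the storage
   can be rounded up to a whole byte with no extra padding. */
static uint32_t packed_get_bytewise(const uint8_t *storage,
                                    size_t index, unsigned width)
{
    size_t   bitpos = index * width;          /* absolute bit position */
    size_t   byte   = bitpos / 8;             /* first byte to fetch   */
    unsigned shift  = (unsigned)(bitpos % 8); /* offset inside it      */
    unsigned needed = shift + width;          /* bits to cover, <= 39  */

    uint64_t window = 0;
    unsigned got    = 0;

    /* Conditional accesses: fetch another byte only while the element
       is not yet fully covered. */
    while (got < needed) {
        window |= (uint64_t)storage[byte++] << got;
        got += 8;
    }

    uint64_t mask = ((uint64_t)1 << width) - 1;
    return (uint32_t)((window >> shift) & mask);
}
```

For a packed array of boolean this always performs exactly one byte access; wider elements cost more accesses, which is the space versus access-count trade-off discussed above.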
Jonas