Hello,
What is the size of this packed array in gpc on 64 bit systems:
type ta = 0..$7fffffff; tb = packed array[0..799] of ta;
Is it 3100 or 3104 bytes?
On a related note, can someone explain to me how to build gpc for Mac OS X/ppc64? powerpc64-apple-darwin8 doesn't seem to be a supported target, but maybe it's called differently?
Thanks,
Jonas
Hello,
What is the size of this packed array in gpc on 64 bit systems:
type ta = 0..$7fffffff; tb = packed array[0..799] of ta;
Is it 3100 or 3104 bytes?
It is 3100. The size is rounded up to the next multiple of half of the wordsize of the machine. On 64-bit machines the wordsize is 8 bytes, so the size is rounded up to a multiple of 4 bytes. Since 3100 is a multiple of 4, the rounding up is a no-op here. BTW: when accessing the last element, the current implementation will touch the 4-byte word beyond the end of the array (it is a bug ...).
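For reference, the arithmetic behind the two candidate answers can be sketched as below. This is only an illustration of the rounding rule just described; RoundUp and the constant names are made up for the example and are not part of gpc.

```pascal
program PackedSize;

const
  Elements = 800;
  BitsPerElement = 31;   { 0..$7fffffff needs 31 bits }

var
  RawBytes: Integer;

{ Hypothetical helper: rounds a byte count up to a multiple of Granule. }
function RoundUp(Bytes, Granule: Integer): Integer;
begin
  RoundUp := ((Bytes + Granule - 1) div Granule) * Granule
end;

begin
  { 800 * 31 bits = 24800 bits, which is exactly 3100 bytes }
  RawBytes := (Elements * BitsPerElement + 7) div 8;
  WriteLn('raw size in bytes:         ', RawBytes);
  { half of an 8-byte word: the rule described for 64-bit machines }
  WriteLn('rounded to 4-byte multiple: ', RoundUp(RawBytes, 4));
  { a full 8-byte word would give the other candidate answer }
  WriteLn('rounded to 8-byte multiple: ', RoundUp(RawBytes, 8))
end.
```

The three values come out as 3100, 3100 and 3104, so the answer depends only on whether the rounding granule is 4 or 8 bytes.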
On a related note, can someone explain me how to build gpc for Mac OS X/ppc64? powerpc64-apple-darwin8 doesn't seem to be a supported target, but maybe it's called differently?
You should already have a 64-bit compiler proper (at least when using a sufficiently new backend) -- try what `-print-multi-lib' gives you. The problem is that ATM the gpc runtime is built in only one version (the 32-bit one) -- one has to build the 64-bit runtime by hand. Peter Keenan described what to do for 64-bit AIX:
http://www.gnu-pascal.de/crystal/gpc/en/mail11264.html
With obvious changes this should apply to OS/X.
On 8 sep 2006, at 16:57, Waldek Hebisch wrote:
What is the size of this packed array in gpc on 64 bit systems:
type ta = 0..$7fffffff; tb = packed array[0..799] of ta;
Is it 3100 or 3104 bytes?
It is 3100. The size is rounded up to the next multiple of half of the wordsize of the machine.
Is there a particular reason for this? This means that e.g. a byte and a "packed array[0..7] of boolean" have a different size (except on 16 bit systems).
On 64-bit machines the wordsize is 8 bytes, so the size is rounded up to a multiple of 4 bytes. Since 3100 is a multiple of 4, the rounding up is a no-op here. BTW: when accessing the last element, the current implementation will touch the 4-byte word beyond the end of the array (it is a bug ...).
Will this be fixed by rounding up the size to a multiple of 8 bytes, or by turning all accesses into at most half-word accesses? I'd like to keep packed array sizes the same between fpc and gpc. Currently fpc rounds the size up to the access size, which is calculated as follows:
function packedbitsloadsize(bitlen: int64) : int64;
begin
  case bitlen of
    1,2,4,8:
      result := 1;
    { 10 bits can never be split over 3 bytes via 1-8-1, because it }
    { always starts at a multiple of 10 bits. Same for the others.  }
    3,5,7,9,10,12,16:
      result := 2;
{$ifdef cpu64bit}
    11,13,14,15,17..26,28,32:
      result := 4;
    else
      result := 8;
{$else cpu64bit}
    else
      result := 4;
{$endif cpu64bit}
  end;
end;
It basically chooses the smallest size possible which can encompass the entire value (although of course it sometimes still needs to be split into two accesses when the value crosses an access size boundary). In my tests (on 32-bit systems) this corresponded with the gpc sizes, but I didn't test exhaustively, so I guess that was just a coincidence.
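Applied to the array from the start of this thread, the rule works out as follows. This is a sketch only: LoadSize mirrors the quoted packedbitsloadsize, but the cpu64bit conditional is replaced by a Boolean parameter so that both cases fit in one program. That parameter and RoundUp are assumptions of the example, not FPC code.

```pascal
program FpcPackedSize;

const
  Elements = 800;
  BitLen = 31;          { 0..$7fffffff needs 31 bits }

var
  RawBytes: Integer;

{ Same table as packedbitsloadsize; Is64Bit stands in for cpu64bit. }
function LoadSize(Bits: Integer; Is64Bit: Boolean): Integer;
begin
  case Bits of
    1, 2, 4, 8:
      LoadSize := 1;
    3, 5, 7, 9, 10, 12, 16:
      LoadSize := 2;
    11, 13, 14, 15, 17..26, 28, 32:
      LoadSize := 4;
  else
    if Is64Bit then
      LoadSize := 8
    else
      LoadSize := 4
  end
end;

{ Hypothetical helper: rounds a byte count up to a multiple of Granule. }
function RoundUp(Bytes, Granule: Integer): Integer;
begin
  RoundUp := ((Bytes + Granule - 1) div Granule) * Granule
end;

begin
  RawBytes := (Elements * BitLen + 7) div 8;   { 3100 }
  WriteLn('32-bit target: ', RoundUp(RawBytes, LoadSize(BitLen, False)));
  WriteLn('64-bit target: ', RoundUp(RawBytes, LoadSize(BitLen, True)))
end.
```

With 31-bit elements the access size is 4 on a 32-bit target and 8 on a 64-bit one, so the sizes come out as 3100 and 3104 bytes respectively, which is where the 3100-or-3104 question comes from.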
Fortunately the bit layout is independent of the access size, so at least on that point there are no incompatibilities.
On a related note, can someone explain me how to build gpc for Mac OS X/ppc64? powerpc64-apple-darwin8 doesn't seem to be a supported target, but maybe it's called differently?
You should already have 64 bit compiler proper (at least when using sufficiently new backend) -- try what `-print-multi-lib' gives you.
bigmac:~ jonas$ gpc -v
Reading specs from /Developer/Pascal/gpc345u2/lib/gcc/powerpc-apple-darwin8/3.4.5/specs
Configured with: ../gcc-3.4.5/configure --enable-languages=pascal,c --enable-threads=posix --target=powerpc-apple-darwin8 --host=powerpc-apple-darwin8 --build=powerpc-apple-darwin8 --prefix=/Developer/Pascal/gpc345u2
Thread model: posix
gpc version 20051116, based on gcc-3.4.5
bigmac:~ jonas$ gpc -print-multi-lib
.;
(this is Adriaan's Mac OS X distribution)
And at least "gpc -m64" doesn't work:
bigmac:~/fpc/test jonas$ gpc -m64 tt.pp
gpc1: error: invalid option `64'
The problem is that ATM the gpc runtime is built in only one version (the 32-bit one) -- one has to build the 64-bit runtime by hand. Peter Keenan described what to do for 64-bit AIX:
http://www.gnu-pascal.de/crystal/gpc/en/mail11264.html
With obvious changes this should apply to OS/X.
I think I don't have a 64 bit gpc.
Jonas
Jonas Maebe wrote:
On 8 sep 2006, at 16:57, Waldek Hebisch wrote:
What is the size of this packed array in gpc on 64 bit systems:
type ta = 0..$7fffffff; tb = packed array[0..799] of ta;
Is it 3100 or 3104 bytes?
It is 3100. The size is rounded up to the next multiple of half of the wordsize of the machine.
Is there a particular reason for this? This means that e.g. a byte and a "packed array[0..7] of boolean" have a different size (except on 16 bit systems).
This is just an artifact of the implementation. Namely, all accesses must be properly aligned (otherwise we get wrong code or crashes on some platforms (ARM, Sparc)). gpc chose half of the wordsize as the access size and always fetches two consecutive halfwords. I guess that a fixed size is used for simplicity of implementation, and half of the wordsize to avoid double shifts.
This implementation has a number of problems. My plan was to change it. I would like to keep a fixed access size -- IMHO pure byte access is the most natural one. Access to the last byte must be conditional.
There is always a compromise between space (forcing bigger alignment) and the number of accesses needed to fetch a given value -- I do not think that we must limit the number of accesses to two.
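To see why always fetching two consecutive halfwords overruns the array from the original question, one can work out which halfwords the last element touches. This is a sketch under the assumption that the halfword is 4 bytes (a 64-bit machine) and that element i starts at bit i * 31; the identifiers are invented for the example and do not come from the gpc sources.

```pascal
program OverrunCheck;

const
  Elements = 800;
  BitLen = 31;
  HalfWordBits = 32;   { half of a 64-bit word }

var
  TotalHalfWords, LastBit, First, Second: Integer;

begin
  { 24800 bits fill exactly 775 halfwords, numbered 0..774 }
  TotalHalfWords := (Elements * BitLen + HalfWordBits - 1) div HalfWordBits;
  { first bit of element 799 }
  LastBit := (Elements - 1) * BitLen;
  { the halfword holding that bit, and the one fetched along with it }
  First := LastBit div HalfWordBits;
  Second := First + 1;
  WriteLn('halfwords in the array: 0..', TotalHalfWords - 1);
  WriteLn('last element fetches halfwords ', First, ' and ', Second);
  if Second > TotalHalfWords - 1 then
    WriteLn('the second fetch lies beyond the end of the array')
end.
```

Element 799 starts at bit 24769, i.e. inside halfword 774, and fits there completely, yet the unconditional second fetch reads halfword 775, which is the 4-byte word just past the 3100 bytes of the array.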
On 64-bit machines the wordsize is 8 bytes, so the size is rounded up to a multiple of 4 bytes. Since 3100 is a multiple of 4, the rounding up is a no-op here. BTW: when accessing the last element, the current implementation will touch the 4-byte word beyond the end of the array (it is a bug ...).
Will this be fixed by rounding up the size to a multiple of 8 bytes, or by turning all accesses into at most half-word accesses? I'd like to keep packed array sizes the same between fpc and gpc. Currently fpc rounds the size up to the access size, which is calculated as follows:
function packedbitsloadsize(bitlen: int64) : int64;
begin
  case bitlen of
    1,2,4,8:
      result := 1;
    { 10 bits can never be split over 3 bytes via 1-8-1, because it }
    { always starts at a multiple of 10 bits. Same for the others.  }
    3,5,7,9,10,12,16:
      result := 2;
{$ifdef cpu64bit}
    11,13,14,15,17..26,28,32:
      result := 4;
    else
      result := 8;
{$else cpu64bit}
    else
      result := 4;
{$endif cpu64bit}
  end;
end;
It basically chooses the smallest size possible which can encompass the entire value (although of course it sometimes still needs to be split into two accesses when the value crosses an access size boundary). In my tests (on 32-bit systems) this corresponded with the gpc sizes, but I didn't test exhaustively, so I guess that was just a coincidence.
Yes, gpc uses the same access size for all bit lengths. There is always a compromise between space (forcing bigger alignment) and the number of accesses needed to fetch a given value.
On a related note, can someone explain me how to build gpc for Mac OS X/ppc64? powerpc64-apple-darwin8 doesn't seem to be a supported target, but maybe it's called differently?
You should already have 64 bit compiler proper (at least when using sufficiently new backend) -- try what `-print-multi-lib' gives you.
bigmac:~ jonas$ gpc -v
Reading specs from /Developer/Pascal/gpc345u2/lib/gcc/powerpc-apple-darwin8/3.4.5/specs
Configured with: ../gcc-3.4.5/configure --enable-languages=pascal,c --enable-threads=posix --target=powerpc-apple-darwin8 --host=powerpc-apple-darwin8 --build=powerpc-apple-darwin8 --prefix=/Developer/Pascal/gpc345u2
Thread model: posix
gpc version 20051116, based on gcc-3.4.5
bigmac:~ jonas$ gpc -print-multi-lib
.;
(this is Adriaan's Mac OS X distribution)
And at least "gpc -m64" doesn't work:
bigmac:~/fpc/test jonas$ gpc -m64 tt.pp
gpc1: error: invalid option `64'
The problem is that ATM the gpc runtime is built in only one version (the 32-bit one) -- one has to build the 64-bit runtime by hand. Peter Keenan described what to do for 64-bit AIX:
http://www.gnu-pascal.de/crystal/gpc/en/mail11264.html
With obvious changes this should apply to OS/X.
I think I don't have a 64 bit gpc.
Yes, the 3.4.x backend probably supports only 32-bit programs on OS X. AFAICS the 4.0.x and later backends have 64-bit support.
Waldek Hebisch wrote:
On a related note, can someone explain me how to build gpc for Mac OS X/ppc64? powerpc64-apple-darwin8 doesn't seem to be a supported target, but maybe it's called differently?
You should already have a 64-bit compiler proper (at least when using a sufficiently new backend) -- try what `-print-multi-lib' gives you. The problem is that ATM the gpc runtime is built in only one version (the 32-bit one) -- one has to build the 64-bit runtime by hand. Peter Keenan described what to do for 64-bit AIX:
http://www.gnu-pascal.de/crystal/gpc/en/mail11264.html
With obvious changes this should apply to OS/X.
You may want to have a look at this thread http://www.gnu-pascal.de/crystal/gpc/en/thread13468.html in the mailing list archives. The main points are:
* for Mac OS X, you need a gcc-4 back-end (which gpc supports on a preliminary basis)
* you have to manually build a 64-bit libgpc.a and combine it with the 32-bit libgpc.a, using lipo
* then, you can use -m64 to create a 64-bit executable.
I will be pleased to help with problems that may arise (I have built for 64-bit powerpc, not yet for 64-bit x86).
Regards,
Adriaan van Os
On 08 Sep 2006, at 18:30, Waldek Hebisch wrote:
Jonas Maebe wrote:
On 8 sep 2006, at 16:57, Waldek Hebisch wrote:
Is there a particular reason for this? This means that e.g. a byte and a "packed array[0..7] of boolean" have a different size (except on 16 bit systems).
This is just an artifact of the implementation. Namely, all accesses must be properly aligned (otherwise we get wrong code or crashes on some platforms (ARM, Sparc)).
Yes, that's why FPC also always aligns packed arrays to the access size.
gpc chose half of the wordsize as the access size and always fetches two consecutive halfwords. I guess that a fixed size is used for simplicity of implementation, and half of the wordsize to avoid double shifts.
GPC currently already uses double shifts along with several conditional jumps when using a variable index, even if you're accessing a packed array of boolean (or equivalent). The code generated for this:
type
  ta = 0..1;
  tb = packed array[0..999] of ta;
  tc = array[0..124] of byte;
const
  results: array[0..9] of ta = (1,0,1,1,1,0,1,1,1,0);
var
  a: ta;
  b: tb;
  i,j: integer;
begin
  ...
  for i := low(results) to high(results) do
    if b[i] <> results[i] then
      error(7);
end;
is this on x86 (with -O2, note that it moved around several blocks, but only the code under L48 is part of the loop control; the rest is all for the if-test and calling the error procedure):
.L85:
        movl    -156(%ebp), %ecx
        andl    $15, %ecx
.L81:
        shrdl   %edi, %esi
        shrl    %cl, %edi
        testb   $32, %cl
        je      .L52
        movl    %edi, %esi
.L52:
        movl    -156(%ebp), %ebx
        movl    %esi, %eax
        andl    $1, %eax
        cmpl    static_Results_0(,%ebx,4), %eax
        je      .L48
        movl    $7, (%esp)
        call    _p__M0_S0_Error
.L48:
        cmpl    $8, -156(%ebp)
        jg      .L47
        incl    -156(%ebp)
.L46:
        movl    -156(%ebp), %esi
        xorl    %edx, %edx
        xorl    %ebx, %ebx
        movl    %ebx, %edi
        sarl    $4, %esi
        movzwl  -150(%ebp,%esi,2), %eax
        movzwl  -152(%ebp,%esi,2), %ecx
        shldl   $16, %eax, %edx
        movl    %ecx, %esi
        sall    $16, %eax
        orl     %eax, %esi
        movl    -156(%ebp), %eax
        orl     %edx, %edi
        testl   %eax, %eax
        jns     .L85
        movl    -156(%ebp), %eax
        negl    %eax
        andl    $15, %eax
        je      .L52
        movl    $16, %ecx
        subl    %eax, %ecx
        jmp     .L81
On ppc it's even a lot worse, because there a double precision shift requires a call to a helper routine.
This implementation has a number of problems. My plan was to change it. I would like to keep fixed access size -- IMHO pure byte access is the most natural one. Access to the last byte must be conditional.
So you mean always alignment 1 and rounding up to byte size?
There is always a compromise between space (forcing bigger alignment) and the number of accesses needed to fetch a given value -- I do not think that we must limit the number of accesses to two.
I thought that exactly that option was a nice compromise between speed and space, though. Especially since with bigger elements, the rounding up of the size can easily be made more coarse-grained without significantly impacting the total size of the array.
Jonas
Jonas Maebe wrote:
On 08 Sep 2006, at 18:30, Waldek Hebisch wrote:
gpc chose half of the wordsize as the access size and always fetches two consecutive halfwords. I guess that a fixed size is used for simplicity of implementation, and half of the wordsize to avoid double shifts.
GPC currently already uses double shifts along with several conditional jumps when using a variable index, even if you're accessing a packed array of boolean (or equivalent). The code generated for this:
Well, the current scheme was introduced many years ago. I do not say that the goals were attained. I am just trying to reconstruct the rationale.
This implementation has a number of problems. My plan was to change it. I would like to keep fixed access size -- IMHO pure byte access is the most natural one. Access to the last byte must be conditional.
So you mean always alignment 1 and rounding up to byte size?
Yes.
There is always a compromise between space (forcing bigger alignment) and the number of accesses needed to fetch a given value -- I do not think that we must limit the number of accesses to two.
I thought that exactly that option was a nice compromise between speed and space, though. Especially since with bigger elements, the rounding up of the size can easily be made more coarse-grained without significantly impacting the total size of the array.
Yes. However, when packing 63-bit elements on a 32-bit machine (or 127-bit elements on a 64-bit machine), using two accesses causes trouble -- the gpc implementation would need quadruple precision shifts.
Also, I would like to have the possibility to implement packed array access via a call to runtime support -- using different sizes requires more support code.
I must say that my idea to use byte access is mainly motivated by simplicity of implementation. Using two accesses and a varying access size is probably better, but requires (slightly) more work. gpc had a number of bugs in its packed array implementation, so I tried to find the simplest correct way...
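A pure byte-access scheme of the kind meant here could look roughly like the sketch below. It is not gpc's code: the bit numbering (element i occupies bits i * BitLen upwards, least significant bit first within each byte), the 10-bit element width and all identifiers are assumptions chosen for the illustration. The accessors go one bit at a time, which is slow but makes it easy to see that no byte outside the array is ever touched.

```pascal
program ByteAccess;

const
  BitLen = 10;        { bits per element, chosen for the example }
  LastIdx = 7;        { 8 elements * 10 bits = 80 bits = 10 bytes }
  LastByte = 9;

type
  TBuf = array[0..LastByte] of Byte;

var
  Buf: TBuf;
  i: Integer;
  Ok: Boolean;

{ Stores Elem in slot Idx, touching only the bytes that hold its bits. }
procedure SetElement(var B: TBuf; Idx: Integer; Elem: LongInt);
var
  k, BitNo: Integer;
  Mask: Byte;
begin
  for k := 0 to BitLen - 1 do
  begin
    BitNo := Idx * BitLen + k;
    Mask := 1 shl (BitNo mod 8);
    if Odd(Elem) then
      B[BitNo div 8] := B[BitNo div 8] or Mask
    else
      B[BitNo div 8] := B[BitNo div 8] and (255 - Mask);
    Elem := Elem div 2
  end
end;

{ Reads slot Idx back, again using byte accesses only. }
function GetElement(var B: TBuf; Idx: Integer): LongInt;
var
  k, BitNo: Integer;
  Elem, Weight: LongInt;
begin
  Elem := 0;
  Weight := 1;
  for k := 0 to BitLen - 1 do
  begin
    BitNo := Idx * BitLen + k;
    if Odd(B[BitNo div 8] shr (BitNo mod 8)) then
      Elem := Elem + Weight;
    Weight := Weight * 2
  end;
  GetElement := Elem
end;

begin
  for i := 0 to LastByte do
    Buf[i] := 0;
  for i := 0 to LastIdx do
    SetElement(Buf, i, 100 * i + 3);
  Ok := True;
  for i := 0 to LastIdx do
    if GetElement(Buf, i) <> 100 * i + 3 then
      Ok := False;
  if Ok then
    WriteLn('all elements read back correctly')
  else
    WriteLn('mismatch')
end.
```

The last element ends exactly at the last bit of byte 9, so the array needs no padding and alignment 1 is enough. A real implementation would presumably fetch whole bytes and shift, rather than loop over single bits.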
On 09 Sep 2006, at 23:25, Waldek Hebisch wrote:
Yes. However, when packing 63-bit elements on a 32-bit machine (or 127-bit elements on a 64-bit machine), using two accesses causes trouble -- the gpc implementation would need quadruple precision shifts.
I admit I did not consider that since FPC only supports subranges up to the native word size (so larger ordinal types will never be packed).
Also, I would like to have possibility to implement packed array access via call to runtime support -- using different sizes requires more support code.
It doesn't, fortunately. The way the data is stored in memory means that you can access it using any size you want and still get the same results. FPC and GPC currently use different access sizes, yet the data has the same memory layout for both compilers (on both big and little endian systems).
I must say that my idea to use byte access is mainly motivated by simplicity of implementation. Using two accesses and a varying access size is probably better, but requires (slightly) more work. gpc had a number of bugs in its packed array implementation, so I tried to find the simplest correct way...
The issue is not really the access size as explained above, but the (currently consequent) size of the entire packed array (as the subject indicates :) One solution might be to decouple them, although it's of course space-wasting to round up the size of a packed array to a multiple of 4 bytes if you're only going to use byte accesses anyway.
Jonas
Jonas Maebe wrote:
On 09 Sep 2006, at 23:25, Waldek Hebisch wrote:
Yes. However, when packing 63-bit elements on a 32-bit machine (or 127-bit elements on a 64-bit machine), using two accesses causes trouble -- the gpc implementation would need quadruple precision shifts.
I admit I did not consider that since FPC only supports subranges up to the native word size (so larger ordinal types will never be packed).
I do not know if I really want to bit-pack subranges bigger than the word size. But currently gpc packs them (the results are bogus...).
Also, I would like to have possibility to implement packed array access via call to runtime support -- using different sizes requires more support code.
It doesn't, fortunately. The way the data is stored in memory means that you can access it using any size you want and still get the same results. FPC and GPC currently use different access sizes, yet the data has the same memory layout for both compilers (on both big and little endian systems).
For gpc this is more complicated: gpc generates C-like code which then goes through the gcc optimizers. Normally the optimizers assume that memory is accessed only through pointers of one type (hence one size), so there is extra work involved to avoid mis-optimization. ATM for runtime calls this problem is only theoretical, but the gcc developers are currently working on global optimization (spanning translation units) and the problem may quickly become real.
Also, using different access size risks processor stalls on write buffers -- I am not sure if we care about such stalls when optimizing for space, but I would prefer to avoid them.
I must say that my idea to use byte access is mainly motivated by simplicity of implementation. Using two accesses and a varying access size is probably better, but requires (slightly) more work. gpc had a number of bugs in its packed array implementation, so I tried to find the simplest correct way...
The issue is not really the access size as explained above, but the (currently consequent) size of the entire packed array (as the subject indicates :) One solution might be to decouple them, although it's of course space-wasting to round up the size of a packed array to a multiple of 4 bytes if you're only going to use byte accesses anyway.
Size and alignment. The following record:
type
  pr = packed record
    b : boolean;
    a : packed array [0..1] of 0..1000000
  end;
naively should take 7 bytes, but with word access it will need 12 bytes. And a program which interprets the record on disk must know the alignment to find the data.
Since memory usage grows due to both size and alignment, I am reluctant to use a layout allowing a bigger access size without special need.
On 10 Sep 2006, at 01:24, Waldek Hebisch wrote:
For gpc this is more complicated: gpc generates C-like code which then goes through the gcc optimizers. Normally the optimizers assume that memory is accessed only through pointers of one type (hence one size), so there is extra work involved to avoid mis-optimization.
That's only if you use -fstrict-aliasing afaik, and is only an issue here if a packed array is typecasted to different packed array types which use a different access size. Besides, users can also typecast it to a regular array (e.g. of integer) with a different access size, so you have to take this into account anyway.
Further, accessing memory with different sizes can happen in a lot of different ways (e.g. someone typecasts one array type to another, one record type into another, or even simply a variant record with two or more overlapping array fields). Did you have to add special code to avoid gcc mis-optimizing all those cases?
Also, using different access size risks processor stalls on write buffers -- I am not sure if we care about such stalls when optimizing for space, but I would prefer to avoid them.
Such stalls only occur if you are accessing the same data using different access sizes, right? I don't think this is an issue, because a) GPC will always use the same access size for a particular kind of packed array, and so will FPC b) when typecasting one kind of packed array to another you may get different access sizes, but then again that will also happen if you typecast it to something else than a packed array (e.g. an array of integer or so).
The issue is not really the access size as explained above, but the (currently consequent) size of the entire packed array (as the subject indicates :) One solution might be to decouple them, although it's of course space-wasting to round up the size of a packed array to a multiple of 4 bytes if you're only going to use byte accesses anyway.
Size and alignment. The following record:
type
  pr = packed record
    b : boolean;
    a : packed array [0..1] of 0..1000000
  end;
naively should take 7 bytes, but with word access it will need 12 bytes. And a program which interprets the record on disk must know the alignment to find the data.
I think packed records are a special case: they should always result in minimal alignment (that's the whole point of using packed records). So I would not align that array either, and FPC will also use byte-sized accesses there (not yet implemented) and an alignment of one (already the case currently).
A special case of also having the size rounded up to the next byte is technically a little more difficult (in case it would normally be a multiple of 4 bytes), but doable if considered really necessary.
Jonas
On 10 Sep 2006, at 01:24, Waldek Hebisch wrote:
For gpc this is more complicated: gpc generates C-like code which then goes through the gcc optimizers. Normally the optimizers assume that memory is accessed only through pointers of one type (hence one size), so there is extra work involved to avoid mis-optimization.
That's only if you use -fstrict-aliasing afaik, and is only an issue here if a packed array is typecasted to different packed array types which use a different access size. Besides, users can also typecast it to a regular array (e.g. of integer) with a different access size, so you have to take this into account anyway.
Further, accessing memory with different sizes can happen in a lot of different ways (e.g. someone typecasts one array type to another, one record type into another, or even simply a variant record with two or more overlapping array fields). Did you have to add special code to avoid gcc mis-optimizing all those cases?
ATM gpc simply uses the low-level equivalent of -fno-strict-aliasing (strict aliasing is the default). But AFAIK aliasing optimizations have a significant effect (10-20% on SPEC), so I want to turn them on. Yes, I will need special code to avoid mis-optimizing casts (or declare them illegal, like the C folks did).
Also, using different access size risks processor stalls on write buffers -- I am not sure if we care about such stalls when optimizing for space, but I would prefer to avoid them.
Such stalls only occur if you are accessing the same data using different access sizes, right? I don't think this is an issue, because a) GPC will always use the same access size for a particular kind of packed array, and so will FPC
My idea was to use runtime calls when optimizing for space and inline access when optimizing for speed. Different translation units may use different optimization options -- but access the same packed array.
So: there is no problem having different access sizes for different arrays, but there is a problem when the runtime support uses a different access size than the inline code.
The issue is not really the access size as explained above, but the (currently consequent) size of the entire packed array (as the subject indicates :) One solution might be to decouple them, although it's of course space-wasting to round up the size of a packed array to a multiple of 4 bytes if you're only going to use byte accesses anyway.
Size and alignment. The following record:
type
  pr = packed record
    b : boolean;
    a : packed array [0..1] of 0..1000000
  end;
naively should take 7 bytes, but with word access it will need 12 bytes. And a program which interprets the record on disk must know the alignment to find the data.
I think packed records are a special case: they should always result in minimal alignment (that's the whole point of using packed records). So I would not align that array either, and FPC will also use byte-sized accesses there (not yet implemented) and an alignment of one (already the case currently).
A special case of also having the size rounded up to the next byte is technically a little more difficult (in case it would normally be a multiple of 4 bytes), but doable if considered really necessary.
Well, ATM gpc uses normal alignment for some types which we consider too troublesome to pack (for example files and schemata). We use byte alignment for packed arrays inside packed records, but to generate correct code we should use either full alignment or byte access.
I agree that special casing packed arrays inside packed records is better, just requires more effort to handle correctly.
Jonas Maebe wrote:
On 10 Sep 2006, at 01:24, Waldek Hebisch wrote:
For gpc this is more complicated: gpc generates C-like code which then goes through the gcc optimizers. Normally the optimizers assume that memory is accessed only through pointers of one type (hence one size), so there is extra work involved to avoid mis-optimization.
That's only if you use -fstrict-aliasing afaik, and is only an issue here if a packed array is typecasted to different packed array types which use a different access size. Besides, users can also typecast it to a regular array (e.g. of integer) with a different access size, so you have to take this into account anyway.
Further, accessing memory with different sizes can happen in a lot of different ways (e.g. someone typecasts one array type to another, one record type into another, or even simply a variant record with two or more overlapping array fields). Did you have to add special code to avoid gcc mis-optimizing all those cases?
Also, using different access size risks processor stalls on write buffers -- I am not sure if we care about such stalls when optimizing for space, but I would prefer to avoid them.
Such stalls only occur if you are accessing the same data using different access sizes, right? I don't think this is an issue, because a) GPC will always use the same access size for a particular kind of packed array, and so will FPC b) when typecasting one kind of packed array to another you may get different access sizes, but then again that will also happen if you typecast it to something else than a packed array (e.g. an array of integer or so).
The issue is not really the access size as explained above, but the (currently consequent) size of the entire packed array (as the subject indicates :) One solution might be to decouple them, although it's of course space-wasting to round up the size of a packed array to a multiple of 4 bytes if you're only going to use byte accesses anyway.
Size and alignment. The following record:
type
  pr = packed record
    b : boolean;
    a : packed array [0..1] of 0..1000000
  end;
naively should take 7 bytes, but with word access it will need 12 bytes. And a program which interprets the record on disk must know the alignment to find the data.
I think packed records are a special case: they should always result in minimal alignment (that's the whole point of using packed records). So I would not align that array either, and FPC will also use byte-sized accesses there (not yet implemented) and an alignment of one (already the case currently).
A special case of also having the size rounded up to the next byte is technically a little more difficult (in case it would normally be a multiple of 4 bytes), but doable if considered really necessary.
I don't understand why there is a problem. Packed array components are not directly accessible in any case, they have to be processed through the standard procedures pack and unpack anyhow. Those functions should handle any peculiarities. In addition, the 'packed' attribute does not have to do any actual packing, it only indicates a willingness to have the variable packed, and thus to need pack/unpack processing.
Somebody removed attributions, so the portion marked with >>> above comes from some unknown contributor. Please don't remove attributions for quoted material.
On 10 Sep 2006, at 13:11, CBFalconer wrote:
I don't understand why there is a problem. Packed array components are not directly accessible in any case, they have to be processed through the standard procedures pack and unpack anyhow.
Packed array components are directly accessible by indexing the packed array like a regular array (at least in both GPC and FPC). And I think it's even required for ISO Pascal since otherwise you would not be able to access the individual characters of a string by indexing strings, as I understand it (since their contents consists of a packed array of char).
Jonas
Waldek Hebisch wrote:
Further, accessing memory with different sizes can happen in a lot of different ways (e.g. someone typecasts one array type to another, one record type into another, or even simply a variant record with two or more overlapping array fields).
The latter is explicitly forbidden in ISO Pascal. The fact that some compilers (including GPC, ATM) don't check for such errors may be considered a bug (or omission) rather than a feature ...
IME, almost all cases of such type-casting are either to or from an unstructured block of memory (untyped pointer, array of Byte, etc.). AFAIK, C's aliasing rules explicitly allow for such cases (i.e., casting to "[unsigned] char *" in C terms).
But AFAIK aliasing optimizations have significant effect (10-20% on SPEC), so I want to turn them on. Yes, I will need special code to avoid mis-optimizing casts (or declare them illegal, like C folks did).
I also think such optimizations can be worth adding some special code. Perhaps we could make an option to turn strict aliasing on (probably with a special case for "unstructured" memory, as above, which the backend may already do automatically, because of C), so programmers could use it and care about their special cases (if any) explicitly. Of course, we'd have to check the RTS etc. for such special cases (which are more likely to be there than in average code), so we probably can't just turn it on now.
Well, the current scheme [half-word access] was introduced many years ago. I do not say that the goals were attained. I am just trying to reconstruct the rationale.
It was even before I started working on GPC, so that's also just AFAIK, but I think it was intended just as a temporary Q&D solution (obviously not taking into account element sizes) until the backend would support packed arrays like it does packed records. This hasn't happened (and it seems it never will, or do you have other information, Waldek?), so we still have the Q&D way.
Yes. However, when packing 63-bit elements on a 32-bit machine (or 127-bit elements on a 64-bit machine), using two accesses causes trouble -- the gpc implementation would need quadruple precision shifts.
I think it could also be done with rotations, but it would probably require a bit of code ...
I do not know if I really want to bit-pack subranges bigger than the word size. But currently gpc packs them (the results are bogus...).
Depending on the implementation ultimately chosen, it might not be worth it, indeed. Though we could pack some "harmless" types (I remember someone once requiring about 24 bit packed types, for reasons of range vs. memory, and these types currently work), and we should probably warn about types we don't pack despite an explicit "packed" ...
Jonas Maebe wrote:
On 10 Sep 2006, at 13:11, CBFalconer wrote:
I don't understand why there is a problem. Packed array components are not directly accessible in any case, they have to be processed through the standard procedures pack and unpack anyhow.
Packed array components are directly accessible by indexing the packed array like a regular array (at least in both GPC and FPC). And I think it's even required for ISO Pascal since otherwise you would not be able to access the individual characters of a string by indexing strings, as I understand it (since their contents consists of a packed array of char).
That's right. ISO defines Pack and Unpack, but it's a misconception that packed arrays can be accessed only through these routines. There are a few restrictions, such as quoted below, but direct access in general is not forbidden.
: 6.7.3.3 Variable parameters
:
: [...] An actual variable parameter shall not denote a component of a
: variable where that variable possesses a type that is designated
: packed. [...]
:
: 6.7.3.7.3 Variable conformant arrays
:
: [...] An actual-parameter shall not denote a component of a variable
: where that variable possesses a type that is designated packed.
Frank