On 10 Sep 2006, at 01:24, Waldek Hebisch wrote:
For gpc this is more complicated: gpc generates C-like code which then goes through the gcc optimizers. Normally the optimizers assume that memory is accessed only through pointers of one type (hence one size), so there is extra work involved to avoid mis-optimization.
That's only if you use -fstrict-aliasing afaik, and is only an issue here if a packed array is typecast to different packed array types which use a different access size. Besides, users can also typecast it to a regular array (e.g. of integer) with a different access size, so you have to take this into account anyway.
Further, accessing memory with different sizes can happen in a lot of different ways (e.g. someone typecasts one array type to another, one record type into another, or even simply a variant record with two or more overlapping array fields). Did you have to add special code to avoid gcc mis-optimizing all those cases?
ATM gpc simply uses the low-level equivalent of -fno-strict-aliasing (strict aliasing is the default). But AFAIK aliasing optimizations have a significant effect (10-20% on SPEC), so I want to turn them on. Yes, I will need special code to avoid mis-optimizing casts (or declare them illegal, like the C folks did).
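As an illustration of what "special code" for casts can look like at the C level, here is a minimal sketch. It is not GPC's actual code generation, and the helper names (`load_u32`, `store_u16`, `halves_match_word`) are made up for this example. Copying through `memcpy` is the standard C way to reinterpret bytes without breaking the strict-aliasing rule, and gcc normally turns a small fixed-size `memcpy` into a plain load or store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of GPC or FPC. */

/* Read 32 bits starting at byte offset `off` without casting the
   buffer to uint32_t* (which would violate strict aliasing). */
static uint32_t load_u32(const unsigned char *buf, size_t off)
{
    uint32_t v;
    assert(buf != NULL);
    memcpy(&v, buf + off, sizeof v);
    return v;
}

/* Write 16 bits at byte offset `off` in the same aliasing-safe way. */
static void store_u16(unsigned char *buf, size_t off, uint16_t v)
{
    assert(buf != NULL);
    memcpy(buf + off, &v, sizeof v);
}

/* Store two 16-bit halves, then read the same four bytes back as one
   32-bit word: the "different access size" case from the thread.
   Returns 1 if the word read back has the expected byte pattern. */
static int halves_match_word(uint16_t a, uint16_t b)
{
    unsigned char buf[4];
    uint16_t parts[2];
    uint32_t expect;

    parts[0] = a;
    parts[1] = b;
    memcpy(&expect, parts, sizeof expect);  /* reference byte pattern */

    store_u16(buf, 0, a);
    store_u16(buf, 2, b);
    return load_u32(buf, 0) == expect;
}
```

Because every access goes through `unsigned char` storage and `memcpy`, the optimizer may not assume the 16-bit stores and the 32-bit load are unrelated, with or without -fstrict-aliasing.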
Also, using different access sizes risks processor stalls on write buffers -- I am not sure if we care about such stalls when optimizing for space, but I would prefer to avoid them.
Such stalls only occur if you are accessing the same data using different access sizes, right? I don't think this is an issue, because a) GPC will always use the same access size for a particular kind of packed array, and so will FPC
My idea was to use runtime calls when optimizing for space and inline access when optimizing for speed. Different translation units may use different optimization options -- but access the same packed array.
So: there is no problem having different access sizes for different arrays, but there is a problem when the runtime support uses a different access size than the inline code.
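The mismatch can be made concrete with a small sketch. The helpers below are hypothetical (they are not GPC's runtime interface), and the LSB-first bit order within each byte is an assumption chosen for simplicity. `packed_get` and `packed_set` use only byte-sized memory accesses, as a space-optimized runtime routine might, while `bytes_touched` computes how much storage an accessor needs if it always moves whole units of a given size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not GPC's runtime API. */

/* Read element `idx` of a bit-packed array with `bits`-wide elements
   (1..32), using only byte-sized memory accesses. Bits are numbered
   LSB-first within each byte (an assumption of this sketch). */
static uint32_t packed_get(const unsigned char *a, size_t idx, unsigned bits)
{
    size_t bit = idx * bits;
    uint32_t v = 0;
    unsigned i;

    assert(bits >= 1 && bits <= 32);
    for (i = 0; i < bits; i++, bit++)
        v |= (uint32_t)((a[bit / 8] >> (bit % 8)) & 1u) << i;
    return v;
}

/* Store the low `bits` bits of `v` into element `idx`, byte accesses only. */
static void packed_set(unsigned char *a, size_t idx, unsigned bits, uint32_t v)
{
    size_t bit = idx * bits;
    unsigned i;

    assert(bits >= 1 && bits <= 32);
    for (i = 0; i < bits; i++, bit++) {
        unsigned char mask = (unsigned char)(1u << (bit % 8));
        if ((v >> i) & 1u)
            a[bit / 8] |= mask;
        else
            a[bit / 8] &= (unsigned char)~mask;
    }
}

/* Bytes of storage needed for `n` elements if the accessor always
   loads and stores whole units of `unit` bytes. */
static size_t bytes_touched(size_t n, unsigned bits, size_t unit)
{
    size_t bytes = (n * bits + 7) / 8;

    assert(unit > 0);
    return (bytes + unit - 1) / unit * unit;
}
```

For two 20-bit elements, `bytes_touched(2, 20, 1)` is 5 while `bytes_touched(2, 20, 4)` is 8. A unit that allocates the array for byte access and passes it to word-sized inline code would therefore let that code touch three bytes past the end.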
The issue is not really the access size, as explained above, but the size of the entire packed array that currently follows from it (as the subject indicates :) One solution might be to decouple them, although it's of course space-wasting to round up the size of a packed array to a multiple of 4 bytes if you're only going to use byte accesses anyway.
Size and alignment. The following record:
type pr = packed record b : boolean; a : packed array [0..1] of 0..1000000 end;
naively should take 7 bytes, but with word access it will need 12 bytes. And a program which interprets the record on disk must know the alignment to find the data.
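The arithmetic can be sketched as follows. The element width is left as a parameter because the thread does not say how wide each array element is stored: the range 0..1000000 needs 20 bits, and 24 bits (whole bytes per element) is assumed here only because it reproduces both figures quoted above. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round `n` up to the next multiple of `unit` (unit > 0). */
static size_t round_up(size_t n, size_t unit)
{
    assert(unit > 0);
    return (n + unit - 1) / unit * unit;
}

/* Size in bytes of a record holding one leading 1-byte field followed
   by a packed array of `count` elements of `elem_bits` bits each, where
   the array is aligned to, and padded to a multiple of, `access` bytes.
   This is a simplified model, not GPC's or FPC's actual layout code. */
static size_t record_size(size_t count, size_t elem_bits, size_t access)
{
    size_t array_bytes  = round_up((count * elem_bits + 7) / 8, access);
    size_t array_offset = round_up(1, access);  /* skip the 1-byte field */

    return array_offset + array_bytes;
}
```

With 24-bit elements, `record_size(2, 24, 1)` gives 7 and `record_size(2, 24, 4)` gives 12. With 20-bit elements the byte-aligned size would be 6, while the word-access size stays at 12.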
I think packed records are a special case: they should always result in minimal alignment (that's the whole point of using packed records). So I would not align that array either, and FPC will also use byte-sized accesses there (not yet implemented) and an alignment of one (already the case currently).
A special case of also having the size rounded up to the next byte is technically a little more difficult (in case it would normally be a multiple of 4 bytes), but doable if considered really necessary.
Well, ATM gpc uses normal alignment for some types which we consider too troublesome to pack (for example files and schemata). We use byte alignment for packed arrays inside packed records, but to generate correct code we should use either full alignment or byte access.
I agree that special-casing packed arrays inside packed records is better; it just requires more effort to handle correctly.