Jonas Maebe wrote:
On 09 Sep 2006, at 23:25, Waldek Hebisch wrote:
Yes. However, when packing 63-bit elements on a 32-bit machine (or 127-bit elements on a 64-bit machine), using two accesses causes trouble -- the gpc implementation would need quadruple precision shifts.
I admit I did not consider that since FPC only supports subranges up to the native word size (so larger ordinal types will never be packed).
I do not know if I really want to bit-pack subranges bigger than the word size. But currently gpc packs them (the results are bogus...).
Also, I would like to have the possibility of implementing packed array access via a call to runtime support -- using different sizes requires more support code.
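To make the arithmetic behind the 63-bit case concrete, here is a small C sketch (C rather than Pascal, since the question is about the generated low-level accesses). The helper name units_spanned is hypothetical, and it assumes elements are stored back to back starting at bit 0:

```c
#include <stddef.h>

/* Number of access units of `unit_bits` bits touched by element `index`
   of a bit-packed array whose elements are `elem_bits` wide.
   Assumption: elements are stored back to back, starting at bit 0. */
static unsigned units_spanned(unsigned elem_bits, unsigned index,
                              unsigned unit_bits)
{
    unsigned long first_bit = (unsigned long)elem_bits * index;
    unsigned long last_bit  = first_bit + elem_bits - 1;
    return (unsigned)(last_bit / unit_bits - first_bit / unit_bits + 1);
}
```

Element 1 of an array of 63-bit values occupies bits 63..125, so units_spanned(63, 1, 32) is 3: two 32-bit accesses cannot cover it. With 64-bit units it spans 2, and joining two 64-bit halves on a 32-bit machine means shifting a 128-bit quantity, which is four machine words.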
It doesn't, fortunately. The way the data is stored in memory means that you can access it using any size you want and still get the same results. FPC and GPC currently use different access sizes, yet the data has the same memory layout for both compilers (on both big and little endian systems).
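A rough C sketch of why the access size need not change the result. This is only an illustration, not FPC or GPC code: the function names are made up, and the bit numbering (bit 0 is the most significant bit of the first byte) is an assumed convention. The 32-bit unit is assembled from bytes in big-endian order, which matches what a native 32-bit load returns on a big-endian machine:

```c
#include <stddef.h>
#include <stdint.h>

/* Read `width` bits (1..32) at bit offset `pos` using byte loads only.
   Assumed convention: bit 0 is the most significant bit of buf[0]. */
static uint32_t get_bits_bytewise(const uint8_t *buf, size_t pos,
                                  unsigned width)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < width; i++) {
        size_t bit = pos + i;
        unsigned b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;
        v = (v << 1) | b;
    }
    return v;
}

/* Same read through one 32-bit unit. The field must lie entirely
   inside a single aligned 32-bit unit of the buffer. */
static uint32_t get_bits_wordwise(const uint8_t *buf, size_t pos,
                                  unsigned width)
{
    size_t w = (pos / 32) * 4;   /* byte index of the containing unit */
    uint32_t word = ((uint32_t)buf[w] << 24) | ((uint32_t)buf[w + 1] << 16)
                  | ((uint32_t)buf[w + 2] << 8) | (uint32_t)buf[w + 3];
    unsigned shift = 32u - (unsigned)(pos % 32) - width;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (word >> shift) & mask;
}
```

For the bytes 12 34 56 78 (hex), reading 20 bits at offset 4 gives 0x23456 either way: the layout in memory fixes the value, and the access size only changes how many loads it takes to collect the bits.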
For gpc this is more complicated: gpc generates C-like code which then goes through the gcc optimizers. Normally the optimizers assume that memory is accessed only through pointers of one type (hence one size), so there is extra work involved to avoid mis-optimization. At the moment this problem is only theoretical for runtime calls, but gcc developers are currently working on global optimization (spanning translation units) and the problem may quickly become real.
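At the C source level the usual way to stay clear of this is to avoid casting the pointer and copy the bytes instead. The sketch below shows that idiom only; it is not how gpc handles it internally, and load_u32/store_u32 are hypothetical names:

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit unit from a byte buffer without casting the pointer.
   memcpy may copy the bytes of any object, so the optimizer cannot
   assume the byte accesses and the 32-bit value are unrelated. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store a 32-bit unit the same way. */
static void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

By contrast, writing through an unsigned char pointer and then reading the same bytes via *(uint32_t *)p on a byte array is the kind of mixed-type access that type-based alias analysis is allowed to get wrong.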
Also, using different access size risks processor stalls on write buffers -- I am not sure if we care about such stalls when optimizing for space, but I would prefer to avoid them.
I must say that my idea to use byte access is mainly motivated by simplicity of implementation. Using two accesses and a varying access size is probably better, but requires (slightly) more work. gpc had a number of bugs in its packed array implementation, so I tried to find the simplest correct way...
The issue is not really the access size as explained above, but the (currently consequent) size of the entire packed array (as the subject indicates :) One solution might be to decouple them, although it's of course space-wasting to round up the size of a packed array to a multiple of 4 bytes if you're only going to use byte accesses anyway.
Size and alignment. The following record:
type pr = packed record b : boolean; a : packed array [0..1] of 0..1000000 end;
naively should take 7 bytes, but with word access it will need 12 bytes. And a program which interprets the record on disk must know the alignment to find the data.
Since memory usage grows due to both size and alignment, I am reluctant to use a layout that allows a bigger access size without a special need.
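The 7 and 12 byte figures can be reproduced with the C sketch below. It rests on assumptions that are mine, not a description of what gpc does: each array element is rounded up to whole bytes (20 bits for 0..1000000 becomes 3 bytes), and both the array's starting offset and its size are rounded up to a multiple of the access unit. The function names are hypothetical.

```c
#include <stddef.h>

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Smallest number of bits that can hold every value in 0..max. */
static unsigned bits_for(unsigned long max)
{
    unsigned bits = 0;
    while (max) {
        bits++;
        max >>= 1;
    }
    return bits ? bits : 1;
}

/* Size in bytes of
     packed record b : boolean; a : packed array [0..count-1] of 0..max end
   when the array is accessed in units of `unit` bytes.
   Assumptions: the boolean takes 1 byte, each element is rounded up to
   whole bytes, and the array's offset and size are multiples of `unit`. */
static size_t record_size(unsigned long max, size_t count, size_t unit)
{
    size_t elem_bytes = (bits_for(max) + 7) / 8;
    size_t array_off  = round_up(1, unit);
    size_t array_size = round_up(elem_bytes * count, unit);
    return array_off + array_size;
}
```

Under these assumptions record_size(1000000, 2, 1) is 1 + 6 = 7 and record_size(1000000, 2, 4) is 4 + 8 = 12. With tight bit packing the array alone would need only 5 bytes, but the 4-byte layout still comes out at 12.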