Re: gdb, sets, big endian and 64 bit

5 Aug 2006


On 02 Aug 2006, at 22:16, Frank Heckenbach wrote:
...
Waldek Hebisch wrote:
...
There is still the issue of alignment: we need alignment to avoid
shifts when lower bounds do not match; also, set variables must be
allocated on a proper (word) boundary. Looks messy.
I agree.
Allocation on a proper word boundary can be easily done by keeping  
the format as an array of MedCard, and only typecasting to something  
else inside the set/test/remove helpers.
I think the alignment issue could in that case be handled in the same  
way as deciding the size of sets mentioned by Waldek below (a  
tradeoff of size vs efficiency).
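A minimal C sketch of the idea above, not actual FPC or GPC code: the set is declared as an array of word-sized unsigned integers, so the compiler places it on a word boundary, and only the helpers cast it to bytes. The type `medcard_t` is a hypothetical stand-in for MedCard (its 32-bit width is an assumption), and the helper names and the LSB-first bit numbering within a byte are illustrative choices.

```c
#include <stdint.h>

/* Hypothetical stand-in for MedCard; the 32-bit width is an assumption. */
typedef uint32_t medcard_t;

/* Add element `elem` to the set. Storage is word-typed (and thus
   word-aligned); the byte view exists only inside the helper. */
static void set_include(medcard_t *set, unsigned elem)
{
    unsigned char *bytes = (unsigned char *)set;
    bytes[elem / 8] |= (unsigned char)(1u << (elem % 8));
}

/* Remove element `elem` from the set. */
static void set_exclude(medcard_t *set, unsigned elem)
{
    unsigned char *bytes = (unsigned char *)set;
    bytes[elem / 8] &= (unsigned char)~(1u << (elem % 8));
}

/* Return 1 if `elem` is in the set, 0 otherwise. */
static int set_contains(const medcard_t *set, unsigned elem)
{
    const unsigned char *bytes = (const unsigned char *)set;
    return (bytes[elem / 8] >> (elem % 8)) & 1;
}
```

Because the helpers address individual bytes, the in-memory layout produced by this sketch does not depend on the byte order of the machine; callers are responsible for keeping `elem` within the allocated array.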
...
...
I would say proper big endian representation. Currently both GPC
and FPC use little endian bit order for set hunks, which conflicts
with big endian byte order.
Indeed, it's a bit strange as it is now.
I personally don't consider it strange since all CPUs I know of have
the same endianness as far as bit ordering is concerned (regardless of
their byte endianness). At least there's no architecture I know of
where the byte with value 1 is represented as $80.
...
Using proper big endian
representation we'd be independent of word size (though not
alignment). The byte order would then be the same between big and
little endian machines, though not the bit order within a byte, so
no binary compatibility (but we don't have that now, so we wouldn't
lose anything).
Bit swapping when porting from big to little endian is a lot less  
obvious than byte swapping though. We'd at least have to provide  
routines to do this (maybe GPC has them already, but FPC doesn't).
That said, there are two big arguments in favour of using that  
solution (i.e., treating sets basically as packed bit arrays on big  
endian architectures):
a) gdb has a define called BITS_BIG_ENDIAN which is set to the same
value as the byte endianness of the target architecture. If this
define is set (i.e., on big endian architectures), gdb currently
treats sets as packed bit arrays; on little endian architectures it
treats them the same way as FPC and GPC currently store their sets.
b) at least Think Pascal also uses this set format. I do not have MW  
Pascal to test against.
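To make the two bit orders concrete, here is a small C sketch. It is only an illustration under assumed names: `lsb_first_mask` models the numbering FPC and GPC are described as using above, `msb_first_mask` models the packed-bit-array numbering, and `reverse_bits8` is one possible shape for the bit-swapping routine mentioned earlier. None of these are existing FPC, GPC or gdb functions.

```c
#include <stdint.h>

/* Mask for element `elem` within its byte when bit 0 is the least
   significant bit (the order described for FPC/GPC set hunks). */
static uint8_t lsb_first_mask(unsigned elem)
{
    return (uint8_t)(1u << (elem % 8));
}

/* Mask for element `elem` within its byte when bit 0 is the most
   significant bit (packed bit array on a big endian target). */
static uint8_t msb_first_mask(unsigned elem)
{
    return (uint8_t)(0x80u >> (elem % 8));
}

/* Reverse the bit order of one byte by swapping nibbles, then bit
   pairs, then adjacent bits. Converts between the two numberings. */
static uint8_t reverse_bits8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
    return b;
}
```

With these definitions, element 0 maps to the mask $01 in the first numbering and $80 in the second, and applying `reverse_bits8` to each byte of a set converts one representation into the other.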
...
...
Using big endian bit order would
remove this discrepancy. However, I am thinking about a much simpler
scheme: trying to allocate 1, 2, 4, 8 bytes (in that order) and,
if that fails, allocating a sequence of 8 byte words (all that assuming
a 64 bit machine, with obvious changes for other word lengths). The main
reason is that with such a scheme one can perform operations on
small sets inline, using just a couple of instructions.
Seems reasonable.
To me as well overall, although I'm personally in favour of always
using the same cut-off and extension sizes regardless of the native
word size (e.g. 1, 2, 4, 8, 12, 16, ... everywhere) to keep the same
sets the same size on 32 and 64 bit archs.
Making people extend their set base types so the sets are a multiple  
of 8 bytes on both 32 and 64 bit archs seems awkward: it may mess up  
bit-packed records elsewhere as well, and for enums it may amount to  
adding a bunch of dummy enum elements (which doesn't look nice either).
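The two sizing schemes discussed above can be compared with a short C sketch. The function name `set_size_bytes` and its `step` parameter are hypothetical: `step = 8` models growing in 8 byte words on a 64 bit machine, while `step = 4` models the uniform 1, 2, 4, 8, 12, 16, ... progression. It assumes `nelems` counts the elements of the stored range.

```c
#include <stddef.h>

/* Size in bytes of a set holding `nelems` elements. Small sets take
   1, 2, 4 or 8 bytes; larger ones grow in multiples of `step` bytes. */
static size_t set_size_bytes(size_t nelems, size_t step)
{
    size_t bytes = (nelems + 7) / 8;   /* bits rounded up to whole bytes */

    if (bytes <= 1) return 1;
    if (bytes <= 2) return 2;
    if (bytes <= 4) return 4;
    if (bytes <= 8) return 8;
    return (bytes + step - 1) / step * step;   /* round up to `step` */
}
```

For example, a set of 65 elements needs 9 bytes of bits: with `step = 4` it occupies 12 bytes, and with `step = 8` it occupies 16 bytes, which is exactly the kind of size difference between 32 and 64 bit targets that a fixed progression would avoid.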
...
If we can get the extensions working through the backend and gdb,
and perhaps even "officially" approved, I'd certainly prefer this.
Otherwise, the "alignment detection", though it seems a bit
backward, looks best to me.
I agree. Concerning M2: it uses a hack for (some?) larger sets
(m2-valprint.c):
    case TYPE_CODE_STRUCT:
      if (m2_is_long_set (type))
        m2_print_long_set (type, valaddr, embedded_offset, address,
                           stream, format, pretty);
      else
        cp_print_value_fields (type, type, valaddr, embedded_offset,
                               address, stream, format,
                               recurse, pretty, NULL, 0);
m2_is_long_set checks if all the fields of the record are consecutive  
sets (i.e. sets of consecutive range types). I don't really  
understand why this is useful though, nor do I see at first sight  
what m2_print_long_set does so differently compared to the M2 print  
code for TYPE_CODE_SET.
But for some reason it gave me an easy idea to solve the gdb  
alignment problem we have: even if it's a set of 48..50 (which we  
will store as a set of 0..63), put in the debug info that it's a set  
of 0..50 and gdb will print everything correctly.
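A rough C sketch of that bounds adjustment follows. The rounding rule is an assumption inferred from the single example above (48..50 stored as 0..63 with 64 bit words), and `set_range`, `storage_range` and `debug_range` are hypothetical names, not compiler or gdb functions. The idea, presumably, is that the debugger locates an element's bit relative to the lower bound of the set's range type, so the lower bound in the debug info must match the stored one, while the upper bound can stay at the declared value.

```c
struct set_range {
    long low;
    long high;
};

/* Range actually stored: both bounds widened to whole words of
   `word_bits` bits. Assumes 0 <= low <= high. */
static struct set_range storage_range(long low, long high, long word_bits)
{
    struct set_range r;
    r.low = (low / word_bits) * word_bits;
    r.high = (high / word_bits + 1) * word_bits - 1;
    return r;
}

/* Range written to the debug info: the stored lower bound, so bit
   offsets match, combined with the declared upper bound. */
static struct set_range debug_range(long low, long high, long word_bits)
{
    struct set_range r = storage_range(low, high, word_bits);
    r.high = high;
    return r;
}
```

For a declared set of 48..50 and 64 bit words, `storage_range` yields 0..63 and `debug_range` yields 0..50, matching the example above.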
Jonas
