On 1 aug 2006, at 23:53, Waldek Hebisch wrote:
I do not know what FPC is doing, but GPC also "aligns" sets, so for example `set of 11..37' is stored as a subset of `set of 0..37'. gdb has no idea of set alignment, so such sets are printed incorrectly even on little endian machines.
FPC does the same. I never noticed it because I rarely if ever use such non-zero based subrange types.
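The mismatch can be sketched in C. This is a hypothetical simplification (the function names and the word array are illustrative, not actual compiler or gdb code): with alignment, element e of `set of 11..37' occupies bit e of the storage, while a debugger that is unaware of the alignment assumes bit 0 corresponds to the low bound 11.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of an "aligned" set of 11..37, stored as if it were a
   set of 0..37: element e occupies bit e of the (little endian)
   word array, not bit e - 11. */
static int aligned_member(const uint32_t *words, unsigned e)
{
    return (words[e / 32] >> (e % 32)) & 1;
}

/* What a debugger unaware of the alignment computes: it assumes
   bit 0 corresponds to the low bound of the member type. */
static int naive_member(const uint32_t *words, unsigned low, unsigned e)
{
    unsigned bit = e - low;
    return (words[bit / 32] >> (bit % 32)) & 1;
}
```

After inserting element 11, `aligned_member' finds it at bit 11, while `naive_member' looks at bit 0 and reports it as absent; this is the kind of wrong output gdb produces for non-zero-based sets.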
The main reasons for FPC to always use 32 bits are
a) binary compatibility b) performance (even on 64 bit machines, loading/storing a 32 bit value from/to memory is often faster)
I think that there is a subtle interaction between space usage and instruction count. For a data-heavy program, storing sets as byte sequences will minimize space usage, which may pay off by reducing cache misses. OTOH, 64 bit machines tend to have 64 bit buses, so when working within the cache the machine should handle a 64 bit chunk as fast as a 32 bit chunk.
If the set is aligned to 64 bit, and once the set is in the cache, possibly. At least the G5 has only two 32 bit buses from the CPU to memory (one for loads and one for stores).
Anyway, you could still do the adding/subtracting/... of sets by typecasting the set to a MedCard array of the appropriate size. This issue only arises with inserting/removing/testing elements (and in that case you're doing more or less random access, so there is less chance that the chunk you need is already in the cache).
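The typecast trick can be sketched as follows (a minimal C sketch; `SET_WORDS' and the function names are assumptions, not FPC/GPC runtime code). The point is that union and difference work word-wise and do not depend on the bit order within a word, so they are unaffected by the endianness question; only per-element operations are.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Treat the set as an array of machine words ("MedCard" in GPC
   terms) and perform set operations word-wise.  These loops are
   independent of the bit order inside each word. */
enum { SET_WORDS = 4 };   /* e.g. a 256-element set with 64 bit words */

static void set_union(uint64_t *dst, const uint64_t *a, const uint64_t *b)
{
    for (size_t i = 0; i < SET_WORDS; i++)
        dst[i] = a[i] | b[i];        /* a + b */
}

static void set_difference(uint64_t *dst, const uint64_t *a, const uint64_t *b)
{
    for (size_t i = 0; i < SET_WORDS; i++)
        dst[i] = a[i] & ~b[i];       /* a - b */
}
```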
In principle we can change the set representation. In fact, there is one change that I intend to make in the future: currently even the smallest sets use a MedCard sized word; I plan to allocate smaller space for them (the smallest unit which can represent them).
There have been plans for a long time to do that in FPC as well. If you do this as meticulously as Delphi does (its set size grows 1 byte at a time), then you more or less need by definition a little-endian representation of sets in all cases though (because of the left-over bytes at the end).
For this reason, a number of FPC developers are in favour of using "little endian" sets on all platforms. I'm a bit wary of breaking backwards binary compatibility though, and possibly also compatibility with other big endian Pascal compilers (does anyone know how CodeWarrior/PPC and/or Think Pascal store their sets?)
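To illustrate why byte-granular sets imply a "little endian" layout, here is a minimal C sketch (assuming a Delphi-style layout; the helper names are mine): element e always lives in byte e/8, bit e%8, independent of word size. On a big endian machine that loads those bytes as 32 bit words, byte 0 lands in the most significant byte, so the in-word bit index for the same element changes.

```c
#include <assert.h>
#include <stdint.h>

/* "Little endian" byte-granular set: element e lives in byte e/8,
   bit e%8, regardless of the machine word size. */
static int member_bytewise(const uint8_t *s, unsigned e)
{
    return (s[e / 8] >> (e % 8)) & 1;
}

/* The same element seen through a big-endian 32 bit word load:
   byte e/8 sits (3 - (e/8) mod 4) bytes above the low end of the word. */
static int member_be_word(const uint32_t *w, unsigned e)
{
    unsigned bit = (3 - (e / 8) % 4) * 8 + (e % 8);
    return (w[e / 32] >> bit) & 1;
}
```

Both functions find the same element, but note that the in-word bit position differs (element 9 is bit 9 of a little endian word load, bit 17 of a big endian one), which is exactly what gdb has to know about to print the set correctly.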
OTOH I am not convinced that always using 32 bit chunks for sets is the best choice. GPC on 64 bit machines uses 64 bit types in so many places that I doubt the usefulness of having the same set representation
Possibly.
(and still big endian machines would use different representation than little endian).
Big and little endian machines using different representations is logical, and people porting from little to big endian (and vice versa) are used to adding byte swapping code all over the place.
If someone sees a way to automatically detect the used set format inside gdb itself, that would be great as well of course.
What do you think?
My first thought was to change GPC representation to match gdb, but while we can rather easily (and with minor performance impact) change bit order in sets, the alignment problem remains...
Well, those are two different issues, I think, which can be solved separately.
Technically, in the compiler proper the change would just be setting a few parameters to different values. The main change would be to the runtime support. Here we depend very much on set alignment (of course removing the alignment is doable, but we would get both lower performing and more complicated code).
Indeed.
I am afraid that the set representation affects not only FPC and GPC. There is also GNU Modula-2 (it would be nice to be able to have calling convention compatibility with Modula-2). AFAIK Modula-2 currently uses the default gcc representation (which is the same as gdb's).
Marco also already suggested to involve a Modula-2 developer.
In principle gdb could try to detect the compiler: gpc uses some pretty characteristic symbols (like `_p_GPC_RTS_VERSION_20060215') and I suspect that FPC is doing something similar.
There are indeed a lot of FPC_* symbols, but I don't really like such a solution, and I'm not sure whether the gdb people would like it either.
PS: I've attached a patch to gdb which fixes the problem for big endian FPC and 32 bit GPC apps. I think it may still have to be changed to not treat bitstrings differently from sets, because the stabs docs (http://www.cygwin.com/stabs.html) state:
Note: my patch is not really correct. The reason is that gdb's bit testing routine checks whether the bit to be tested lies between the high and low bounds of the set's member type. So unless you have a set of a type whose number of elements is a multiple of 32, you get a lot of "<error type>" errors from gdb even if the set is zero-based (since my patch takes the 32-complement of the bit to be tested).
I had only quickly tested it with a "set of byte", for which it obviously worked properly.
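The failure mode can be sketched like this (a hypothetical simplification of gdb's bound check, not its actual code): taking the 32-complement of the bit index can push it past the member type's high bound whenever the type does not span a multiple of 32 elements, so a `set of byte' (0..255) passes while a small zero-based set fails.

```c
#include <assert.h>

/* Simplified model: gdb rejects a bit index outside [low, high].
   The patch flips the bit index within its 32 bit word; for a
   member type with high < 31 even bit 0 flips to 31 and is
   rejected, producing "<error type>". */
static int patched_test_ok(unsigned low, unsigned high, unsigned bit)
{
    unsigned flipped = (bit & ~31u) + 31u - (bit & 31u);
    return flipped >= low && flipped <= high;   /* 0 => "<error type>" */
}
```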
Jonas