On 1 aug 2006, at 23:53, Waldek Hebisch wrote:
I do not know what FPC is doing, but GPC also "aligns" sets, so for example `set of 11..37' is stored as a subset of `set of 0..37'. gdb has no idea of set alignment, so such sets are printed incorrectly even on little endian machines.
FPC does the same. I never noticed it because I rarely if ever use such non-zero based subrange types.
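The mismatch can be sketched in C. This is a hypothetical simplification (the function names and the word array are illustrative, not actual compiler or gdb code): with alignment, element e of `set of 11..37' occupies bit e of the storage, while a debugger that is unaware of the alignment assumes bit 0 corresponds to the low bound 11.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of an "aligned" set of 11..37, stored as if it were a
   set of 0..37: element e occupies bit e of the (little endian)
   word array, not bit e - 11. */
static int aligned_member(const uint32_t *words, unsigned e)
{
    return (words[e / 32] >> (e % 32)) & 1;
}

/* What a debugger unaware of the alignment computes: it assumes
   bit 0 corresponds to the low bound of the member type. */
static int naive_member(const uint32_t *words, unsigned low, unsigned e)
{
    unsigned bit = e - low;
    return (words[bit / 32] >> (bit % 32)) & 1;
}
```

After inserting element 11, `aligned_member' finds it at bit 11, while `naive_member' looks at bit 0 and reports it as absent; this is the kind of wrong output gdb produces for non-zero-based sets.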
The main reasons for FPC to always use 32 bits are
a) binary compatibility b) performance (even on 64 bit machines, loading/storing a 32 bit value from/to memory is often faster)
I think that there is a subtle interaction between space usage and instruction count. For a data-heavy program, storing sets as byte sequences will minimize space usage, which may pay off by reducing cache misses. OTOH, 64 bit machines tend to have 64 bit buses, so when working within the cache the machine should handle a 64 bit chunk as fast as a 32 bit chunk.
If the set is aligned to 64 bit, and once the set is in the cache, possibly. At least the G5 has only two 32 bit buses from the CPU to memory (one for loads and one for stores).
Anyway, you could still do the adding/subtracting/... of sets by typecasting the set to a MedCard array of the appropriate size. This issue only arises with inserting/removing/testing elements (and in that case you're doing more or less random access, so there is less chance that the chunk you need is already in the cache).
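The typecast trick can be sketched as follows (a minimal C sketch; `SET_WORDS' and the function names are assumptions, not FPC/GPC runtime code). The point is that union and difference work word-wise and do not depend on the bit order within a word, so they are unaffected by the endianness question; only per-element operations are.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Treat the set as an array of machine words ("MedCard" in GPC
   terms) and perform set operations word-wise.  These loops are
   independent of the bit order inside each word. */
enum { SET_WORDS = 4 };   /* e.g. a 256-element set with 64 bit words */

static void set_union(uint64_t *dst, const uint64_t *a, const uint64_t *b)
{
    for (size_t i = 0; i < SET_WORDS; i++)
        dst[i] = a[i] | b[i];        /* a + b */
}

static void set_difference(uint64_t *dst, const uint64_t *a, const uint64_t *b)
{
    for (size_t i = 0; i < SET_WORDS; i++)
        dst[i] = a[i] & ~b[i];       /* a - b */
}
```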
In principle we can change the set representation. In fact, there is one change that I intend to make in the future: currently even the smallest sets use a MedCard sized word; I plan to allocate smaller space for them (the smallest unit which can represent them).
There have been plans for a long time to do that in FPC as well. If you do this as meticulously as Delphi does (its set size grows 1 byte at a time), then you more or less need by definition a little-endian representation of sets in all cases though (because of the left-over bytes at the end).
For this reason, a number of FPC developers are in favour of using "little endian" sets on all platforms. I'm a bit wary of breaking backwards binary compatibility though, and possibly also compatibility with other big endian Pascal compilers (does anyone know how CodeWarrior/PPC and/or Think Pascal store their sets?)
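To illustrate why byte-granular sets imply a "little endian" layout, here is a minimal C sketch (assuming a Delphi-style layout; the helper names are mine): element e always lives in byte e/8, bit e%8, independent of word size. On a big endian machine that loads those bytes as 32 bit words, byte 0 lands in the most significant byte, so the in-word bit index for the same element changes.

```c
#include <assert.h>
#include <stdint.h>

/* "Little endian" byte-granular set: element e lives in byte e/8,
   bit e%8, regardless of the machine word size. */
static int member_bytewise(const uint8_t *s, unsigned e)
{
    return (s[e / 8] >> (e % 8)) & 1;
}

/* The same element seen through a big-endian 32 bit word load:
   byte e/8 sits (3 - (e/8) mod 4) bytes above the low end of the word. */
static int member_be_word(const uint32_t *w, unsigned e)
{
    unsigned bit = (3 - (e / 8) % 4) * 8 + (e % 8);
    return (w[e / 32] >> bit) & 1;
}
```

Both functions find the same element, but note that the in-word bit position differs (element 9 is bit 9 of a little endian word load, bit 17 of a big endian one), which is exactly what gdb has to know about to print the set correctly.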
OTOH I am not convinced that always using 32 bit chunks for sets is the best choice. GPC on 64 bit machines uses 64 bit types in so many places that I doubt the usefulness of having the same set representation
Possibly.
(and still big endian machines would use different representation than little endian).
Big and little endian machines using different representations is logical, and people porting from little to big endian (and vice versa) are used to adding byte swapping code all over the place.
If someone sees a way to automatically detect the used set format inside gdb itself, that would be great as well of course.
What do you think?
My first thought was to change GPC representation to match gdb, but while we can rather easily (and with minor performance impact) change bit order in sets, the alignment problem remains...
Well, those are two different issues, I think, which can be solved separately.
Technically, in the compiler proper the change would just be setting a few parameters to different values. The main change would be to the runtime support. Here we depend very much on set alignment (of course removing the alignment is doable, but we would get both lower performing and more complicated code).
Indeed.
I am afraid that the set representation affects not only FPC and GPC. There is also GNU Modula-2 (it would be nice to be able to have calling convention compatibility with Modula-2). AFAIK Modula-2 currently uses the default gcc representation (which is the same as gdb's).
Marco also already suggested to involve a Modula-2 developer.
In principle gdb could try to detect the compiler: gpc uses some pretty characteristic symbols (like `_p_GPC_RTS_VERSION_20060215') and I suspect that FPC is doing something similar.
There are indeed a lot of FPC_* symbols, but I don't really like such a solution, and I'm not sure whether the gdb people would like it either.
PS: I've attached a patch to gdb which fixes the problem for big endian FPC and 32 bit GPC apps. I think it may still have to be changed to not treat bitstrings differently from sets, because the stabs docs (http://www.cygwin.com/stabs.html) state:
Note: my patch is not really correct. The reason is that gdb's bit testing routine checks whether the bit to be tested lies between the high and low bounds of the set's member type. So unless you have a set of a type whose number of elements is a multiple of 32, you get a lot of "<error type>" errors from gdb even if the set is zero-based (since my patch takes the 32-complement of the bit to be tested).
I had only quickly tested it with a "set of byte", for which it obviously worked properly.
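The failure mode can be sketched like this (a hypothetical simplification of gdb's bound check, not its actual code): taking the 32-complement of the bit index can push it past the member type's high bound whenever the type does not span a multiple of 32 elements, so a `set of byte' (0..255) passes while a small zero-based set fails.

```c
#include <assert.h>

/* Simplified model: gdb rejects a bit index outside [low, high].
   The patch flips the bit index within its 32 bit word; for a
   member type with high < 31 even bit 0 flips to 31 and is
   rejected, producing "<error type>". */
static int patched_test_ok(unsigned low, unsigned high, unsigned bit)
{
    unsigned flipped = (bit & ~31u) + 31u - (bit & 31u);
    return flipped >= low && flipped <= high;   /* 0 => "<error type>" */
}
```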
Jonas