Hi: Took the Chief's sample, added one line, calling file obj1.pas; then: gpc -o obj obj1.pas
program obj1;
type a = object procedure p; end;
b = object (a) procedure p; virtual; end;
procedure a.p; begin writeln ('a'); end;
procedure b.p; begin writeln ('b'); end;
var foo : a; bar : b;
begin foo.p; bar.p; end.
Got this: obj1.pas:10: internal error: Segmentation fault Please submit a full bug report, with preprocessed source if appropriate. See URL:http://www.gnu-pascal.de/todo.html for instructions.
Using this: Reading specs from /usr/local/lib/gcc-lib/i586-pc-linux-gnu/3.2.2/specs Configured with: ../gcc-3.2.2/configure --enable-languages=pascal : (reconfigured) ../gcc-3.2.2/configure --enable-languages=pascal : (reconfigured) ../gcc-3.2.2/configure --enable-languages=pascal Thread model: posix gpc version 20030209, based on gcc-3.2.2
Russ
On 6 Jun 2003 at 20:31, Russell Whitaker wrote:
Hi: Took the Chief's sample, added one line, calling file obj1.pas; then: gpc -o obj obj1.pas
program obj1;
type a = object procedure p; end;
b = object (a) procedure p; virtual; end;
procedure a.p; begin writeln ('a'); end;
procedure b.p; begin writeln ('b'); end;
var foo : a; bar : b;
begin foo.p; bar.p; end.
Got this: obj1.pas:10: internal error: Segmentation fault Please submit a full bug report, with preprocessed source if appropriate. See URL:http://www.gnu-pascal.de/todo.html for instructions.
[...]
I got: "a" "b"
Using: gpc version 20030507, based on gcc-3.2.3 (mingw special 20030504-1)
Best regards, The Chief -------- Prof. Abimbola A. Olowofoyeku (The African Chief) web: http://www.bigfoot.com/~african_chief/
Prof A Olowofoyeku (The African Chief) wrote:
On 6 Jun 2003 at 20:31, Russell Whitaker wrote:
Hi: Took the Chief's sample, added one line, calling file obj1.pas; then: gpc -o obj obj1.pas
Got this: obj1.pas:10: internal error: Segmentation fault Please submit a full bug report, with preprocessed source if appropriate. See URL:http://www.gnu-pascal.de/todo.html for instructions.
Using this: Reading specs from /usr/local/lib/gcc-lib/i586-pc-linux-gnu/3.2.2/specs Configured with: ../gcc-3.2.2/configure --enable-languages=pascal : (reconfigured) ../gcc-3.2.2/configure --enable-languages=pascal : (reconfigured) ../gcc-3.2.2/configure --enable-languages=pascal Thread model: posix gpc version 20030209, based on gcc-3.2.2
[...]
I got: "a" "b"
Using: gpc version 20030507, based on gcc-3.2.3 (mingw special 20030504-
I don't have 20030209 handy now. I suppose this is a bug that was fixed meanwhile. If you want to be sure, Russ, you might want to provide a stack trace (`i s' in gdb).
Frank
On Sat, 7 Jun 2003, Frank Heckenbach wrote:
Prof A Olowofoyeku (The African Chief) wrote:
I got: "a" "b"
Upgraded to gpc-200305 with gcc-3.2.2; now I get the same results.
Incidentally, a small change for line 42 of p/config-lang.in, suggested by someone else a while back, was overlooked in this release. The change is needed for gpc to build with gcc-3.2.3 and since the 3.2.x branch is now closed, suggest:
from (this is actually one line): if echo $version | grep '3.2.[3-9]' > /dev/null || echo $version | grep '3.[3-9]' > /dev/null; then
to: if echo $version | grep '3.[3-9]' > /dev/null; then
Russ
Russell Whitaker wrote:
Incidentally, a small change for line 42 of p/config-lang.in, suggested by someone else a while back, was overlooked in this release. The change is needed for gpc to build with gcc-3.2.3 and since the 3.2.x branch is now closed, suggest:
from (this is actually one line): if echo $version | grep '3.2.[3-9]' > /dev/null || echo $version | grep '3.[3-9]' > /dev/null; then
to: if echo $version | grep '3.[3-9]' > /dev/null; then
OK. Will do so in the next release.
Frank
On Sun, 8 Jun 2003, Frank Heckenbach wrote:
Russell Whitaker wrote:
Incidentally, a small change for line 42 of p/config-lang.in, suggested by someone else a while back, was overlooked in this release. The change is needed for gpc to build with gcc-3.2.3 and since the 3.2.x branch is now closed, suggest:
from (this is actually one line): if echo $version | grep '3.2.[3-9]' > /dev/null || echo $version | grep '3.[3-9]' > /dev/null; then
to: if echo $version | grep '3.[3-9]' > /dev/null; then
OK. Will do so in the next release.
Starting with gpc-20030507.tar.gz gcc-3.2.2.tar.gz gcc-3.2.3.tar.gz
Building gpc with gcc-3.2.2 works just fine. Starting with another copy of gpc*gz and gcc-3.2.3 the above patch gets me past the first problem. However, the build stops with:
make[1]: Circular libgcc.a <- pascal dependency dropped.
/usr/local/src/gpc-20030507/gcc-3.2.3/gcc/p/rts/numtodec.pas:273: Internal compiler error in dwarf2out_finish, at dwarf2out.c:12228
Please submit a full bug report, with preprocessed source if appropriate.
See URL:http://www.gnu-pascal.de/todo.html for instructions.
make[2]: *** [numtodec.o] Error 1
make[1]: *** [pascal.rts] Error 2
make: *** [all-gcc] Error 2
If you have a patch will try it.
Russ
Russell Whitaker wrote:
On Sun, 8 Jun 2003, Frank Heckenbach wrote:
Russell Whitaker wrote:
Incidentally, a small change for line 42 of p/config-lang.in, suggested by someone else a while back, was overlooked in this release. The change is needed for gpc to build with gcc-3.2.3 and since the 3.2.x branch is now closed, suggest:
from (this is actually one line): if echo $version | grep '3.2.[3-9]' > /dev/null || echo $version | grep '3.[3-9]' > /dev/null; then
to: if echo $version | grep '3.[3-9]' > /dev/null; then
OK. Will do so in the next release.
Starting with gpc-20030507.tar.gz gcc-3.2.2.tar.gz gcc-3.2.3.tar.gz
Building gpc with gcc-3.2.2 works just fine. Starting with another copy of gpc*gz and gcc-3.2.3 the above patch gets me past the first problem. However, the build stops with:
make[1]: Circular libgcc.a <- pascal dependency dropped. /usr/local/src/gpc-20030507/gcc-3.2.3/gcc/p/rts/numtodec.pas:273: Internal compiler error in dwarf2out_finish, at dwarf2out.c:12228 Please submit a full bug report, with preprocessed source if appropriate. See URL:http://www.gnu-pascal.de/todo.html for instructions. make[2]: *** [numtodec.o] Error 1 make[1]: *** [pascal.rts] Error 2 make: *** [all-gcc] Error 2
Why do you expect an older GPC version to work with a new GCC version? The warning in config-lang.in is there for a reason.
Frank
On Sun, 15 Jun 2003, Frank Heckenbach wrote:
Russell Whitaker wrote:
On Sun, 8 Jun 2003, Frank Heckenbach wrote:
Russell Whitaker wrote:
Incidentally, a small change for line 42 of p/config-lang.in, suggested by someone else a while back, was overlooked in this release. The change is needed for gpc to build with gcc-3.2.3 and since the 3.2.x branch is now closed, suggest:
from (this is actually one line): if echo $version | grep '3.2.[3-9]' > /dev/null || echo $version | grep '3.[3-9]' > /dev/null; then
to: if echo $version | grep '3.[3-9]' > /dev/null; then
OK. Will do so in the next release.
Starting with gpc-20030507.tar.gz gcc-3.2.2.tar.gz gcc-3.2.3.tar.gz
Building gpc with gcc-3.2.2 works just fine. Starting with another copy of gpc*gz and gcc-3.2.3 the above patch gets me past the first problem. However, the build stops with:
make[1]: Circular libgcc.a <- pascal dependency dropped. /usr/local/src/gpc-20030507/gcc-3.2.3/gcc/p/rts/numtodec.pas:273: Internal compiler error in dwarf2out_finish, at dwarf2out.c:12228 Please submit a full bug report, with preprocessed source if appropriate. See URL:http://www.gnu-pascal.de/todo.html for instructions. make[2]: *** [numtodec.o] Error 1 make[1]: *** [pascal.rts] Error 2 make: *** [all-gcc] Error 2
Why do you expect an older GPC version to work with a new GCC version? The warning in config-lang.in is there for a reason.
Because gpc-20030507.tar.gz is the latest version I can find in directory www.gnu-pascal.de/alpha
Russ
Russell Whitaker wrote:
Starting with gpc-20030507.tar.gz gcc-3.2.2.tar.gz gcc-3.2.3.tar.gz
Building gpc with gcc-3.2.2 works just fine. Starting with another copy of gpc*gz and gcc-3.2.3 the above patch gets me past the first problem. However, the build stops with:
make[1]: Circular libgcc.a <- pascal dependency dropped. /usr/local/src/gpc-20030507/gcc-3.2.3/gcc/p/rts/numtodec.pas:273: Internal compiler error in dwarf2out_finish, at dwarf2out.c:12228 Please submit a full bug report, with preprocessed source if appropriate. See URL:http://www.gnu-pascal.de/todo.html for instructions. make[2]: *** [numtodec.o] Error 1 make[1]: *** [pascal.rts] Error 2 make: *** [all-gcc] Error 2
Why do you expect an older GPC version to work with a new GCC version? The warning in config-lang.in is there for a reason.
Because gpc-20030507.tar.gz is the latest version I can find in directory www.gnu-pascal.de/alpha
Oh, you mean also gpc-20030507. When I read "another copy of gpc*gz", I thought you meant another (and then, of course, earlier) GPC release (because of the `*').
(BTW, GPC now allows using the same source tree for building with different GCC versions if you symlink the `p' directory into each `gcc' directory.)
Frank
Hi all,
In the Mac OS, we have a concept of a FourCharCode, which is essentially a UInt32 except that it often has values that are made up of four characters. In Mac OS compilers then, the UNSIGNEDLONG (Cardinal attribute( size = 32 )) is compatible as a special case with constant strings of four characters.
In GPC, we've implemented FourCharCode = packed array[0..3] of char which works for most things, but it is not possible to do a case statement of this type, nor is it possible to cast between UInt32 and FourCharCode (which would provide a workaround, as the case statement could cast the value and constants to UInt32; GPC does not allow casting in the case constants part, but even that can be worked around).
So basically, I'm looking for recommendations as to the best way to implement compatibility with this inside GPC, and/or the best extensions to GPC to allow compatibility with this sort of code.
A typical example might be:
const
  typeBoolean = 'bool';
  typeChar = 'TEXT';
  typeSInt16 = 'shor';
  typeSInt32 = 'long';
  typeUInt32 = 'magn';
  etc.
So it ends up being used as an extensible enumerated type if you like. Anyway, this concept is used all over the Mac OS API, so it is important to have reasonable compatibility for writing Mac OS applications.
Ok, so can anyone think of the best way to get this functionality?
Changing FourCharCode = UInt32 is probably better since the concept behind it is a fixed-size, low-cost value, and it is essentially never used as a string as such. The problem then comes from the fact that you can't cast the constants above to UInt32's, eg:
typeBoolean = UInt32('bool');
this doesn't work.
One possibility might be to extend BP's # character hack to work with quoted strings in reverse, so #'bool' would return an unsigned integer constant equal to ((ord('b')*256+ord('o'))*256+ord('o'))*256+ord('l').
Another might be to extend ord to allow ord('bool'), or perhaps ord4('bool').
I can hack the universal interfaces to convert to a UInt32 manually:
const typeBoolean = $626F6F6C;
but that loses a lot in the translation and puts a large onus of work on the end users for all the constants like this that they create.
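As a quick sanity check on the hand conversion, the multiply-and-add formula above can be evaluated from single-character literals, which Ord accepts. This is only a sketch: the constant name BoolCode is made up for illustration, and it assumes the compiler folds Ord of a character literal in a constant expression.

```pascal
program FourCharConstCheck;

const
  { Hypothetical name; the value follows the multiply-and-add formula }
  BoolCode = ((Ord ('b') * 256 + Ord ('o')) * 256 + Ord ('o')) * 256 + Ord ('l');

begin
  { 'b' = $62, 'o' = $6F, 'l' = $6C, so the packed value is $626F6F6C }
  if BoolCode = $626F6F6C then
    WriteLn ('bool packs to $626F6F6C')
  else
    WriteLn ('mismatch')
end.
```

Spelling out each character this way is correct but verbose, which is why a shorter notation for the whole four-character string is being asked for.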
I'm certainly open to hear of any existing GPC solutions that we could use instead.
Any ideas or comments?
Thanks, Peter.
Peter N Lewis wrote:
In the Mac OS, we have a concept of a FourCharCode, which is essentially a UInt32 except that it often has values that are made up of four characters. In Mac OS compilers then, the UNSIGNEDLONG (Cardinal attribute( size = 32 )) is compatible as a special case with constant strings of four characters.
[...]
Changing FourCharCode = UInt32 is probably better since the concept behind it is a fixed-size, low-cost value, and it is essentially never used as a string as such.
I agree. It's probably the only way it can be used in `case' as well.
The problem then comes from the fact that you can't cast the constants above to UInt32's, eg:
typeBoolean = UInt32('bool');
this doesn't work.
One possibility might be to extend BP's # character hack to work with quoted strings in reverse, so #'bool' would return an unsigned integer constant equal to ((ord('b')*256+ord('o'))*256+ord('o'))*256+ord('l').
[...]
I'm certainly open to hear of any existing GPC solutions that we could use instead.
I thought of a macro encapsulating the code above.
{$define UInt32(s) ((((Ord (s[1]) * 256) + Ord (s[2])) * 256 + Ord (s[3])) * 256 + Ord (s[4]))}
Then you can do:
const c = 'abcd';
WriteLn (UInt32 (c));
Unfortunately, this doesn't work:
WriteLn (UInt32 ('abcd'));
because after macro expansion it will contain things like 'abcd'[1] which are syntactically invalid (in any Pascal dialect AFAIK).
I'm not sure yet how difficult or desirable it would be to allow this syntax. But perhaps someone has an idea for a less intrusive way ...
Frank
Frank Heckenbach wrote:
Peter N Lewis wrote:
In the Mac OS, we have a concept of a FourCharCode, which is essentially a UInt32 except that it often has values that are made up of four characters. In Mac OS compilers then, the UNSIGNEDLONG (Cardinal attribute( size = 32 )) is compatible as a special case with constant strings of four characters.
[...]
Changing FourCharCode = UInt32 is probably better since the concept behind it is a fixed-size, low-cost value, and it is essentially never used as a string as such.
I agree. It's probably the only way it can be used in `case' as well.
The problem then comes from the fact that you can't cast the constants above to UInt32's, eg:
typeBoolean = UInt32('bool');
this doesn't work.
One possibility might be to extend BP's # character hack to work with quoted strings in reverse, so #'bool' would return an unsigned integer constant equal to ((ord('b')*256+ord('o'))*256+ord('o'))*256+ord('l').
[...]
I'm certainly open to hear of any existing GPC solutions that we could use instead.
I thought of a macro encapsulating the code above.
{$define UInt32(s) ((((Ord (s[1]) * 256) + Ord (s[2])) * 256 + Ord (s[3])) * 256 + Ord (s[4]))}
I'm not sure if GPC supports it but Extended Pascal provides for using the substr required function in constant-expressions as long as none of the parameters are variable-access. In theory, a macro such as:
{$define UInt32(s) ((Ord (substr((s),1,1)) shl 24) or (Ord (substr((s),2,1)) shl 16) or (Ord (substr((s),3,1)) shl 8) or Ord (substr((s),4,1)))}
should work for s as a constant-string-literal or as a constant-identifier declared as a string-literal (presuming, of course, the string has at least four characters).
However, when I tried the basic concept in a test program, I got an "internal compiler error: Bus error" result.
The test program was:
program FourCharCodeTest (input, output);
const ch1value = ord(substr('bool',1,1));
begin writeln('The value of ''b'' is: ', ch1value:1); end.
The command line used and the resulting output was:
[Gale-Paepers-Computer:~/programming/FourCharCodeTest] galepaep% gpc --automake -v -save-temps FourCharCodeTest.pas -o FourCharCodeTest
Reading specs from /Developer/Pascal/gpc33d6/lib/gcc-lib/powerpc-apple-darwin/3.3/specs
Configured with: ../gpc-3.3d6/configure --enable-languages=pascal,c --enable-threads=posix --prefix=/Developer/Pascal/gpc33d6 --target=powerpc-apple-darwin
Thread model: posix
gpc version 20030507, based on gcc-3.3
/Developer/Pascal/gpc33d6/lib/gcc-lib/powerpc-apple-darwin/3.3/gpcpp -D__BITS_BIG_ENDIAN__=1 -D__BYTES_BIG_ENDIAN__=1 -D__WORDS_BIG_ENDIAN__=1 -D__NEED_NO_ALIGNMENT__=1 -quiet -v -iprefix /usr/bin/../lib/gcc-lib/powerpc-apple-darwin/3.3/ -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=0 -D__DYNAMIC__ FourCharCodeTest.pas -fautomake -famtmpfile=/var/tmp//cc51gu9u.gpa FourCharCodeTest.i
GNU Pascal Compiler PreProcessor version 20030507, based on gcc-3.3 (Darwin/PowerPC)
/Developer/Pascal/gpc33d6/lib/gcc-lib/powerpc-apple-darwin/3.3/gpc1 FourCharCodeTest.i -fPIC -quiet -dumpbase FourCharCodeTest.pas -auxbase FourCharCodeTest -version -fautomake -famtmpfile=/var/tmp//cc51gu9u.gpa -o FourCharCodeTest.s
GNU Pascal version is actually 20030507, based on gcc-3.3
GNU Pascal version 3.3 (powerpc-apple-darwin) compiled by GNU C version 3.3.
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
FourCharCodeTest.pas:4: internal compiler error: Bus error
Please submit a full bug report, with preprocessed source if appropriate.
See URL:http://gcc.gnu.org/bugs.html for instructions.
Then you can do:
const c = 'abcd';
WriteLn (UInt32 (c));
Unfortunately, this doesn't work:
WriteLn (UInt32 ('abcd'));
because after macro expansion it will contain things like 'abcd'[1] which are syntactically invalid (in any Pascal dialect AFAIK).
If it actually worked, the substr function would work in both cases since the string parameter can be any string-type (or char-type) expression - e.g., a constant string-literal, a constant-identifier of string-type, etc.
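For comparison, the same shift-and-or packing can be done at run time in a function, which avoids the macro-expansion problem but gives a value that cannot be used in a const declaration or as a case label. A minimal sketch follows; the function name PackFourChars is hypothetical, and returning 0 for short strings is an arbitrary choice made for the example.

```pascal
program PackWithShifts;

{ Hypothetical helper, not part of GPC or the Mac OS interfaces.
  Packs the first four characters of s, first character in the
  most significant byte. Returns 0 if s has fewer than four characters. }
function PackFourChars (const s: String): Cardinal;
var
  b1, b2, b3, b4: Cardinal;
begin
  if Length (s) < 4 then
    PackFourChars := 0
  else
    begin
      { Copy into unsigned variables first so the shifts are unsigned }
      b1 := Ord (s[1]);
      b2 := Ord (s[2]);
      b3 := Ord (s[3]);
      b4 := Ord (s[4]);
      PackFourChars := (b1 shl 24) or (b2 shl 16) or (b3 shl 8) or b4
    end
end;

begin
  if PackFourChars ('bool') = $626F6F6C then
    WriteLn ('shift form matches $626F6F6C')
  else
    WriteLn ('mismatch')
end.
```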
I think for Mac OS X, a more typical usage for such a macro fix would be using the macro in the declaration of the constant instead of using the macro every place the constant identifier is used in the code, i.e.:
const c = UInt32 ('abcd');
The reason being that for interface declarations Apple added for PPC-only usage, it was assumed that the FourCharCode type was the same base type as UInt32, so there are plenty of places where parameter types and record field types intended to take FourCharCode types are just declared as UInt32 types. Since there are quite a few FourCharCode constants declared, it would lessen the burden of applying the FourCharCode macro fix if the fix can be applied to the declaration of the constants instead of everywhere in the code the constants are used.
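To illustrate why integer-valued constants are convenient here, the sketch below declares two codes once as hand-converted hex values and then uses them directly as case labels. The constant names come from Peter's example; the procedure Describe and its messages are invented for the illustration.

```pascal
program FourCharCase;

const
  { Hand-converted values; each character becomes one byte }
  typeBoolean = $626F6F6C;  { 'bool' }
  typeChar    = $54455854;  { 'TEXT' }

{ Hypothetical consumer of a FourCharCode passed as an unsigned integer }
procedure Describe (code: Cardinal);
begin
  case code of
    typeBoolean: WriteLn ('boolean descriptor');
    typeChar:    WriteLn ('text descriptor');
    else         WriteLn ('unknown descriptor')
  end
end;

begin
  Describe (typeBoolean);
  Describe (typeChar);
  Describe (0)
end.
```

With a working constant-expression macro, only the two const lines would change; the case statement and the call sites would stay as they are.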
Gale Paeper gpaeper@empirenet.com
Gale Paeper wrote:
However, when I tried the basic concept in a test program, I got an "internal compiler error: Bus error" result.
The test program was:
program FourCharCodeTest (input, output);
const ch1value = ord(substr('bool',1,1));
begin writeln('The value of ''b'' is: ', ch1value:1); end.
The backtrace looks like this:
Date/Time: 2003-06-25 22:22:04 +0200 OS Version: 10.2.4 (Build 6I32) Host: G4.local.
Command: gpc1 PID: 2866
Exception: EXC_BAD_ACCESS (0x0001) Codes: KERN_PROTECTION_FAILURE (0x0002) at 0x00000004
Thread 0 Crashed:
#0 0x000b3144 in expand_expr_stmt_value (stmt.c:2155)
#1 0x0003a384 in init_any (statements.c:913)
#2 0x00008580 in make_new_variable (declarations.c:3001)
#3 0x000085ec in new_string_by_model (declarations.c:3023)
#4 0x000363d8 in build_predef_call (predef.c:2480)
#5 0x0002dc90 in main_yyparse (parse.c:2489)
#6 0x00096e88 in compile_file (toplev.c:2134)
#7 0x0009c03c in do_compile (toplev.c:5370)
#8 0x0009c108 in toplev_main (toplev.c:5400)
#9 0x000018a8 in _start (crt.c:267)
#10 0x00001728 in start
PPC Thread State: srr0: 0x000b3144 srr1: 0x0000f930 vrsave: 0x00000000 xer: 0x20000000 lr: 0x000b2f20 ctr: 0x0009c588 mq: 0x00000000 r0: 0x0003a384 r1: 0xbffff2e0 r2: 0x00000000 r3: 0x00cb4150 r4: 0xffffffff r5: 0x00000001 r6: 0x00000000 r7: 0x00000000 r8: 0x00400000 r9: 0x002db218 r10: 0x00000001 r11: 0x00000000 r12: 0x28442288 r13: 0x28442228 r14: 0x00000000 r15: 0x00000000 r16: 0x00000000 r17: 0x00cb2e28 r18: 0x00cb2df8 r19: 0x00000000 r20: 0x00000000 r21: 0x48400228 r22: 0x0029e150 r23: 0x00cb2de0 r24: 0x00cb2d38 r25: 0x48400228 r26: 0x00000000 r27: 0x00000000 r28: 0x00cb3ee0 r29: 0x00cb4150 r30: 0x00cb5000 r31: 0x000b2f20
Gale Paeper wrote:
Frank Heckenbach wrote:
I thought of a macro encapsulating the code above.
{$define UInt32(s) ((((Ord (s[1]) * 256) + Ord (s[2])) * 256 + Ord (s[3])) * 256 + Ord (s[4]))}
I'm not sure if GPC supports it but Extended Pascal provides for using the substr required function in constant-expressions as long as none of the parameters are variable-access. In theory, a macro such as:
{$define UInt32(s) ((Ord (substr((s),1,1)) shl 24) or (Ord (substr((s),2,1)) shl 16) or (Ord (substr((s),3,1)) shl 8) or Ord (substr((s),4,1)))}
should work for s as a constant-string-literal or as a constant-identifier declared as a string-literal (presuming, of course, the string has at least four characters).
Ah, good idea! This patch should fix the crash and allow `SubStr' (and `Copy') with constant arguments in constant expressions (gale1.pas). (As usual, I've made other changes meanwhile, so I hope it will fit the 20030507 sources.)
With this fix, I think this macro should work. BTW, I suggest using `Ord (SubStr ((s), 4))' without a 3rd argument. This way you'll get a compile-time error if the argument is longer than 4 characters (because then the argument to `Ord' will not be a character).
Frank
Frank Heckenbach wrote:
Gale Paeper wrote:
Frank Heckenbach wrote:
I thought of a macro encapsulating the code above.
{$define UInt32(s) ((((Ord (s[1]) * 256) + Ord (s[2])) * 256 + Ord (s[3])) * 256 + Ord (s[4]))}
I'm not sure if GPC supports it but Extended Pascal provides for using the substr required function in constant-expressions as long as none of the parameters are variable-access. In theory, a macro such as:
{$define UInt32(s) ((Ord (substr((s),1,1)) shl 24) or (Ord (substr((s),2,1)) shl 16) or (Ord (substr((s),3,1)) shl 8) or Ord (substr((s),4,1)))}
should work for s as a constant-string-literal or as a constant-identifier declared as a string-literal (presuming, of course, the string has at least four characters).
Ah, good idea! This patch should fix the crash and allow `SubStr' (and `Copy') with constant arguments in constant expressions (gale1.pas). (As usual, I've made other changes meanwhile, so I hope it will fit the 20030507 sources.)
Thanks for the speedy fix. Without trying the patch yet, I think the other changes might also be needed. When I tried changing the code to a run-time expression evaluation instead of a compile-time constant-expression evaluation to avoid the internal compiler error, I got an error message which in effect said the result of substr((s),1,1) wasn't an ordinal type and therefore the usage of substr((s),1,1) as an argument for the ord function was illegal. Changing it to:
ord(substr((s),1,1)[1])
did make the "not an ordinal type" problem go away. So, those trying to use the macro along with the patch may need to supplement the macro with string array indexing.
Gale Paeper gpaeper@empirenet.com
Gale Paeper wrote:
Thanks for the speedy fix. Without trying the patch yet, I think the other changes might also be needed. When I tried changing the code to a run-time expression evaluation instead of a compile-time constant-expression evaluation to avoid the internal compiler error, I got an error message which in effect said the result of substr((s),1,1) wasn't an ordinal type and therefore the usage of substr((s),1,1) as an argument for the ord function was illegal. Changing it to:
ord(substr((s),1,1)[1])
did make the "not an ordinal type" problem go away. So, those trying to use the macro along with the patch may need to supplement the macro with string array indexing.
This makes me wonder if this should be allowed at all (including parts of the previous change):
: substr(s, i, j)
:
: From the expression s that shall be of char-type or a string-type and from
: the expressions i and j that shall be of integer-type, this function shall
: return a result of the canonical-string-type.
: ord(x)
:
: From the expression x that shall be of an ordinal-type, [...]
And the canonical-string-type, even values of length 1, is not an ordinal-type, AFAIK. I find this:
: Each string-type value is a
: value of the canonical-string-type.
: NOTE --- 3 Char-type values possess properties that allow them to be
: used identically to string-type values of length 1. In particular,
but not vice versa.
Am I missing something?
Frank
Frank Heckenbach wrote:
Gale Paeper wrote:
Thanks for the speedy fix. Without trying the patch yet, I think the other changes might also be needed. When I tried changing the code to a run-time expression evaluation instead of a compile-time constant-expression evaluation to avoid the internal compiler error, I got an error message which in effect said the result of substr((s),1,1) wasn't an ordinal type and therefore the usage of substr((s),1,1) as an argument for the ord function was illegal. Changing it to:
ord(substr((s),1,1)[1])
did make the "not an ordinal type" problem go away. So, those trying to use the macro along with the patch may need to supplement the macro with string array indexing.
This makes me wonder if this should be allowed at all (including parts of the previous change):
To clarify, you are questioning the legality of just ord(substr((s),1,1))?
If so, after posting my original internal compiler error message, it dawned on me that perhaps I was pushing the type compatibility rules too far and that the error was due to stressing the compiler with a possibly illegal construct which it was unable to deal with. Therefore, to remove uncertainties about char-type/string-type compatibilities and the ord function, I tried the array indexed form for the constant-expression but, unfortunately, that form also resulted in an internal compiler error. This prompted the run-time expression investigation and the above reported results.
Given the mixture of error results, I was in the process of determining exactly what should be legal according to standard requirements when I saw the fix patch posting, which in essence said problem solved, so I didn't pursue the legality question any further.
: substr(s, i, j) : : From the expression s that shall be of char–type or a string–type and from : the expressions i and j that shall be of integer–type, this function shall : return a result of the canonical–string–type.
: ord(x) : : From the expression x that shall be of an ordinal–type, [...]
I think "expression" is the key for sorting out what is legal since that rules out using the compatible types and assignment compatibility rules which provide for compatibility of char-type and string-types in parameters.
And the canonical–string–type, even values of length 1, is not an ordinal–type, AFAIK. I find this:
: Each string–type value is a : value of the canonical–string–type.
: NOTE --- 3 Char–type values possess properties that allow them to be : used identically to string–type values of length 1. In particular,
but not vice versa.
Am I missing something?
I don't think you are. Since the ord argument requirement is specified with expression and not parameter terms and the expression "substr((s),1,1)" is of type canonical–string–type which isn't an ordinal-type, "substr((s),1,1)" is an illegal argument for ord since it doesn't satisfy the ordinal-type expression requirement. ("substr((s),1,1)[1]" is legal since the individual components of string-types are char-types which is an ordinal-type and thus the expression is of ordinal-type.)
You're still left with two constant-expression internal compiler errors which need to be fixed.
1. For constant-expression ord(substr((s),1,1)), detecting that it isn't an ordinal expression and issuing an appropriate error message.
2. For constant-expression ord(substr((s),1,1)[1]), evaluating the expression with a result of the ordinal-value of the first character in the constant-string "s".
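The underlying distinction can be seen with an ordinary variable: a four-character string value is not ordinal, but each of its components is a Char, so indexing yields something Ord accepts. The sketch below uses a plain packed array of Char rather than substr, so it does not exercise either of the two constant-expression cases; it only shows the type rule.

```pascal
program CharVersusString;

var
  s: packed array [1 .. 4] of Char;

begin
  s := 'bool';
  { s is a string-type value and would not be a legal argument to Ord;
    s[1] is a Char, an ordinal type, so Ord accepts it }
  WriteLn ('The value of ''b'' is: ', Ord (s[1]) : 1)
end.
```

This writes 98, the ordinal value of 'b' in ASCII.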
Gale Paeper gpaeper@empirenet.com
Gale Paeper wrote:
To clarify, you are questioning the legality of just ord(substr((s),1,1))?
Yes (my reference was unclear, indeed).
If so, after posting my original internal compiler error message, it dawned on me that perhaps I was pushing the type compatibility rules too far and that the error was due to stressing the compiler with a possibly illegal construct which it was unable to deal with. Therefore, to remove uncertainties about char-type/string-type compatibilities and the ord function, I tried the array indexed form for the constant-expression but, unfortunately, that form also resulted in an internal compiler error. This prompted the run-time expression investigation and the above reported results.
Given the mixture of error results, I was in the process of determining exactly what should be legal according to standard requirements when I saw the fix patch posting, which in essence said problem solved, so I didn't pursue the legality question any further.
At that time I thought it was ok. When dealing with string constants, it's known at compile time whether or not they have length 1 (i.e., whether or not they could, apparently, be a Char value), so all seemed well. With the clarifications below, this part actually gets more difficult, since there are string constants of length 1 that can or cannot be Char values ...
But for non-constant expressions, things get easier, since it would appear strange if it wasn't known until runtime whether they were ordinal values and would have added a lot of checks (not only in Ord, also in Succ, Pred, High, Low, FillChar, Include, Exclude, Seek..., and perhaps more).
Am I missing something?
I don't think you are. Since the ord argument requirement is specified with expression and not parameter terms and the expression "substr((s),1,1)" is of type canonical–string–type which isn't an ordinal-type, "substr((s),1,1)" is an illegal argument for ord since it doesn't satisfy the ordinal-type expression requirement. ("substr((s),1,1)[1]" is legal since the individual components of string-types are char-types which is an ordinal-type and thus the expression is of ordinal-type.)
You're still left with two constant-expression internal compiler errors which need to be fixed.
- For constant-expression ord(substr((s),1,1)), detecting that it isn't
an ordinal expression and issuing an appropriate error message.
(gale1b.pas)
- For constant-expression ord(substr((s),1,1)[1]), evaluating the
expression with a result of the ordinal-value of the first character in the constant-string "s".
(gale1a.pas)
(Also adding gale1[cd].pas for variables instead of constants.)
I'm not sending a patch now, because there were several changes, and I don't remember exactly which ones were necessary or might have broken other things I've fixed meanwhile. I hope it won't be too long till the next release.
Please check if the test programs seem alright now.
Frank
Frank Heckenbach wrote:
Gale Paeper wrote:
To clarify, you are questioning the legality of just ord(substr((s),1,1))?
Yes (my reference was unclear, indeed).
[snip]
Given the mixture of error results, I was in the process of determining exactly what should be legal according to standard requirements when I saw the fix patch posting, which in essence said problem solved, so I didn't pursue the legality question any further.
At that time I thought it was ok. When dealing with string constants, it's known at compile time whether or not they have length 1 (i.e., whether or not they could, apparently, be a Char value), so all seemed well. With the clarifications below, this part actually gets more difficult, since there are string constants of length 1 that can or cannot be Char values ...
After examining some of the implementation mechanics in this area, I think there is a fundamental disconnect in the type system created when literal string constants of the one string-element form (e.g., 'c', '''', etc.) are classified as LEX_STRCONST in the lexer. (This probably applies to the Borland #20 and ^I character constant extensions also.)
The language rule distinguishing between char type and string type for literals is based solely upon lexical context. In discarding the distinction at the lexer level, the information necessary to distinguish between char-type and string-type classification for one char element entities is lost and there is no language rule available outside the lexical context which can reliably be used to reconstruct the lost information. In ISO 7185 with the limited constant declaration and string-type capabilities, one probably could deduce (based on the inferences in the constrained constant construction possibilities) the type based on string length outside the lexical context; however, this isn't possible in ISO 10206 since there are a multitude of ways to construct constant one char element entities of string-type which are not also char-types.
I think if one investigates all the Pascal language constructs where constants of ordinal-type are required (and constant string-type usage is illegal), there will be quite a few "difficult parts" in the compiler as long as constants of char-type are categorized as constants of string-type at the lex level. A simple search on "string_may_be_char" turns up quite a few instances which look like they may have problems in the constant char-type versus constant string-type area - case constants, subrange types, set type constructors to name a few. (With the present GPC limitations in supporting some ISO 10206 constructs for constant declarations, I don't think I can construct working test cases to demonstrate (or check for) problems in this area so this is an observation based upon my inspection of the compiler code. I don't profess to have an expert understanding of the compiler internals so I could be mistaken on the Pascal code effects.)
But for non-constant expressions, things get easier, since it would appear strange if it wasn't known until runtime whether they were ordinal values and would have added a lot of checks (not only in Ord, also in Succ, Pred, High, Low, FillChar, Include, Exclude, Seek..., and perhaps more).
I wouldn't be too hasty in making generalizations of ord's argument type requirement being specified in expression terms. Although most of the ISO required functions' argument type requirements are specified in expression terms, there are some which are specified using terminology which opens up a route to applying type compatibility rules with char-type/string-type compatibility implications. With the ISO required procedures, it is even more of a mixed bag of requirements. (For the non-ISO required procedure/function extensions, that is a whole different bucket of worms with imprecise terminology/non-authoritative documentation and mismatches between implementation and documentation thrown in to make the bucket more "interesting".)
[snip]
I'm not sending a patch now, because there were several changes, and I don't remember exactly which ones were necessary or might have broken other things I've fixed meanwhile. I hope it won't be too long till the next release.
Not a problem. In general, my preference in the fix department is to take the time necessary to ensure the fix is the correct means to rectify a problem. In the long term, a hasty fix may end up taking more time to get a correctly working solution than the time it would take if one spent the necessary time upfront to get it correct on the first try.
Please check if the test programs seem alright now.
The test programs, gale1[abcd].pas, look alright.
Gale Paeper gpaeper@empirenet.com
Gale Paeper wrote:
After examining some of the implementation mechanics in this area, I think there is a fundamental disconnect in the type system created when literal string constants of the one string-element form (e.g., 'c', '''', etc.) are classified as LEX_STRCONST in the lexer. (This probably applies to the Borland #20 and ^I character constant extensions also.)
The language rule distinguishing between char type and string type for literals is based solely upon lexical context.
For literals? String literals of length 1 are just the same as char literals, aren't they? (BP char literals are not governed by the standards, but it seems reasonable to treat them the same as those, and that seems to be what BP does.)
In discarding the distinction at the lexer level, the information necessary to distinguish between char-type and string-type classification for one char element entities is lost and there is no language rule available outside the lexical context which can reliably be used to reconstruct the lost information.
For literals, AFAICS, the condition length = 1 (or length <= 1 in EP) is all that's required.
In ISO 7185 with the limited constant declaration and string-type capabilities, one probably could deduce (based on the inferences in the constrained constant construction possibilities) the type based on string length outside the lexical context; however, this isn't possible in ISO 10206 since there are a multitude of ways to construct constant one char element entities of string-type which are not also char-types.
So the important difference is between literals and other string constants. In GPC, this is expressed by the flag `PASCAL_TREE_FRESH_CST' (maybe it would be reasonable to rename it to something with `LITERAL'), and that's in fact what I used in the latter fix. So a constant result of `SubStr' etc. now does not have this flag set, and `string_may_be_char' checks this flag.
(With the present GPC limitations in supporting some ISO 10206 constructs for constant declarations, I don't think I can construct working test cases to demonstrate (or check for) problems in this area, so this is an observation based upon my inspection of the compiler code. I don't profess to have an expert understanding of the compiler internals, so I could be mistaken on the Pascal code effects.)
I suppose you mean things like this (which doesn't work in the next GPC anymore):
program Foo;
begin
  case 'x' of
    SubStr ('abc', 1, 1) .. 'z':
  end
end.
But for non-constant expressions, things get easier, since it would appear strange if it wasn't known until runtime whether they were ordinal values and would have added a lot of checks (not only in Ord, also in Succ, Pred, High, Low, FillChar, Include, Exclude, Seek..., and perhaps more).
I wouldn't be too hasty in making generalizations of ord's argument type requirement being specified in expression terms. Although most of the ISO required functions' argument type requirements are specified in expression terms, there are some which are specified using terminology which opens up a route to applying type compatibility rules with char-type/string-type compatibility implications. With the ISO required procedures, it is even more of a mixed bag of requirements.
I did some checks recently. I found that `Pack' and `Unpack' allow for assignment-compatibility, and that's covered by the 2nd parameter of `string_may_be_char'.
[snip]
I'm not sending a patch now, because there were several changes, and I don't remember exactly which ones were necessary or might have broken other things I've fixed meanwhile. I hope it won't be too long till the next release.
Not a problem. In general, my preference in the fix department is to take the time necessary to ensure the fix is the correct means to rectify a problem. In the long term, a hasty fix may end up taking more time to get a correctly working solution than the time it would take if one spent the necessary time upfront to get it correct on the first try.
Agreed completely. Sometimes I send quick patches when the chances are good that they have no regressions (but sometimes that turns out wrong, as in the first gale1 patch).
It should be understood by anyone who tries them that such patches are to be used with caution and tested well. (A complete test run of the test suite and my own code base on various platforms takes 2-3 days, so I can only do it before releases).
Frank
Frank Heckenbach wrote:
Gale Paeper wrote:
After examining some of the implementation mechanics in this area, I think there is a fundamental disconnect in the type system created when literal string constants of the one string-element form (e.g., 'c', '''', etc.) are classified as LEX_STRCONST in the lexer. (This probably applies to the Borland #20 and ^I character constant extensions also.)
The language rule distinguishing between char type and string type for literals is based solely upon lexical context.
For literals? String literals of length 1 are just the same as char literals, aren't they?
With a quibble or two, yes. (The quibbling would be over using a more precise terminology to avoid misunderstandings with the apostrophe-image case.)
(BP char literals are not governed by the standards, but it seems reasonable to treat them the same as those, and that seems to be what BP does.)
Seems like a reasonable treatment to me also. But as you say they aren't governed by the standards, so there isn't anything authoritative to fall back on when there are problems (vague documentation, buggy implementation(s), etc.) in determining the correct behavior.
In discarding the distinction at the lexer level, the information necessary to distinguish between char-type and string-type classification for one char element entities is lost and there is no language rule available outside the lexical context which can reliably be used to reconstruct the lost information.
For literals, AFAICS, the condition length = 1 (or length <= 1 in EP) is all that's required.
Actually, for EP, it is still length = 1. In EP, the length = 0 case is defined to be canonical-string-type. (You get from a null, length = 0, string to a blank char through the blank padding rule in the assignment compatibility rules. I do note that `string_may_be_char' does correctly handle the null string assignment compatibility case.)
In ISO 7185 with the limited constant declaration and string-type capabilities, one probably could deduce (based on the inferences in the constrained constant construction possibilities) the type based on string length outside the lexical context; however, this isn't possible in ISO 10206 since there are a multitude of ways to construct constant one char element entities of string-type which are not also char-types.
So the important difference is between literals and other string constants. In GPC, this is expressed by the flag `PASCAL_TREE_FRESH_CST' (maybe it would be reasonable to rename it to something with `LITERAL'), and that's in fact what I used in the latter fix. So a constant result of `SubStr' etc. now does not have this flag set, and `string_may_be_char' checks this flag.
Thanks for pointing out the `PASCAL_TREE_FRESH_CST' flag. I hadn't picked up on that while wading through the code. If I'm not misunderstanding the meaning of it (i.e., the constant string was defined by a source code literal), then the lexical context is sufficiently preserved and therefore enables one to determine whether the (internally represented) string constant is of type char or is of type canonical-string. (Given the flag's usage, my concerns regarding discarding essential lexical information no longer apply.)
Assuming you've gotten all the flag bookkeeping details working correctly, I think the addition of the flag check to `string_may_be_char' yields a good fix for the original Substr problem as well as all the other ordinal type char type versus string-type problems I was seeing.
(With the present GPC limitations in supporting some ISO 10206 constructs for constant declarations, I don't think I can construct working test cases to demonstrate (or check for) problems in this area, so this is an observation based upon my inspection of the compiler code. I don't profess to have an expert understanding of the compiler internals, so I could be mistaken on the Pascal code effects.)
I suppose you mean things like this (which doesn't work in the next GPC anymore):
program Foo;
begin
  case 'x' of
    SubStr ('abc', 1, 1) .. 'z':
  end
end.
Something like that only a little more devious in where the defining location is more separated from the using location.
I did come up with a test case which I was able to get through the latest, unpatched compiler release without encountering an internal compiler error. Although it contains two errors, the program compiled, ran, and produced an output of 'Fail'.
program CharAndStrTypeTest(input, output);

const
  kCharA = 'A';            {type char}
  kAlsoCharA = kCharA;     {type char}
  kStrB = 'B' + '';        {type canonical string}

begin
  case kStrB of            {WRONG}
    kAlsoCharA: {OK};
    kStrB: writeln('Fail'); {WRONG}
  end;
end.
For a more complete coverage of EP's constant string of length one possibilities, the packed array[1 .. 1] of char case needs to be checked; however, I can't think of a way to get a true Pascal constant defined with that type. For that, you need EP's structured-value-constructors, which haven't been implemented yet in GPC.
Gale Paeper gpaeper@empirenet.com
Gale Paeper wrote:
For literals? String literals of length 1 are just the same as char literals, aren't they?
With a quibble or two, yes. (The quibbling would be over using a more precise terminology to avoid misunderstandings with the apostrophe-image case.)
You mean that '''' is considered of length 1? Sure.
For literals, AFAICS, the condition length = 1 (or length <= 1 in EP) is all that's required.
Actually, for EP, it is still length = 1. In EP, the length = 0 case is defined to be canonical-string-type. (You get from a null, length = 0, string to a blank char through the blank padding rule in the assignment compatibility rules. I do note that `string_may_be_char' does correctly handle the null string assignment compatibility case.)
Indeed, my description was wrong, but the implementation is right (better than vice versa :-).
I did come up with a test case which I was able to get through the latest, unpatched compiler release without encountering an internal compiler error. Although it contains two errors, the program compiled, ran, and produced an output of 'Fail'.
program CharAndStrTypeTest(input, output);

const
  kCharA = 'A';            {type char}
  kAlsoCharA = kCharA;     {type char}
  kStrB = 'B' + '';        {type canonical string}

begin
  case kStrB of            {WRONG}
    kAlsoCharA: {OK};
    kStrB: writeln('Fail'); {WRONG}
  end;
end.
My current GPC finds both problems now. (gale2[a-c].pas, just to be sure.)
For a more complete coverage of EP's constant string of length one possibilities, the packed array[1 .. 1] of char case needs to be checked; however, I can't think of a way to get a true Pascal constant defined with that type. For that, you need EP's structured-value-constructors, which haven't been implemented yet in GPC.
If you like, you can write such a test program which we can put in todo/, so it will be tested when those constructors are implemented.
Frank
Hi,
I checked the docs, but could not find any reference to who actually calls the initialization routines like _init_Myunit?
It seems if I "link" with gpcgcc, they get called, but I'm not sure. It also seems that any unit that uses another unit calls the init routines for the used unit in its initialization, is that correct? So they are idempotent presumably?
Thanks, Peter.
Peter N Lewis wrote:
I checked the docs, but could not find any reference to who actually calls the initialization routines like _init_Myunit?
The initializers of importing units (as you noted) and of the main program.
It seems if I "link" with gpcgcc, they get called, but I'm not sure.
It doesn't have anything to do with how it's linked (as long as all parts are linked, but otherwise you get linking errors).
It also seems that any unit that uses another unit calls the init routines for the used unit in its initialization, is that correct? So they are idempotent presumably?
Yes, they use a flag so they return immediately when called again.
Frank
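The flag-guarded behaviour described in the reply above can be pictured roughly as follows. This is only an illustrative sketch in plain Pascal: the names `InitMyUnit', `Initialized' and `InitCount' are made up for the example, and the code GPC actually generates for `_init_Myunit' may look quite different.

```pascal
program InitGuardDemo;

var
  Initialized: Boolean;  { hypothetical "already done" flag }
  InitCount: Integer;    { counts how often the real work ran }

{ Hypothetical stand-in for a unit initializer: safe to call repeatedly. }
procedure InitMyUnit;
begin
  if not Initialized then
    begin
      Initialized := True;
      InitCount := InitCount + 1
    end
end;

begin
  Initialized := False;
  InitCount := 0;
  InitMyUnit;  { e.g. called from the main program }
  InitMyUnit;  { e.g. called again from another importing unit }
  WriteLn ('initializer body ran ', InitCount : 1, ' time(s)')
end.
```

With such a guard, it does not matter how many importing units call the initializer; the body runs only on the first call.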
At 3:47 AM +0200 10/7/03, Frank Heckenbach wrote:
Peter N Lewis wrote:
I checked the docs, but could not find any reference to who actually calls the initialization routines like _init_Myunit?
The initializers of importing units (as you noted) and of the main program.
It seems if I "link" with gpcgcc, they get called, but I'm not sure.
It doesn't have anything to do with how it's linked (as long as all parts are linked, but otherwise you get linking errors).
Ok, but in this case I am using gpcgcc to link together all the object files into a bundle, which does not have any concept of a main program. So I'm still not sure the initialization routines are being called - I suspect all the non-"root" units (any that use any others) are being "called" from the initialization routines in the "root" units, but I suspect the init routines in those "root" units are not getting called and so none of them are being called.
Is there an easy way to test that they are being called? ie, what are they actually used for? I tried some structured "constant" variables, but they all seem to be initialized, but maybe that happens with other linker magic.
Thanks, Peter.
Peter N Lewis wrote:
At 3:47 AM +0200 10/7/03, Frank Heckenbach wrote:
Peter N Lewis wrote:
I checked the docs, but could not find any reference to who actually calls the initialization routines like _init_Myunit?
The initializers of importing units (as you noted) and of the main program.
It seems if I "link" with gpcgcc, they get called, but I'm not sure.
It doesn't have anything to do with how it's linked (as long as all parts are linked, but otherwise you get linking errors).
Ok, but in this case I am using gpcgcc
What's gpcgcc actually?
to link together all the object files into a bundle, which does not have any concept of a main program.
OK, so you have the main program written in C, I suppose. In this case, please look at demos/gpc_c_*
Is there an easy way to test that they are being called?
Put a `WriteLn' or something in an (explicit) unit initializer (which is merged with the automatic initializations).
ie, what are they actually used for? I tried some structured "constant" variables, but they all seem to be initialized, but maybe that happens with other linker magic.
Yes, they're initialized automatically (placed in the initialized data segment). You could declare a string (without explicit initializer) and see if its capacity gets set, but a `WriteLn' (see above) is probably easier.
Frank
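A minimal sketch of the `WriteLn' probe suggested above might look like this. The unit name `InitProbe' and the procedure `Touch' are hypothetical; the sketch assumes the Borland-style `begin ... end.' initializer part, which GPC accepts in units.

```pascal
unit InitProbe;

interface

{ Hypothetical do-nothing routine, so importers have something to call. }
procedure Touch;

implementation

procedure Touch;
begin
end;

{ Explicit unit initializer: if this line shows up in the output,
  the initializer of this unit was called. }
begin
  WriteLn ('InitProbe initializer called')
end.
```

Importing the unit from whichever unit is under suspicion (`uses InitProbe;') and checking for the message at startup should show whether the initializer chain is actually being run in the bundle setup.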
Frank Heckenbach wrote:
Peter N Lewis wrote:
Ok, but in this case I am using gpcgcc
What's gpcgcc actually?
On Mac OS X distributions, gpcgcc is a symbolic link in /usr/bin to the FSF gcc compiler that gets compiled/installed with gpc. It is called gpcgcc to distinguish it from the Apple gcc compiler in /usr/bin.
Regards,
Adriaan van Os
Peter N Lewis wrote:
In the Mac OS, we have a concept of a FourCharCode, which is essentially a UInt32 except that it often has values that are made up of four characters. In Mac OS compilers then, the UNSIGNEDLONG (Cardinal attribute( size = 32 )) is compatible as a special case with constant strings of four characters.
To be more precise, type FourCharCode is treated in a special way, but there are subtle differences even between CodeWarrior Pascal, Think Pascal and MPW Pascal (not to mention the CodeWarrior 80x86 cross compiler).
In GPC, we've implemented FourCharCode = packed array[0..3] of char which works for most things, but it is not possible to do a case statement of this type, nor is it possible to cast between UInt32 and FourCharCode (which would provide a workaround, as the case statement could cast the value and constants to UInt32 (except GPC does not allow casting in the case constants part, but even that can be worked around)).
In fact, casting does work in GPC in nearly all cases. Consider the following test program:
program cast;

type
  word32 = word attribute ( size = 32);
  word16 = word attribute ( size = 16);
  rec32 = record r1: word16; r2: word16 end;
  arr32 = array[ 1..2] of word16;
  fourcharcode = array[ 1..4] of char;

var
  i: word32;
  r: rec32;
  a: arr32;
  f: fourcharcode;

const
  k1: fourcharcode = 'wxyz';
  k2: fourcharcode = 'pqrs';
  k3 = 'klmn';

begin
  i:= 2 + 1 * $10000;
  writeln( 'i = ', i);
  r:= rec32( i);
  writeln( 'r.r1 = ', r.r1, ' r.r2 = ', r.r2);
  a:= arr32( i);
  writeln( 'a[ 1 ] = ', a[ 1], ' a[ 2] = ', a[ 2]);
  writeln;

  i:= 4 + 3 * $10000;
  writeln( 'i = ', i);
  word32( r):= i;
  writeln( 'r.r1 = ', r.r1, ' r.r2 = ', r.r2);
  word32( a):= i;
  writeln( 'a[ 1 ] = ', a[ 1], ' a[ 2] = ', a[ 2]);
  writeln;

  i:= ORD( 'a') * $1000000 + ORD( 'b') * $10000 + ORD( 'c') * $100 + ORD( 'd') * $1;
  writeln( 'i = ', i);
  f:= fourcharcode( i);
  writeln( 'f = ', f);
  writeln;

  i:= ORD( 'e') * $1000000 + ORD( 'f') * $10000 + ORD( 'g') * $100 + ORD( 'h') * $1;
  writeln( 'i = ', i);
  word32( f):= i;
  writeln( 'f = ', f);
  writeln;

  f:= 'abcd';
  writeln( 'f = ', f);
  i:= word32( f);
  writeln( 'i = ', i);
  writeln;

  f:= 'efgh';
  writeln( 'f = ', f);
  fourcharcode( i):= f;
  writeln( 'i = ', i);
  writeln;

  // i:= word32( 'abcd'); not allowed by GPC
  fourcharcode( i):= 'abcd';
  writeln( 'i = ', i, ' (''abcd'')');

  i:= word32( k1);
  writeln( 'i = ', i, ' (''', k1, ''')');
  fourcharcode( i):= k2;
  writeln( 'i = ', i, ' (''', k2, ''')');

  // i:= word32( k3); not allowed by GPC
  fourcharcode( i):= k3;
  writeln( 'i = ', i, ' (''', k3, ''')');
end.
So, all of these type casts work, except the value type cast i:=word32( 'abcd'). This can be worked around by:
- using a variable type cast instead of a value type cast, e.g. fourcharcode( i ):= 'abcd'
- declaring a typed constant for the fourcharcode (which is not a bad idea anyway), e.g. i:=word32( k1)
It would be nice if GPC were to allow i:=word32( 'abcd') also (e.g. in --macpascal mode), but this is not a necessity.
So basically, I'm looking for recommendations as to the best way to implement compatibility with this inside GPC, and/or the best extensions to GPC to allow compatibility with this sort of code.
A typical example might be:
const
  typeBoolean = 'bool';
  typeChar = 'TEXT';
  typeSInt16 = 'shor';
  typeSInt32 = 'long';
  typeUInt32 = 'magn';
  etc
So it ends up being used as an extensible enumerated type if you like. Anyway, this concept is used all over the Mac OS API, so it is important to have reasonable compatibility for writing Mac OS applications.
Ok, so can anyone think of the best way to get this functionality?
I would recommend:
const
  typeBoolean : FourCharCode = 'bool';
  typeChar : FourCharCode = 'TEXT';
  typeSInt16 : FourCharCode = 'shor';
  typeSInt32 : FourCharCode = 'long';
  typeUInt32 : FourCharCode = 'magn';
  etc
Changing FourCharCode = UInt32 is probably better since the concept behind it is as a fixed size low cost use, and it is essentially never used as a string as such.
I am not convinced that changing FourCharCode to UInt32 is a good idea.
Regards,
Adriaan van Os
Adriaan van Os wrote:
So, all of these type casts work, except the value type cast i:=word32( 'abcd'). This can be worked around by:
- using a variable type cast instead of a value type cast, e.g.
fourcharcode( i ):= 'abcd'
- declaring a typed constant for the fourcharcode (which is not a bad
idea anyway), e.g. i:=word32( k1)
Unfortunately, both aren't suitable for case constants, as in Peter's example. (Though I don't know if they're frequently used that way.)
It would be nice if GPC were to allow i:=word32( 'abcd') also (e.g. in --macpascal mode), but this is not a necessity.
This would have to make an implicit decision about endianness, and it might be problematic if in the future GPC will work with Unicode (or any other >8 bit charset), since 4 characters won't fit in a 32 bit integer.
Changing FourCharCode = UInt32 is probably better since the concept behind it is as a fixed size low cost use, and it is essentially never used as a string as such.
I am not convinced that changing FourCharCode to UInt32 is a good idea.
Ordinal constants are the only allowed case constants. So if that's a serious concern, I think there's generally no other way.
Frank
Frank Heckenbach wrote:
Adriaan van Os wrote:
So, all of these type casts work, except the value type cast i:=word32( 'abcd'). This can be worked around by:
- using a variable type cast instead of a value type cast, e.g.
fourcharcode( i ):= 'abcd'
- declaring a typed constant for the fourcharcode (which is not a bad
idea anyway), e.g. i:=word32( k1)
Unfortunately, both aren't suitable for case constants, as in Peter's example. (Though I don't know if they're frequently used that way.)
To speak for myself, no. I looked into a typical Mac OS program with 300.000 lines of code. It had 443 case statements, but none of them used FourCharCodes as case constants.
It would be nice if GPC were to allow i:=word32( 'abcd') also (e.g. in --macpascal mode), but this is not a necessity.
This would have to make an implicit decision about endianness,
Can't the compiler just look at __BYTES_BIG_ENDIAN__ and __BYTES_LITTLE_ENDIAN__ ?
and it might be problematic if in the future GPC will work with Unicode (or any other >8 bit charset), since 4 characters won't fit in a 32 bit integer.
That would give rise to some very interesting problems anyway. For binary compatibility with existing API's (on many platforms), we will always need some 8-bit char type. Then, how do you indicate that a string constant is composed of either 8-bit or (e.g.) 16-bit characters ?
Changing FourCharCode = UInt32 is probably better since the concept behind it is as a fixed size low cost use, and it is essentially never used as a string as such.
I am not convinced that changing FourCharCode to UInt32 is a good idea.
Ordinal constants are the only allowed case constants. So if that's a serious concern, I think there's generally no other way.
I believe the limitation is acceptable, at least not serious enough to come up with tricks (that make source code more obscure). The case statement can be rewritten to an if statement.
Regards,
Adriaan van Os
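To illustrate the rewrite mentioned above, here is a minimal sketch of an if-chain over four-character codes kept as a character array. The procedure name `Describe' is hypothetical, and the array is indexed 1 .. 4 (rather than 0 .. 3 as in the GPC interfaces) so that it is a string type in the standard Pascal sense and can be compared directly with string constants of length four.

```pascal
program FourCharIf;

type
  FourCharCode = packed array [1 .. 4] of Char;

const
  typeBoolean = 'bool';
  typeChar = 'TEXT';

{ Hypothetical dispatcher: an if-chain standing in for the case statement
  that is not possible on a non-ordinal selector. }
procedure Describe (Code: FourCharCode);
begin
  if Code = typeBoolean then
    WriteLn ('boolean')
  else if Code = typeChar then
    WriteLn ('text')
  else
    WriteLn ('unknown')
end;

begin
  Describe ('bool');
  Describe ('TEXT');
  Describe ('long')
end.
```

Unlike a case statement, the chain is evaluated comparison by comparison and the compiler will not flag a duplicated code.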
Adriaan van Os wrote:
Frank Heckenbach wrote:
It would be nice if GPC were to allow i:=word32( 'abcd') also (e.g. in --macpascal mode), but this is not a necessity.
This would have to make an implicit decision about endianness,
Can't the compiler just look at __BYTES_BIG_ENDIAN__ and __BYTES_LITTLE_ENDIAN__ ?
Sure. But I mean we'd have to make a decision first how to handle it. Should it be always big-endian (like Peter's macro), or use the actual target's endianness (so it gets the same memory layout as the character array)? With the macro, this decision is left to the programmer. (Though it might not actually matter if the same conversion is used for all values.)
and it might be problematic if in the future GPC will work with Unicode (or any other >8 bit charset), since 4 characters won't fit in a 32 bit integer.
That would give rise to some very interesting problems anyway. For binary compatibility with existing API's (on many platforms), we will always need some 8-bit char type. Then, how do you indicate that a string constant is composed of either 8-bit or (e.g.) 16-bit characters ?
I haven't spent many thoughts about it (and don't plan to implement it soon). Perhaps just another option (but I'm not sure yet of the implications).
Frank
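The two conventions under discussion can be compared with a small sketch. The function names are hypothetical, and the arithmetic assumes 7-bit ASCII characters and a `Cardinal' of at least 32 bits. Both functions compute their value arithmetically, so the results do not depend on the machine the program runs on; a memory-layout type cast, by contrast, would yield the first value on a big-endian target and the second on a little-endian one.

```pascal
program FourCharEndian;

type
  FourChars = packed array [1 .. 4] of Char;

{ First character goes into the most significant byte. }
function PackBigEndian (s: FourChars): Cardinal;
begin
  PackBigEndian :=
    ((Ord (s[1]) * 256 + Ord (s[2])) * 256 + Ord (s[3])) * 256 + Ord (s[4])
end;

{ First character goes into the least significant byte. }
function PackLittleEndian (s: FourChars): Cardinal;
begin
  PackLittleEndian :=
    ((Ord (s[4]) * 256 + Ord (s[3])) * 256 + Ord (s[2])) * 256 + Ord (s[1])
end;

begin
  { 'abcd' is $61 $62 $63 $64 in ASCII }
  WriteLn ('big-endian packing:    ', PackBigEndian ('abcd'));    { $61626364 }
  WriteLn ('little-endian packing: ', PackLittleEndian ('abcd'))  { $64636261 }
end.
```

As noted above, the choice may not matter as long as the same conversion is applied to every value that is compared.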
Unfortunately, both aren't suitable for case constants, as in Peter's example. (Though I don't know if they're frequently used that way.)
To speak for myself, no. I looked into a typical Mac OS program with 300.000 lines of code. It had 443 case statements, but none of them used FourCharCodes as case constants.
They become a lot more prevalent in Mac OS X carbon events code and NIB, as you use them for all the commands and status displays and controls and such.
I believe the limitation is acceptable, at least not serious enough to come up with tricks (that make source code more obscure). The case statement can be rewritten to an if statement.
It can, but a case statement is a lot clearer, and also has the added benefit of efficiency and checking for duplicated case elements.
{$define UInt32(s) ((Ord (substr((s),1,1)) shl 24) or (Ord (substr((s),2,1)) shl 16) or (Ord (substr((s),3,1)) shl 8) or Ord (substr((s),4,1)))}
should work for s as a constant-string-literal or as a constant-identifier declared as a string-literal (presuming, of course, the string has at least four characters).
Ah, good idea! This patch should fix the crash and allow `SubStr' (and `Copy') with constant arguments in constant expressions (gale1.pas). (As usual, I've made other changes meanwhile, so I hope it will fit the 20030507 sources.)
This is good, but it does have the disadvantage/risk that it uses the input parameter four times and thus is at risk if the input parameter is a function call (which at best will be called four times and at worst will cause side effects which could introduce hideous bugs).
I don't suppose there is any way of making that macro work but only access the input parameter once? Peter.
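For non-constant arguments, one way to get single evaluation is an ordinary function instead of the macro. The sketch below is only an illustration: `ToUInt32', `NextCode' and `Calls' are made-up names, 7-bit ASCII characters are assumed, and, being a function, it cannot be used where a constant is required (for example in case constants).

```pascal
program FourCharOnce;

type
  FourChars = packed array [1 .. 4] of Char;

var
  Calls: Integer;

{ Hypothetical function with a side effect: it counts its own invocations. }
function NextCode: FourChars;
begin
  Calls := Calls + 1;
  NextCode := 'abcd'
end;

{ The argument expression is evaluated once, when the parameter is passed. }
function ToUInt32 (s: FourChars): Cardinal;
begin
  ToUInt32 := (Ord (s[1]) shl 24) or (Ord (s[2]) shl 16)
              or (Ord (s[3]) shl 8) or Ord (s[4])
end;

begin
  Calls := 0;
  WriteLn (ToUInt32 (NextCode));
  WriteLn ('NextCode was called ', Calls : 1, ' time(s)')
end.
```

Here the side effect happens exactly once, whereas a textual macro expansion would repeat the call for each of the four `substr' occurrences.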
Peter N Lewis wrote:
[snip]
{$define UInt32(s) ((Ord (substr((s),1,1)) shl 24) or (Ord (substr((s),2,1)) shl 16) or (Ord (substr((s),3,1)) shl 8) or Ord (substr((s),4,1)))}
should work for s as a constant-string-literal or as a constant-identifier declared as a string-literal (presuming, of course, the string has at least four characters).
Ah, good idea! This patch should fix the crash and allow `SubStr' (and `Copy') with constant arguments in constant expressions (gale1.pas). (As usual, I've made other changes meanwhile, so I hope it will fit the 20030507 sources.)
This is good, but it does have the disadvantage/risk that it uses the input parameter four times and thus is at risk if the input parameter is a function call (which at best will be called four times and at worst will cause side effects which could introduce hideous bugs).
Provided the FourCharCode type in MacTypes.pas is changed to be a UInt32 type, I don't think there will be much, if any, need to use the macro on function call results. A reasonably sane function declaration which returns a FourCharCode result would use either MacTypes's FourCharCode type or one derived from it in declaring the function result type; therefore, the results would already be in UInt32 compatible type form and the macro wouldn't need to be used (and it wouldn't work even if one tried to use it).
When one factors in Universal Interfaces into the usage scenario, I think the main, and perhaps only, usage of the macro will be pretty similar to Apple's usage of the FOUR_CHAR_CODE macro in the Universal Interfaces C headers. The macro is used just with constant string literals (to fix endian issues on the C side) and once the string literals have been macro fixed there is no need to use the macro for anything else. Given the FourCharCode type is made UInt32, the intertwined type declarations and usages in the Universal Interfaces, and compiler enforcement of type compatibility, I think there will be a very strong inducement to apply the macro to all FourCharCode constant string literals at the point where they first appear to get them converted to the UInt32 type as soon as possible and once converted kept in the UInt32 type domain. If one doesn't adopt that practice, you will be constantly beating your head against the "incompatible types error" wall. (There probably will be a need to do FourCharCode/UInt32 conversions in some instances at user I/O boundaries, but for most usages in the execution environment it would be best to get the four char strings converted to the UInt32 type domain as soon as possible and then leave them in that type domain. Therefore, I think it would be a good idea to apply the macro fix to all the FourCharCode constants declared in the GPCInterfaces version of the Universal Interfaces.)
I don't suppose there is any way of making that macro work but only access the input parameter once?
The only thing I can think of is using the "declaring statement" GPC extension to create a temporary variable and then use the temporary for repeated accesses. Unfortunately, that would preclude using the macro in expressions, case constants, and parameters. In other words, in order to avoid a rarely occurring (or so I believe) worst case scenario of a function call macro argument, you won't be able to use the macro in all the places where you really have a pressing, practical need to use it.
Gale Paeper gpaeper@empirenet.com
Gale Paeper wrote:
Peter N Lewis wrote:
This is good, but it does have the disadvantage/risk that it uses the input parameter four times and thus is at risk if the input parameter is a function call (which at best will be called four times and at worst will cause side effects which could introduce hideous bugs).
Provided the FourCharCode type in MacTypes.pas is changed to be a UInt32 type, I don't think there will be much, if any, need to use the macro on function call results.
I hope so.
If there in fact is (should be) no need to, then it may be useful to protect the macro against being used with a non-constant argument.
Peter and I have discussed by private mail that this may already suffice. (Of course, if there's a rare case where it's needed for a non-constant value, one could write a function for it. It would have to have a different name, though. But if the compiler complained if the macro is used with a non-constant argument or the function used in a constant context, this should be no big deal.)
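For the rare non-constant case, a function along the following lines might do. This is only a sketch: FourCharToWord, Str4, NextCode and CallCount are hypothetical names that do not come from GPC or the Universal Interfaces, and the code assumes a Cardinal of at least 32 bits and an argument of at least four characters. It packs the first character into the most significant byte and evaluates its argument once, however costly that argument is.

```pascal
program FourCharOnce;

{ Sketch only. FourCharToWord, Str4, NextCode and CallCount are
  hypothetical names, not part of GPC or the Universal Interfaces. }

type
  Str4 = String (4);

var
  CallCount: Integer;

{ Packs the first four characters of s with the first character in the
  most significant byte. Assumes Length (s) >= 4 and a Cardinal of at
  least 32 bits. }
function FourCharToWord (const s: String): Cardinal;
var
  i: Integer;
  w: Cardinal;
begin
  w := 0;
  for i := 1 to 4 do
    w := w * 256 + Ord (s[i]);
  FourCharToWord := w
end;

{ An argument with a side effect: counts how often it is evaluated. }
function NextCode: Str4;
begin
  CallCount := CallCount + 1;
  NextCode := 'menu'
end;

begin
  CallCount := 0;
  WriteLn (FourCharToWord (NextCode));
  { The argument expression was evaluated once, so CallCount is 1. }
  WriteLn ('NextCode calls: ', CallCount)
end.
```

Unlike the macro, such a function cannot appear in a constant context (case constants, const declarations), which is why the macro is still wanted and why rejecting non-constant macro arguments remains the open question.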
So, how to ensure that? One could include into the macro a builtin function call which requires a constant argument. The only ones that do (AFAIK) are ReturnAddress/FrameAddress. The following actually seems to work:
... + 0 * PtrInt (ReturnAddress (Ord (Length (s) = MaxInt)))
(Note: The condition in `Ord' must "almost" always be false, without the compiler being able to deduce it for a non-constant, because some platforms don't accept a nonzero argument to ReturnAddress.)
Of course, that's very kludgy. So we might want to add a new builtin function for that.
One candidate would be a compile-time assert function. This would also cover another open issue (not completely open, since compile-time assertions can be achieved by things like `const Assert1 = 1 / Ord (Condition);' which will produce a compile-time error if Condition is false; but that works only if a constant declaration is possible, i.e. not with expressions).
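As an illustration of the constant-declaration trick just mentioned, here is a small sketch. The names m, n and CheckMNotLessThanN are made up for the example, and the claim that a false condition is rejected at compile time is taken from the message above, not verified independently.

```pascal
program StaticCheck;

const
  { User can change these. m must be >= n. }
  m = 100;
  n = 10;
  { Hypothetical name. Ord (True) = 1 and Ord (False) = 0, so a false
    condition makes this a constant division by zero, which according
    to the message above produces a compile-time error. }
  CheckMNotLessThanN = 1 / Ord (m >= n);

var
  a: array [1 .. m] of Integer;

begin
  a[n] := m;
  WriteLn (a[n])
end.
```

The check needs its own const declaration, so it cannot be placed inside an expression such as the array bound itself.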
A compile-time assert function would make this more elegant and work for the above case as well. It could work as follows:
- If the (first) argument is not a compile-time constant or does not have the value true, it gives a compile-time error. (So one can check another condition here; in the macro, the obvious thing to check would be Length (s) = 4.)
Otherwise, it yields true.
- Perhaps (optionally?) a second argument which is returned rather than true, so one could do:
{ User can change these. m must be >= n. } const m = 100; n = 10;
var a: array [1 .. Assertion (m >= n, m)] of Integer;
- Name of the function? `Assert' is already taken for runtime assertions. I've used `Assertion' above, but it doesn't really make the difference clear.
Adriaan van Os wrote:
Peter N Lewis wrote:
Unfortunately, both aren't suitable for case constants, as in Peter's example. (Though I don't know if they're frequently used that way.)
To speak for myself, no. I looked into a typical Mac OS program with 300.000 lines of code. It had 443 case statements, but none of them used FourCharCodes as case constants.
They become a lot more prevalent in Mac OS X carbon events code and NIB, as you use them for all the commands and status displays and controls and such.
In the CarbonEvents.pas unit, we could write something like this:
const kEventParamMenuRef : FourCharCode = 'menu'; kEventParamMenuRefWord = Word32( kEventParamMenuRef);
Now, you can use "kEventParamMenuRefWord" in case statements.
Oops. This actually shouldn't work I think. (kEventParamMenuRef is not ordinal, so a value type cast doesn't apply, and a variable type cast shouldn't be usable for a constant value.) So I'll probably forbid it in the next version.
Sure. But I mean we'd have to make a decision first how to handle it. Should it be always big-endian (like Peter's macro), or use the actual target's endianness (so it gets the same memory layout as the character array)? With the macro, this decision is left to the programmer. (Though it might not actually matter if the same conversion is used for all values.)
I would say, the first character of the character array should always come first in memory, which implies that the target's endianness determines the word value.
In this case the macro needs to use the endianness conditionals.
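A sketch of what "using the endianness conditionals" could mean, written as a function for readability. FourCharToWordNative is a hypothetical name; the conditional __BYTES_BIG_ENDIAN__ is the one named elsewhere in this thread, and the code assumes a Cardinal of at least 32 bits and an argument of at least four characters. On either kind of target the resulting word would have the first character first in memory.

```pascal
program FourCharNative;

{ Sketch only. FourCharToWordNative is a hypothetical name. The
  conditional __BYTES_BIG_ENDIAN__ is the one named in this thread. }

function FourCharToWordNative (const s: String): Cardinal;
var
  i: Integer;
  w: Cardinal;
begin
  w := 0;
{$ifdef __BYTES_BIG_ENDIAN__}
  { Big-endian target: first character goes to the most significant byte. }
  for i := 1 to 4 do
{$else}
  { Little-endian target: first character goes to the least significant byte. }
  for i := 4 downto 1 do
{$endif}
    w := w * 256 + Ord (s[i]);
  FourCharToWordNative := w
end;

begin
  WriteLn (FourCharToWordNative ('menu'))
end.
```

The printed number therefore differs between big-endian and little-endian targets, while the byte sequence in memory stays 'm', 'e', 'n', 'u'.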
Frank
Frank Heckenbach wrote:
In the CarbonEvents.pas unit, we could write something like this:
const kEventParamMenuRef : FourCharCode = 'menu'; kEventParamMenuRefWord = Word32( kEventParamMenuRef);
Now, you can use "kEventParamMenuRefWord" in case statements.
Oops. This actually shouldn't work I think. (kEventParamMenuRef is not ordinal, so a value type cast doesn't apply, and a variable type cast shouldn't be usable for a constant value.) So I'll probably forbid it in the next version.
It doesn't make the task of working with FourCharCode types easier. What about a built-in function (instead of a macro) that converts a FourCharCode type into a 32-bit Cardinal, e.g.
Const kEventParamMenuRef = FourCharOrd( 'menu');
or
Const kEventParamMenuRef = Ord4( 'menu');
Regards,
Adriaan van Os
Peter N Lewis wrote:
Unfortunately, both aren't suitable for case constants, as in Peter's example. (Though I don't know if they're frequently used that way.)
To speak for myself, no. I looked into a typical Mac OS program with 300.000 lines of code. It had 443 case statements, but none of them used FourCharCodes as case constants.
They become a lot more prevalent in Mac OS X carbon events code and NIB, as you use them for all the commands and status displays and controls and such.
In the CarbonEvents.pas unit, we could write something like this:
const kEventParamMenuRef : FourCharCode = 'menu'; kEventParamMenuRefWord = Word32( kEventParamMenuRef);
Now, you can use "kEventParamMenuRefWord" in case statements.
The alternatives are:
(1) Use a macro, but ... whatever macro you come up with, GPC won't propagate it to other units. So, how to define it once? It must be available at the application level. Besides, macros shouldn't be considered a regular programming concept in Pascal programs.
(2) Allow the type cast:
const kEventParamMenuRef = Word32( 'menu');
Frank Heckenbach wrote:
It would be nice if GPC were to allow i:=word32( 'abcd') also (e.g. in --macpascal mode), but this is not a necessity.
This would have to make an implicit decision about endianness,
Can't the compiler just look at __BYTES_BIG_ENDIAN__ and __BYTES_LITTLE_ENDIAN__ ?
Sure. But I mean we'd have to make a decision first how to handle it. Should it be always big-endian (like Peter's macro), or use the actual target's endianness (so it gets the same memory layout as the character array)? With the macro, this decision is left to the programmer. (Though it might not actually matter if the same conversion is used for all values.)
I would say, the first character of the character array should always come first in memory, which implies that the target's endianness determines the word value.
Regards,
Adriaan van Os
At 12:31 PM +0200 25/6/03, Adriaan van Os wrote:
In fact, casting does work in GPC in nearly all cases. Consider the following test program:
const k1: fourcharcode = 'wxyz'; k3 = 'klmn';
i:= word32( k1); writeln( 'i = ', i, ' (''', k1, ''')'); fourcharcode( i):= k2; writeln( 'i = ', i, ' (''', k2, ''')');
// i:= word32( k3); not allowed by GPC
The problem with this is that k1 is not actually a constant - it's really a variable that you can't modify which is quite significantly different no matter how similar the declaration might be. At least that is my understanding.
At 12:05 PM -0700 25/6/03, Gale Paeper wrote:
{$define UInt32(s) ((Ord (substr((s),1,1)) shl 24) or (Ord (substr((s),2,1)) shl 16) or (Ord (substr((s),3,1)) shl 8) or Ord (substr((s),4,1)))}
At 12:34 AM -0700 26/6/03, Gale Paeper wrote:
ord(substr((s),1,1)[1])
This made me wonder if word32(substr(s,1,4)[1..4]) would work, but it does not either, the same as word32('abcd') doesn't work (as Frank says, there is an implicit endianness issue in that cast). Perhaps this could be allowed with a flag akin to the $X?
Speaking of $X, is there a compiler directive that allows pointer arithmetic without all the other $X stuff (whatever that might be)? Currently I use
{$no-typed-address,X+,no-ignore-function-results}
but I worry about what other things, like ignore-function-results, $X+ might be "giving" me.
Thanks, Peter.
Are there any compiler directives to control pointer arithmetic directly, or to turn on/off the cast-align warning "cast increases required alignment of target type"?
Currently I can use the $X to allow pointer arithmetic, but what else does that allow? Currently I use:
{$no-typed-address,X+,no-ignore-function-results}
but I worry about what other things, like ignore-function-results, $X+ might be "giving" me.
Also, I'd like to be able to disable the cast-align warning in specific places, but I couldn't find a compiler directive to control that - is there any way to feed them through to gcc?
Thanks, Peter.
Peter N Lewis wrote:
Are there any compiler directives to control pointer arithmetic directly, or to turn on/off the cast-align warning "cast increases required alignment of target type"?
Currently I can use the $X to allow pointer arithmetic, but what else does that allow? Currently I use:
{$no-typed-address,X+,no-ignore-function-results}
but I worry about what other things, like ignore-function-results, $X+ might be "giving" me.
In the next release, `--extended-syntax' will be equivalent to
--ignore-function-results --pointer-arithmetic --cstrings-as-strings -Wno-absolute
(which is what it did so far, now there will be individual options).
Also, I'd like to be able to disable the cast-align warning in specific places, but I couldn't find a compiler directive to control that -
Other than the global `{$W-}' (or working around it by extra pointers or so), no. (And I don't think there should be one since doing so is a real problem.)
is there any way to feed them through to gcc?
All options that the C frontend understands are passed to it when compiling C files from GPC.
Frank