GPC supports a number of keywords that do not belong to all (or any ;-) known Pascal standards/dialects. To relieve the resulting problems, there are currently 3 mechanisms:
- If any dialect option (`--extended-pascal' etc.) is used, all keywords not belonging to this standard are deactivated completely.
- Individual keywords can be disabled using `--disable-keyword'.
- Some keywords are recognized only depending on context, so compiling a program that uses them as identifiers is still possible without any of the above options.
However, the 3rd approach only works in some situations (and can never work perfectly, since at some points there would simply be syntactic ambiguities between certain keywords and identifiers).
I think for some words it would be easier to recognize them as keywords only if they have no current meaning (as identifiers) in the program, rather than turning them on and off in certain syntactic places (which is quite error-prone and probably still has some subtle bugs). I'm not sure if this approach will work for all problematic keywords, but at least for some.
So this will still work:
program Foo;
var Foo: static Integer;
begin end.
Also this:
program Foo;
procedure Bar; var Static: Integer; begin end;
var Foo: static Integer;
begin end.
But not this (which works now):
program Foo;
var Static: Integer; Foo: static Integer;
begin end.
Do you think this would be a problem? It should not affect the ability to compile, say, EP code without `--extended-pascal' *as far as* that's possible now, since EP code can't use `static' as a directive (and it can't use as identifiers what are keywords in EP).
Frank
Frank Heckenbach wrote:
Do you think this would be a problem? It should not affect the ability to compile, say, EP code without `--extended-pascal' *as far as* that's possible now, since EP code can't use `static' as a directive (and it can't use as identifiers what are keywords in EP).
Okay with me. I usually stay within one specified mode with a project, and avoid using any keywords as identifiers.
Yours,
Markus
Frank Heckenbach wrote:
Do you think this would be a problem? It should not affect the ability to compile, say, EP code without `--extended-pascal' *as far as* that's possible now, since EP code can't use `static' as a directive (and it can't use as identifiers what are keywords in EP).
Not for me.
Regards,
Adriaan van Os
Adriaan van Os wrote:
Frank Heckenbach wrote:
Do you think this would be a problem? It should not affect the ability to compile, say, EP code without `--extended-pascal' *as far as* that's possible now, since EP code can't use `static' as a directive (and it can't use as identifiers what are keywords in EP).
Not for me.
I meant to say "Not a problem for me", of course.
Regards,
Adriaan van Os
At 2:36 PM +0100 25/2/03, Frank Heckenbach wrote:
GPC supports a number of keywords that do not belong to all (or any ;-) known Pascal standards/dialects. To relieve the resulting problems, there are currently 3 mechanisms:
If any dialect option (`--extended-pascal' etc.) is used, all keywords not belonging to this standard are deactivated completely.
Individual keywords can be disabled using `--disable-keyword'.
Some keywords are recognized only depending on context, so compiling a program that uses them as identifiers is still possible without any of the above options.
However, the 3rd approach only works in some situations (and can never work perfectly, since at some points there would simply be syntactic ambiguities between certain keywords and identifiers).
While it (probably!) won't affect me, I prefer that people use the second to make an explicit statement about what they are trying to achieve, although it'd be nice to have it as a compiler directive so that it can be placed within the source, so as to make the source self-documenting in this respect, e.g.
{$pascal-dialect extended-pascal}
{$disable-keyword operator}
While others might argue that this is not portable, at least it reminds you what you intended! (The disadvantage is that it'd need to be in every source file and you might get issues if the dialect started differing in the different source files!)
I think the whole idea of interpreting the meaning of an identifier in context is a shade dodgy (see my other post today mentioning simple rules for determining user identifier and predefined keyword conflicts). Although clunky in many ways, I thought one of the original things in Pascal was that keywords just were keywords, no matter where they were found. Magically switching off a keyword in a given context seems confusing. This is why I suggested the use of quotes in the external case (to make the keywords explicitly reduced to the scope of the quote).
That said, there must be pragmatic cases where it'd be nice to avoid conflicts in legacy code (like mine! :-) ). Perhaps this sort of thing should by default _not_ take place unless the user explicitly asks for it to happen to specific identifiers, e.g.
{$allow-user-identifier static}
(Note this doesn't disable the keyword)
and/or
{$context-dependent static}
for the really sticky cases only (and only if anyone ever implements it!).
I think for some words it would be easier to recognize them as keywords only if they have no current meaning (as identifiers) in the program, rather than turning them on and off in certain syntactic places (which is quite error-prone and probably still has some subtle bugs). I'm not sure if this approach will work for all problematic keywords, but at least for some.
So this will still work:
program Foo;
var Foo: static Integer;
begin end.
Also this:
program Foo;
procedure Bar; var Static: Integer; begin end;
var Foo: static Integer;
begin end.
But not this (which works now):
program Foo;
var Static: Integer; Foo: static Integer;
begin end.
Do you think this would be a problem? It should not affect the ability to compile, say, EP code without `--extended-pascal' *as far as* that's possible now, since EP code can't use `static' as a directive (and it can't use as identifiers what are keywords in EP).
Frank
--
Frank Heckenbach, frank@g-n-u.de, http://fjf.gnu.de/, 7977168E
GPC To-Do list, latest features, fixed bugs: http://www.gnu-pascal.de/todo.html
GPC download signing key: 51FF C1F0 1A77 C6C2 4482 4DDC 117A 9773 7F88 1707
Grant Jacobs wrote:
While it (probably!) won't affect me, I prefer that people use the second to make an explicit statement about what they are trying to achieve, although it'd be nice to have it as a compiler directive so that it can be placed within the source, so as to make the source self-documenting in this respect, e.g.
{$pascal-dialect extended-pascal}
{$extended-pascal}
{$disable-keyword operator}
This does already work (though vice versa, since once EP mode is on, compiler directives are forbidden -- well, warned only, actually).
(And this particular combination is redundant, since `operator' is no EP keyword (only PXSC), so it wouldn't be active in EP mode, anyway.)
While others might argue that this is not portable, at least it reminds you what you intended!
Sure. If you want to be portable to, say, EP or BP, just use this dialect option (globally) and all keywords are set automatically.
The whole issue mainly comes up when using no dialect option.
(The disadvantage is that it'd need to be in every source file and you might get issues if the dialect started differing in the different source files!)
This should be no problem. Even turning keywords on/off or switching dialects within one source file should work (though it isn't recommended at all).
I think the whole idea of interpreting the meaning of an identifier in context is a shade dodgy (see my other post today mentioning simple rules for determining user identifier and predefined keyword conflicts). Although clunky in many ways, I thought one of the original things in Pascal was that keywords just were keywords, no matter where they were found.
Well, I guess this ended already with the introduction of EP:
: ISO/IEC 10206:1990(E) Annex B (Informative)
:
: Incompatibilities with Pascal standards
:
: Programs that conform to the existing Pascal standards ISO 7185,
: BS 6192, and ANSI/IEEE770X3.97-1983 may need to have some
: identifiers changed in them because of the addition of new
: word-symbols in Extended Pascal. The new word-symbols that have
: been added to Extended Pascal are:
:
: and_then only protected bindable or_else qualified export
: otherwise restricted import pow value module
The most important one is surely `value' which is often used as an identifier in many CP (and also BP etc.) programs.
Talking of the default GPC mode, it must be EP compatible, so `value' must be a keyword. If we make it so unconditionally, it will break much existing CP and BP code, requiring it to be changed or compiled with dialect options (which we generally don't recommend, since many useful extensions aren't available then).
So, making them conditional keywords helps a lot there. As you said, making this dependent on the context is problematic. That's why I suggested the new approach: It is an identifier if a declaration of this name exists, and a keyword otherwise. -- Well, almost: Of course, it must also be an identifier in those places where new declarations are made (otherwise it would be impossible to make such a declaration). So it is still context dependent, but much less so than before. As it happens, support for this already exists in GPC's parser, since it's the same way that some special predefined identifiers are handled, e.g. `WriteLn' with its special syntax, which loses its special properties when it is redefined, from the point of the redefinition on, i.e. just the same as I proposed for the keywords.
This approach will fail with keywords that would conflict in those places where new identifiers can be defined. I'll have to check this, but I hope it won't be too many ...
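The rule described above can be sketched roughly as follows. This is only an illustration in Python, not GPC's parser code: the `Scopes` class, the `classify` function and the contents of `CONDITIONAL_KEYWORDS` are all made up for this example, and ordinary reserved words are ignored entirely.

```python
# Hypothetical sketch of "keyword unless a declaration of that name is visible".
# None of these names come from GPC itself.

CONDITIONAL_KEYWORDS = {"static", "value", "attribute"}  # assumed sample set


class Scopes:
    """Stack of declaration scopes, innermost last."""

    def __init__(self):
        self.stack = [set()]

    def enter(self):
        self.stack.append(set())

    def leave(self):
        self.stack.pop()

    def declare(self, name):
        # Pascal identifiers are case-insensitive, so store them lowercased.
        self.stack[-1].add(name.lower())

    def is_declared(self, name):
        return any(name.lower() in scope for scope in self.stack)


def classify(word, scopes, declaring=False):
    """Return "keyword" or "identifier" for a conditional keyword.

    declaring=True models the places where a new declaration is being
    made; there the word must always be accepted as an identifier.
    """
    if word.lower() not in CONDITIONAL_KEYWORDS:
        return "identifier"  # real reserved words are out of scope here
    if declaring or scopes.is_declared(word):
        return "identifier"
    return "keyword"


# Second example from the first mail: `Static' is local to Bar only.
scopes = Scopes()
scopes.enter()
scopes.declare("Static")
scopes.leave()
print(classify("static", scopes))  # keyword

# Third example: `Static' is declared at program level.
scopes.declare("Static")
print(classify("static", scopes))  # identifier
```

The `declaring` flag is the remaining bit of context dependence: without it, the very first `var Static: Integer;' could never be written.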
BTW, as I said the other day, I'm quite critical of the addition of new keywords in general -- e.g., instead of `value' for type initialization it seems to be possible to use `:=' (an existing symbol, and no word at all) like VAX Pascal apparently does. In other cases, a combination of existing keywords can do nicely (such as EP does in the form of `to begin do' and `to end do' -- instead of introducing new keywords there which Delphi does; OTOH, EP introduces `and_then' and `or_else' instead of using `and then' and `or else', which GPC also allows, and which don't cause any conflicts). So I'd like the ideal original world where "keywords just were keywords", but unfortunately that's out of our control if we want to be compatible ...
That said, there must be pragmatic cases where it'd be nice to avoid conflicts in legacy code (like mine! :-) ). Perhaps this sort of thing should by default _not_ take place unless the user explicitly asks for it to happen to specific identifiers, e.g.
{$allow-user-identifier static}
(Note this doesn't disable the keyword)
and/or
{$context-dependent static}
for the really sticky cases only (and only if anyone ever implements it!).
I don't see how it would improve anything -- one still would need a special option to compile, say CP/BP code which uses `Value' as an identifier.
Frank
I wrote:
: GPC supports a number of keywords that do not belong to all (or any
: ;-) known Pascal standards/dialects. To relieve the resulting
: problems, there are currently 3 mechanisms:
:
: - If any dialect option (`--extended-pascal' etc.) is used, all
:   keywords not belonging to this standard are deactivated
:   completely.
:
: - Individual keywords can be disabled using `--disable-keyword'.
:
: - Some keywords are recognized only depending on context, so
:   compiling a program that uses them as identifiers is still
:   possible without any of the above options.
:
: However, the 3rd approach only works in some situations (and can
: never work perfectly, since at some points there would simply be
: syntactic ambiguities between certain keywords and identifiers).
:
: I think for some words it would be easier to recognize them as
: keywords only if they have no current meaning (as identifiers) in
: the program, rather than turning them on and off in certain
: syntactic places (which is quite error-prone and probably still has
: some subtle bugs). I'm not sure if this approach will work for all
: problematic keywords, but at least for some.
:
: So this will still work:
[Examples changed since my original mail due to the new `attribute' syntax, but that's beside the point of this mail.]
: program Foo;
:
: var
:   Foo: Integer; attribute (static);
:
: begin
: end.
:
: Also this:
:
: program Foo;
:
: procedure Bar;
: var attribute: Integer;
: begin
: end;
:
: var
:   Foo: Integer; attribute (static);
:
: begin
: end.
:
: But not this:
:
: program Foo;
:
: var
:   attribute: Integer;
:   Foo: Integer; attribute (static);
:
: begin
: end.
:
: It should not affect the ability to compile, say, EP code without
: `--extended-pascal' *as far as* that was possible before, since EP
: code can't use `attribute' directives (and it can't use as
: identifiers what are keywords in EP).
I've done this now and analyzed all problematic keywords. Fortunately, most of them could be completely resolved this way. In addition, some of them can even be used as "keywords" (directives) and identifiers in parallel, in particular `forward' (which the Pascal standards require) and `near' and `far' (which BP seems to do, though its documentation says otherwise).
There are only two exceptions:
- `Operator' can't be used as a type, untyped constant or exported interface (unless it's disabled as a keyword explicitly or by dialect options). This is because of the following conflict:
type Foo = record end;
Operator = (a, b); { enum type }
vs.
type Foo = record end;
operator = (a, b: Foo) c: Foo;
This is not a complete ambiguity, but requires 6 tokens look-ahead to decide whether `operator' is a keyword. That's way too much (IMHO), so since the operator `=' should be definable, we have to make the restriction as stated.
- The following keywords can't be used immediately after an `import' part: uses, implementation, operator, constructor, destructor. This is because of conflicts such as the following:
import Foo;
Uses only (a); { import only `a' from `Uses' }
vs.
import Foo;
uses Only (a); { import `a' from `Only' }
All of them are mixes of different standards (`import' is EP, the other keywords are BP, OP and PXSC -- EP's meaning of `implementation' isn't affected since it can't occur there, anyway).
So the decision was between disallowing the keywords there, or forbidding those identifiers as module names in `import'. I think the former restriction is less severe -- it can always be resolved by putting some other declaration in between (or by using `uses' instead of `import').
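To make the two conflicts above concrete, here is a small Python sketch (again not GPC code; the toy tokenizer and the function `lookahead_needed` are invented for illustration). It tokenizes both readings of each example, ignoring case and `{ ... }' comments, and reports how many tokens after the questionable word are needed before the readings differ.

```python
import re

# Toy tokenizer: identifiers plus the few symbols used in the examples.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[=();,:.]")


def tokens(src):
    src = re.sub(r"\{.*?\}", " ", src)  # drop { ... } comments
    return [t.lower() for t in TOKEN.findall(src)]  # Pascal ignores case


def lookahead_needed(src_a, src_b, word):
    """Number of tokens after `word` at which the two sources first differ.

    Returns None if no difference is found, i.e. the token streams give
    no way to tell the two readings apart.
    """
    a, b = tokens(src_a), tokens(src_b)
    tail_a = a[a.index(word) + 1:]
    tail_b = b[b.index(word) + 1:]
    for i, (x, y) in enumerate(zip(tail_a, tail_b), start=1):
        if x != y:
            return i
    return None


enum_type = "type Foo = record end; Operator = (a, b); { enum type }"
operator_def = "type Foo = record end; operator = (a, b: Foo) c: Foo;"
print(lookahead_needed(enum_type, operator_def, "operator"))  # 6

import_only = "import Foo; Uses only (a); { import only `a' from `Uses' }"
uses_clause = "import Foo; uses Only (a); { import `a' from `Only' }"
print(lookahead_needed(import_only, uses_clause, "uses"))  # None
```

In the `operator' case the readings diverge at the sixth token (`)' vs. `:'), while in the `import'/`uses' case the token streams are identical, so no amount of look-ahead would help there.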
Frank
At 10:29 AM +0100 16/3/03, Frank Heckenbach wrote:
I've done this now and analyzed all problematic keywords. Fortunately, most of them could be completely resolved this way. In addition, some of them can even be used as "keywords" (directives) and identifiers in parallel, in particular `forward' (which the Pascal standards require) and `near' and `far' (which BP seems to do, though its documentation says otherwise).
There are only two exceptions:
`Operator' can't be used as a type, untyped constant or exported interface (unless it's disabled as a keyword explicitly or by dialect options). This is because of the following conflict:
type Foo = record end;
Operator = (a, b); { enum type }
vs.
type Foo = record end;
operator = (a, b: Foo) c: Foo;
This is not a complete ambiguity, but requires 6 tokens look-ahead to decide whether `operator' is a keyword. That's way too much (IMHO), so since the operator `=' should be definable, we have to make the restriction as stated.
Just out of interest: the use of the word 'operator' in the code I was porting was within a type definition, e.g.
half_exprs = record
  negate : boolean;
  operator : operators;
  case value_type : value_types of
    int_value : ( int : integer );
    string_value : ( str : strings )
end;
Is it easy to reduce the conflict to just when operator is defined as a "plain" type (as in your example) rather than within a record?
A possible further reduction, although a bit ugly, would be that if the user wants to use "operator = ..." as a type, they must place it as the first type definition, e.g.
type Operator = (a, b); Foo = record end;
These two would cover many, but not all cases.
As an aside, it makes me think that the const, type and var section probably could have been designed to have an "end" keyword to avoid this sort of thing (when Pascal was first designed, that is); this isn't a suggestion, just an idle thought. Something like:
const-begin <const declarations> const-end
type-begin <type declarations> type-end
This way later development of new things like operator = ... wouldn't impact on the declaration sections, as they'd have one unique end-point "for all time".
But that's off topic a bit...
Grant
Grant Jacobs wrote:
... snip ...
As an aside, it makes me think that the const, type and var section probably could have been designed to have an "end" keyword to avoid this sort of thing (when Pascal was first designed, that is); this isn't a suggestion, just an idle thought. Something like:
const-begin <const declarations> const-end
type-begin <type declarations> type-end
This way later development of new things like operator = ... wouldn't impact on the declaration sections, as they'd have one unique end-point "for all time".
They do, at least in standard Pascal. Parsing is something like:
WHILE NOT (nextsym IN [typesy, varsy, procsy, funcsy, beginsy]) DO BEGIN
  (* parse a constant *)
END;
and an extension, to avoid the order dependence of ISO 7185, is to add 'constsy' to the above set of terminators.
At 6:47 PM -0500 16/3/03, CBFalconer wrote:
Grant Jacobs wrote:
... snip ...
As an aside, it makes me think that the const, type and var section probably could have been designed to have an "end" keyword to avoid this sort of thing (when Pascal was first designed, that is); this isn't a suggestion, just an idle thought. Something like:
const-begin <const declarations> const-end
type-begin <type declarations> type-end
This way later development of new things like operator = ... wouldn't impact on the declaration sections, as they'd have one unique end-point "for all time".
They do, at least in standard Pascal. Parsing is something like:
WHILE NOT (nextsym IN [typesy, varsy, procsy, funcsy, beginsy]) DO BEGIN
  (* parse a constant *)
END;
and an extension, to avoid the order dependence of ISO 7185, is to add 'constsy' to the above set of terminators.
What the operator = ... ambiguity appears to be doing (to me!) is adding to the list in the set above. If you add an end symbol, the list becomes just one item: the end symbol and nothing else. You don't have to know how other blocks/sections start, just how your current block/section ends, e.g.
while not ( nextsym = blockendsym ) do
begin
  (* parse elements of that type of block *)
end ;
This would make the const declaration section independent of whatever else evolves in other sections. So if you were to add other things like operator = ..., there would be no need to revise the parsing of the preceding section. (Basically, the end-set for a section becomes a single fixed item, rather than a list of start-sets for whatever sections could follow.)
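A toy comparison of the two loop guards, as a rough Python sketch. Everything here is hypothetical: each declaration is pre-chunked into a single string token, `const-end` is the imagined terminator from the idle thought above, and `parse_section` is not taken from any real compiler.

```python
# Hypothetical sketch: follow-set termination vs. a dedicated end symbol.

SECTION_STARTS = {"type", "var", "procedure", "function", "begin"}


def parse_section(tokens, stop):
    """Collect items until a token in `stop` is seen; return (items, rest)."""
    for i, tok in enumerate(tokens):
        if tok in stop:
            return tokens[:i], tokens[i:]
    return tokens, []


decls = ["A = 1;", "B = 2;"]
later = ["operator", "= (a, b: Foo) c: Foo;", "begin", "end."]

# Follow-set style: the const parser must know everything that may follow.
# With the original set it runs on into the operator definition.
items, _ = parse_section(decls + later, SECTION_STARTS)
print(len(items))  # 4

# Extending the language means extending the guard of the earlier section.
items, _ = parse_section(decls + later, SECTION_STARTS | {"operator"})
print(len(items))  # 2

# End-symbol style: the guard stays the same whatever comes afterwards.
items, rest = parse_section(decls + ["const-end"] + later, {"const-end"})
print(len(items), rest[0])  # 2 const-end
```

The second call is the point of the argument: adding `operator' to the language forces a change in the terminator set of the preceding section, whereas the end-symbol variant needs none.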
(I hope I understand the 'operator = ...' construct correctly, as I've never used it. In Frank's example it appears to occur in the function/procedure declaration section and thus would extend the list in the loop guard above.)
All of this is sort-of off topic... but fun :-) Excuse my digression...
Grant
Grant Jacobs wrote:
At 10:29 AM +0100 16/3/03, Frank Heckenbach wrote:
I've done this now and analyzed all problematic keywords. Fortunately, most of them could be completely resolved this way. In addition, some of them can even be used as "keywords" (directives) and identifiers in parallel, in particular `forward' (which the Pascal standards require) and `near' and `far' (which BP seems to do, though its documentation says otherwise).
There are only two exceptions:
`Operator' can't be used as a type, untyped constant or exported interface (unless it's disabled as a keyword explicitly or by dialect options). This is because of the following conflict:
type Foo = record end;
Operator = (a, b); { enum type }
vs.
type Foo = record end;
operator = (a, b: Foo) c: Foo;
This is not a complete ambiguity, but requires 6 tokens look-ahead to decide whether `operator' is a keyword. That's way too much (IMHO), so since the operator `=' should be definable, we have to make the restriction as stated.
Just out of interest: the use of the word 'operator' in the code I was porting was within a type definition, e.g.
half_exprs = record
  negate : boolean;
  operator : operators;
  case value_type : value_types of
    int_value : ( int : integer );
    string_value : ( str : strings )
end;
Is it easy to reduce the conflict to just when operator is defined as a "plain" type (as in your example) rather than within a record?
It would not be easy because it would make the lexing dependent on syntactic context again (which I just tried to get rid of as much as possible because it made the grammar quite fragile).
However, that's not the problematic case at all. Only `operator' as a "top-level" type is problematic, i.e. when it's followed by `='. Since there is no `:' operator that could be overloaded, this record example will just work.
Actually, there's one case where "top-level" types are not followed by `=', viz. schemata. This could be a conflict with keywords that can be followed by `(', in particular `attribute':
type Foo = Integer; attribute (aligned);
vs.
type Foo = Integer; Attribute (Aligned: Integer) = Integer;
This is no conflict only because we don't allow type attributes at all currently. But we might want to -- GCC does it, and it was suggested as a solution for the `Integer (16)' problem. When we do this, this conflict will arise.
One solution would be to omit the `;' before attribute:
type Foo = Integer attribute (aligned);
This seems to be conflict-free, and might even work in variable declarations etc.:
var Foo: Integer attribute (aligned); attribute (static);
where the first attribute would be of the type, and the second one of the variable.
A possible further reduction, although a bit ugly, would be that if the user wants to use "operator = ..." as a type, they must place it as the first type definition, e.g.
type Operator = (a, b); Foo = record end;
These two would cover many, but not all cases.
This would be possible, but quite a bit more difficult -- either making the lexing context-dependent (see above) or using a difficult grammar rule (that includes the `operator' token) for the first type in a block.
As an aside, it makes me think that the const, type and var section probably could have been designed to have an "end" keyword to avoid this sort of thing (when Pascal was first designed, that is); this isn't a suggestion, just an idle thought.
Something like:
const-begin <const declarations> const-end
type-begin <type declarations> type-end
This way later development of new things like operator = ... wouldn't impact on the declaration sections, as they'd have one unique end-point "for all time".
I agree. Or use something other than `;' to separate the items so the next `;' (outside of records etc.) means the end. That's a case where I prefer the `uses' syntax (with `,') over EP's `import' (with `;' in between).
But given that this can't be changed, maybe the designers of the `operator' syntax should have thought of it. (Maybe `function operator'?) But that's too late now as well ...
CBFalconer wrote:
They do, at least in standard Pascal. Parsing is something like:
WHILE NOT (nextsym IN [typesy, varsy, procsy, funcsy, beginsy]) DO BEGIN
  (* parse a constant *)
END;
But that's exactly the difference. They don't have their end-points defined by themselves, but only by the context. While this doesn't create a conflict by itself (i.e., within the standard Pascals), it does make the syntax more fragile WRT extensions.
And it does not necessarily only apply to non-standard extensions; the OOE (which is a standard draft after all) might have the mentioned problem with `import' and `constructor'/`destructor' (I'd have to check in detail). I.e., you either have to do more than one token look-ahead or make `constructor' and `destructor' strictly reserved words which breaks valid EP programs that use them as identifiers.
Grant Jacobs wrote:
(I hope I understand the 'operator = ...' construct correctly, as I've never used it. In Frank's example it appears to occur in the function/procedure declaration section and thus would extend the list in the loop guard above.)
Yes, that's the problem.
All of this is sort-of off topic... but fun :-) Excuse my digression...
Yeah, it's fun ... until you actually try to implement it. ;-)
Frank