Frank wrote:
As you probably know, GPC currently converts all identifiers to first letter upper-case/rest lower-case. Since this is often not nice, I plan to change this. This would affect at least the following things:
It is not clear to me what change you propose.
You probably mean (because this is easiest to implement) to (optionally?) drop the conversion altogether and consequently have case-sensitive identifiers. This is contrary to Pascal standards, but has some advantages in the C/Unix-dominated tool world.
Of course, I would NOT recommend
program Example1;
const MIXEDCASE = True;
var MixedCase: Integer; mixedCase: Real;
ALTERNATIVELY, it might be possible to convert all occurrences of the same identifier (modulo case) to the casing found on the defining occurrence. Thus
program Example2;
var MixedCase: Integer;
...
readln ( mixedCase ) ; writeln ( mixedcase )
would work properly and all external handling would be via the identifier MixedCase.
This also seems easy to implement (at the expense of a slightly more expensive compare for lookup) and more in line with Pascal:
When creating a symbol-table entry you store the casing as found. When looking up a name you compare names modulo case (e.g. convert both comparands to lower case), when you get a hit you use the actual casing found in the symbol table.
This makes Example1 illegal and makes Example2 work as expected.
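The lookup scheme described above can be sketched roughly as follows. This is only an illustration in Python, not GPC code; the names `SymbolTable`, `declare`, `lookup` and `DuplicateIdentifier` are made up for the example.

```python
class DuplicateIdentifier(Exception):
    """Raised when a name is declared twice, ignoring case."""


class SymbolTable:
    def __init__(self):
        # Each entry keeps the casing exactly as written at the declaration.
        self._entries = []  # list of (declared_name, info)

    def declare(self, name, info):
        if self.lookup(name) is not None:
            raise DuplicateIdentifier(name)
        self._entries.append((name, info))

    def lookup(self, name):
        # Compare modulo case: lower-case both comparands.
        wanted = name.lower()
        for declared, info in self._entries:
            if declared.lower() == wanted:
                # On a hit, report the casing stored in the table.
                return declared, info
        return None


table = SymbolTable()
table.declare("MixedCase", "Integer")

# Example2: every spelling resolves to the declared casing.
found = [table.lookup(n)[0] for n in ("mixedCase", "mixedcase")]

# Example1: a second declaration differing only in case is rejected.
try:
    table.declare("mixedCase", "Real")
    example1_accepted = True
except DuplicateIdentifier:
    example1_accepted = False
```

Here `found` holds the declared spelling for both uses, and the second declaration from Example1 raises an error instead of creating a separate variable.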
While I'm at it, it might be easy to add a warning if an identifier is used with varying case (something like `var Foo: Integer; [...] WriteLn (FOO)').
Such an option would be quite helpful, even if you do not make changes to the case handling.
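A check of this kind might look roughly like the sketch below. It is a Python illustration only; the function name `case_warnings` and the wording of the message are assumptions, not what GPC prints.

```python
def case_warnings(declared, uses):
    """Return warnings for uses whose casing differs from the declaration.

    declared: identifiers as spelled at their declaration.
    uses: (line_number, identifier) pairs in source order.
    """
    by_key = {name.lower(): name for name in declared}
    warnings = []
    for line, used in uses:
        expected = by_key.get(used.lower())
        if expected is not None and used != expected:
            warnings.append(
                f"line {line}: identifier `{used}' differs in case "
                f"from declaration `{expected}'"
            )
    return warnings


# `var Foo: Integer; [...] WriteLn (FOO)' from the example above.
messages = case_warnings(["Foo"], [(3, "FOO"), (4, "Foo")])
```

Only the use on line 3 is reported; the use that matches the declared casing passes silently.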
- Any ideas for the name of the option? (`-Widentifier-case'?)
Seems fine.
- Default? (I suppose off, though I think I myself would prefer on.)
I would prefer on, because we have come to know that case variations of the same identifier are confusing.
- Should it work across units/modules? Or should it be a tri-state
option (never/current file/global)?
Across units/modules would be best, but I can imagine that with legacy code it might be helpful to restrict the scope of the checking.
Tom
Tom Verhoeff wrote:
Frank wrote:
As you probably know, GPC currently converts all identifiers to first letter upper-case/rest lower-case. Since this is often not nice, I plan to change this. This would affect at least the following things:
It is not clear to me what change you propose.
Only what I wrote, i.e. (optional) warnings, error messages, file names (in certain situations).
You probably mean (because this is easiest to implement) to (optionally?) drop the conversion altogether and consequently have case-sensitive identifiers.
Of course not.
This also seems easy to implement (at the expense of a slightly more expensive compare for lookup) and more in line with Pascal:
When creating a symbol-table entry you store the casing as found. When looking up a name you compare names modulo case (e.g. convert both comparands to lower case), when you get a hit you use the actual casing found in the symbol table.
For the technical details: I'll probably store both a canonical casing (for lookups) and the given casing (for messages etc.). Since lookup is not done as a series of compares, but using a hash etc., it's important to have a canonical form there. (Part of this lookup is done in the backend, and would be hard to change for us, and it's also more efficient this way.)
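As a rough illustration of storing both forms, here is a Python sketch. It assumes lower case as the canonical form, which the thread does not specify, and the names `Identifier`, `Scope` and `canonicalize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    canonical: str  # hash key used for lookups
    given: str      # spelling at the declaration, used in messages


def canonicalize(name):
    # Assumption: lower case is the canonical form.
    return name.lower()


class Scope:
    def __init__(self):
        # A dict is a hash table, so lookup is one hash of the canonical
        # form rather than a series of case-insensitive compares.
        self._by_canonical = {}

    def declare(self, name):
        ident = Identifier(canonicalize(name), name)
        self._by_canonical[ident.canonical] = ident
        return ident

    def lookup(self, name):
        return self._by_canonical.get(canonicalize(name))


scope = Scope()
scope.declare("myFoo")
hit = scope.lookup("MYFOO")
message = f"`{hit.given}' declared here"
```

The lookup goes through the canonical key `myfoo`, while the message still shows `myFoo` as the programmer wrote it.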
Markus Gerwinski wrote:
- Any ideas for the name of the option? (`-Widentifier-case'?)
Sounds good to me.
(If "on" is the default, the option should IMO get another name. E.g. `-Wignore-identifier-case'.)
`-Wno-identifier-case' then -- that would be quite sure, since most options come in `[no-]' pairs.
- Should it work across units/modules? Or should it be a tri-state option (never/current file/global)?
Tri-state, I suppose. If you yourself care for identifier cases, but use units by someone who doesn't, it would be good to disable the warnings for those units.
Name for the 3rd option? `-Widentifier-case-local'?
One question: What convention will GPC use instead of the old one? Will the asmnames be verbatim as defined? (E.g. if I write "type myFoo", the asmname of that type will be 'myFoo'?)
I intentionally didn't mention asmnames, because I'm not going to change them now. They're related to qualified identifiers, routine overloading, etc., which is a bigger mess of changes (for which I don't have the time now).
Most of the following comments are related to them, so they're not current now, but I'll comment, anyway.
Pierre Muller wrote:
At 10:23 09/01/2003, Frank Heckenbach wrote:
As you probably know, GPC currently converts all identifiers to first letter upper-case/rest lower-case. Since this is often not nice, I plan to change this. This would affect at least the following things:
What is your new rule?
(A) -- all lowercase?
(B) -- using the case in the declaration? Here you will face a problem that we (Free Pascal developers) already faced with our 'cdecl' modifier. For a forward-declared function, which case should be used?
- the case used in the first declaration (with forward)
- or the case in the second, true declaration of the function.
(C) -- other??
My ideas are the following:
From the Pascal programmer's viewpoint, the default asmname is
"anything" (i.e., don't rely on it, and use an explicit `asmname' when you need it in C or so).
A simple `external' directive will convert the Pascal identifier to lower-case. This may be useful for some C functions, but generally I'd expect `external' to be used together with `asmname'. This is possible now, so I suggest writing such code this way already, so that it can remain unchanged later. (Additionally, the BP or Delphi syntax `external name 'foo'' can be supported then.)
The directives `c' and `c_language' are then obsolete and can be dropped.
-Debugging.... The first char up and the other down is a default combination that I added in the p-exp.y file of GDB sources to get better case-insensitiveness in GDB behavior. (Remember, I am the official pascal language maintainer for GDB).
Indeed, this may be a problem. BTW, what do you do in FPC with overloaded routines, same identifiers in different units etc.? Can they have their Pascal names encoded for gdb, or do you have to do "demangling" in gdb? If the latter is the case, we'll have to define some mangling rules in the future (maybe the same that FPC uses) -- but even then, I'd recommend that Pascal programmers not rely on them, and use them only for gdb.
-Interaction with C sources.... Most standard C sources use names with all chars downcase, so if you choose the same behavior you will have overlaps between C and GPC identifiers. This seems to be a very dangerous change, because some GPC specific functions might become the standard function after this.
I'm aware of this. AFAIK, this was the reason why the first-uppercase rule was introduced (long time ago). For now, this will remain (for asmnames), and in the future the mangling will have to be defined such as to avoid such conflicts.
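To illustrate why the first-uppercase rule keeps Pascal names apart from typical C names, here is a small Python sketch. It assumes the linker name is simply the normalized identifier, and the example names are made up.

```python
def old_gpc_casing(identifier):
    # First letter upper-case, rest lower-case, as described in the thread.
    return identifier[:1].upper() + identifier[1:].lower()


c_names = {"printf", "strlen", "write"}       # typical all-lower-case C symbols
pascal_names = ["Write", "STRLEN", "myProc"]  # hypothetical Pascal identifiers

normalized = [old_gpc_casing(n) for n in pascal_names]

# With the old rule no Pascal name coincides with a lower-case C name ...
clashes_old_rule = sorted(c_names & set(normalized))

# ... whereas converting everything to lower case would collide.
clashes_if_lowered = sorted(c_names & {n.lower() for n in pascal_names})
```

With these inputs the old rule produces no clashes, while an all-lower-case rule would collide on `strlen` and `write`.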
Another point is that, in Free Pascal, we use uppercased names for normal Pascal variables and functions and lower case names for internal functions (like for operator overloading), so we would get into trouble if we made the same change. But I don't know the GPC internals and cannot tell if this problem could also appear inside GPC.
BTW, currently overloaded operators get asmnames like `plus_Integer_Integer' (so they don't conflict with lower-case C functions or Pascal routines), but this can also be changed in the future together with asmnames of routines and variables ...
Wood David wrote:
However, I don't really support compiler changes which force legacy code to be changed, regardless of whether there are new warnings or not. I can live with the change and update the code, but it is an irritation to our small community of code developers.
I understand this. However, the asmname and related issues are one major incompatible change that must happen sometime. I really mean must -- different units/modules can have the same identifiers, which with the current rule get the same asmname and conflict at link time. Since both EP (modules) and BP (units) allow this, we'll have to change the convention. But again, not now ...
Given that most Pascal built-ins such as ReadLn and WriteLn (or writeln or WRITELN!) are redefinable identifiers, are all these variations going to be flagged up as warnings? I see a long road ahead to get back to squeaky-clean code!
I think I can arrange for predefined identifiers (and keywords) not to have an "enforced" spelling (though I'd like to ;-). I.e., it would be ok to always write `WriteLn', or always write `WRITELN' etc., but not mixed.
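One way to read this rule is that the first spelling seen becomes the reference for the rest of the compilation. The Python sketch below follows that reading, which is an assumption, as are the function name and the message text.

```python
def check_predefined_case(predefined, uses):
    """Warn when a predefined identifier is spelled inconsistently.

    predefined: set of lower-cased predefined names (no enforced spelling).
    uses: identifiers in source order.
    """
    first_spelling = {}
    warnings = []
    for used in uses:
        key = used.lower()
        if key not in predefined:
            continue
        # The first spelling seen becomes the reference for later uses.
        reference = first_spelling.setdefault(key, used)
        if used != reference:
            warnings.append(f"`{used}' was earlier written as `{reference}'")
    return warnings


predefined = {"writeln", "readln"}
consistent = check_predefined_case(predefined, ["WRITELN", "WRITELN", "readln"])
mixed = check_predefined_case(predefined, ["WriteLn", "writeln", "WriteLn"])
```

Always writing `WRITELN` gives no warning, whereas switching between `WriteLn` and `writeln` is reported once, at the deviating use.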
CBFalconer wrote:
I actually think that your existing convention (1st upper, rest lower) is an excellent way to resolve most things. I am quite sure I don't have a grasp of the conflicting systematic requirements involved. It might be helpful to enumerate them.
As for asmnames, I'm not quite sure yet myself. In particular, which characters can be used in assembler identifiers on all systems (which may be needed for the "mangling"). I think I tried `$' for a special identifier and it failed somewhere. I don't know if `.' or any other character will work everywhere. Otherwise (if only alphanum and `_' works), the mangling might get even more tricky ...
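If only alphanumerics and `_' turn out to be usable, one conceivable scheme is to prefix each part with its length. The Python sketch below is purely hypothetical: the `_m` prefix, the function name `mangle` and the encoding are not anything GPC or FPC is known to use.

```python
import re

SAFE = re.compile(r"[A-Za-z0-9_]+")  # characters assumed to work everywhere


def mangle(unit, identifier):
    # Hypothetical scheme: lower-case each part (Pascal ignores case) and
    # put its length in front, so parts cannot run together ambiguously.
    # Pascal identifiers cannot start with a digit, so the length is
    # always followed by a non-digit.
    parts = [unit.lower(), identifier.lower()]
    return "_m" + "".join(f"_{len(p)}{p}" for p in parts)


names = [mangle("UnitA", "Foo"), mangle("UnitB", "Foo")]
all_safe = all(SAFE.fullmatch(n) is not None for n in names)
distinct = len(set(names)) == len(names)
```

The same identifier `Foo` in two units gets two different linker names, and neither contains `$', `.' or other characters that might fail on some assemblers.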
Frank
I wrote:
- Default? (I suppose off, though I think I myself would prefer on.)
There have been some votes for both sides. So, would anyone object if off is the default, and `-Wall' implies on? (Those who want the other `-Wall' warnings except this one can use `-Wall -Wno-identifier-case' then.)
Frank
In article 200301112240.XAA13107@goedel.fjf.gnu.de, Frank Heckenbach frank@g-n-u.de writes
I wrote:
- Default? (I suppose off, though I think I myself would prefer on.)
There have been some votes for both sides. So, would anyone object if off is the default, and `-Wall' implies on? (Those who want the other `-Wall' warnings except this one can use `-Wall -Wno-identifier-case' then.)
Sounds fine to me.
Frank Heckenbach wrote:
There have been some votes for both sides. So, would anyone object if off is the default, and `-Wall' implies on? (Those who want the other `-Wall' warnings except this one can use `-Wall -Wno-identifier-case' then.)
So the full convention would be:
no options: No case checking.
-Widentifier-cases: global case checking.
-Widentifier-cases-local: case checking for compiled file(s) only.
-Wall: includes case checking.
-Wall -Wno-identifier-case: All options without case checking.
-Wall -Widentifier-cases-local: All options, but restrict case checking to compiled file(s) only.
Right?
Yours,
Markus
Markus Gerwinski wrote:
Frank Heckenbach wrote:
There have been some votes for both sides. So, would anyone object if off is the default, and `-Wall' implies on? (Those who want the other `-Wall' warnings except this one can use `-Wall -Wno-identifier-case' then.)
So the full convention would be:
no options: No case checking.
-Widentifier-cases: global case checking.
s/cases/case/ (or not?)
-Widentifier-cases-local: case checking for compiled file(s) only.
And without comparing the case to that used in an imported unit/module.
-Wall: includes case checking.
-Wall -Wno-identifier-case: All options without case checking.
-Wall -Widentifier-cases-local: All options, but restrict case checking to compiled file(s) only.
Right?
Yep.
Frank
Frank Heckenbach wrote:
-Widentifier-cases: global case checking.
s/cases/case/ (or not?)
Oops, yes, of course.
-Widentifier-cases-local: case checking for compiled file(s) only.
And without comparing the case to that used in an imported unit/module.
Okay.
Markus
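The convention agreed above, with the corrected spelling `-Widentifier-case', could be resolved along the lines of this Python sketch. It illustrates the proposal only and is not GPC's option parser; it assumes that later options override earlier ones and that `-Wall' selects the global check, and the function name `identifier_case_mode` is made up.

```python
def identifier_case_mode(args):
    """Return 'off', 'local' or 'global' for a list of command-line options."""
    mode = "off"  # default agreed in the thread: no case checking
    for arg in args:
        if arg == "-Wall":
            mode = "global"  # assumption: -Wall implies the global check
        elif arg == "-Widentifier-case":
            mode = "global"
        elif arg == "-Widentifier-case-local":
            # Current file(s) only, without comparing against the casing
            # used in imported units/modules.
            mode = "local"
        elif arg == "-Wno-identifier-case":
            mode = "off"
    return mode


table = {
    "(none)": identifier_case_mode([]),
    "-Wall": identifier_case_mode(["-Wall"]),
    "-Wall -Wno-identifier-case": identifier_case_mode(
        ["-Wall", "-Wno-identifier-case"]
    ),
    "-Wall -Widentifier-case-local": identifier_case_mode(
        ["-Wall", "-Widentifier-case-local"]
    ),
}
```

Under these assumptions `table` reproduces the list above: off by default, global with `-Wall', and off or local again when the more specific option follows.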