Frank Heckenbach wrote:
It may be natural for some. I actually prefer it the way it is, i.e. I prefer to use the same spelling (capitalization) throughout. So if you or someone else implements the other way (which is indeed more work), it should at least be optional.
What you prefer is not representative of what others prefer. Testing for case matching is already optional, given that there is a flag.
BTW, calling it a bug is a bit strange, anyway. In Pascal, even occurrences of an identifier referring to the same declaration can have different capitalizations; the same applies to different declarations (e.g., local vs. global) as well. Therefore, all this is an optional warning, not an error.
It is not a Pascal bug in the sense that the program will still compile correctly since Pascal accepts any capitalization. However, it is a bug in GPC in the sense that the compiler treats the variables inconsistently. FUnCtiOnaLLy iT dOEs nOt MATteR, but for consistency it does.
Frank Heckenbach wrote:
Waldek Hebisch wrote:
I believe that capitalization is meaningful for people.
I don't really. In natural language, capitalization is used as a help in reading (used differently in different languages), but it does not carry meaning itself (e.g., a text written in all-caps still has the same meaning).
It carries meaning to PEOPLE but not to the compiler (for the purpose of creating a binary). Especially in German it means that the word is either the beginning of a sentence or a noun, right? Are there cases in German - I bet you could find or make some! - where the meaning of the sentence changes dramatically if you were to change the case of a noun?
This was discussed earlier, here's an example
http://www.gnu-pascal.de/crystal/gpc/en/mail7664.html
where in 2003 (!) Russell Whitaker wrote:
Irregardless of either Case Checking or Warning messages, I would like to see the compiler output identifiers exactly as they appear in the offending source line. As it is now the compiler outputs 'Gothere' when the source could have been either 'gotHere' or 'goThere'. (bad pun intended).
Got Here is quite different from Go There!
So the capitalization within a single identifier name CAN carry meaning. A person wanting to use both of these would be FLABerGASted that they could not. (Sorry about that, it just apPEARed and now you've got me going!).
- Short identifiers such as `x' vs. `X'. These names are not actually meaningful anyway.
No. X could represent Xavier, Y could be Yanni.
Tom Schneider wrote:
http://www.lecb.ncifcrf.gov/~toms/delila/module.html
ftp://ftp.ncifcrf.gov/pub/delila/moddef
A point I forgot to mention is that, of course, using the module program I transfer chunks of text between programs. The current rules force me to have a single way of capitalization.
FOR EXAMPLE. I am a molecular biologist, so the letters A, C, G and T have great significance. In some programs I might want to capitalize these to fit the convention in molecular biology. But in another piece of the code I may decide that 'c' is to be the name of a character read in. These would be distinct uses with different scopes. But the GPC compiler demands that I use the same rule for both.
Essentially that means that I should forget about capitalization entirely or have my arm twisted every once in a while. Irrelevant warnings that I can't get rid of make it MUCH more difficult to locate the important warnings!
The difficulty came up when I was preparing my 200-some programs for other people. When they compile the code they will trip over EVERY unnecessary warning. They will write me email every time, wondering what is wrong. I just got some! I would have to tell them not to worry about that. But in case there is a REAL problem, I have to check every time ...
Worse, if one of my students decides to write code, I have to worry about what they have done in modules that I'm NOT working on. Suppose I ask them to write a procedure for me and they pick a "non-standard" capitalization that doesn't match something I'm doing? Or suppose I want to use a sorting routine from someone else? I should be able to plug their procedure or function into my program and have it run without warnings.
Saying that I should use external files is NOT A SOLUTION because there is no standard that all compilers use for reading such files and if I used that approach, I would have to keep modifying code to switch compilers. I (and you now) already have the system-independent module transfer mechanism so there is no reason to go back to system-dependency. (GPC is essentially a system dependency since there are other compilers that exist. But, by the way, I'm going for using GPC as my main compiler now thanks to all of you folks and because the Sun Pascal compiler has been dropped.)
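For illustration only (this is not from the original posts): a minimal sketch of the A, C, G, T situation described above. The names `Bases` and `IsBase` are hypothetical. The global `C` and the parameter `c` are separate declarations in separate scopes, so the program is valid Pascal; according to this thread, it is nevertheless the kind of code that draws the identifier-case warning when that check is enabled.

```pascal
program Bases;

var
  { Counts of the four bases, capitalized as in molecular biology }
  A, C, G, T: Integer;

{ Return True if c is one of the four base letters.
  The lowercase parameter c is a different declaration from the
  global C and hides it only inside this function. }
function IsBase(c: Char): Boolean;
begin
  IsBase := c in ['a', 'c', 'g', 't']
end;

begin
  A := 0;
  C := 0;
  G := 0;
  T := 0;
  { Here, outside IsBase, C is the global counter again }
  if IsBase('c') then
    C := C + 1;
  WriteLn('C = ', C:1)
end.
```

As stated elsewhere in the thread, the check is off by default, is turned on by `-Wall`, and can be switched off with `-Wno-identifier-case`.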
Tom Verhoeff wrote:
I recently recompiled a project containing some seven thousand lines of Pascal code distributed over some twenty units. A couple of years ago when it was last compiled, gpc didn't have the identifier-case checking feature. During the recent recompile, I was overwhelmed with casing complaints. Some of them were legitimate: same variable referred to by different casings. And I changed the source accordingly. But many warnings were completely unexpected for me, and unwarranted.
This is similar to what happened to me. I just upgraded from the 1999 version of GPC to 2004 and suddenly had hundreds of WARnings. (Using capitals in varIouS odd ways in a sentence could be an art form to create double MEANings.)
Waldek Hebisch wrote:
- Many tools are easier to use as case sensitive ones (simple example is global regex search&replace)
Yes. This is why the GPC convention of capitalizing the first letter of an identifier when giving errors or warnings is painful. One can't use the capitalization. Indeed, the error/warning should always be using the current capitalization string! (This is a separate issue from the scope problem, but it also trips me up repeatedly since I like to use cut and paste to do searches in vi/vim.)
The compiler already keeps track of scope for each variable. So keeping one copy of an identifier just (apparently) reveals
Again: The warning is about *identifiers* (including keywords actually), not about *declarations* (and it doesn't claim otherwise).
Fine. Any constant, type, variable, procedure, function name ...
Frank Heckenbach wrote:
that the GPC code is not efficient.
Not really. This way, the capitalization can be checked right after a word (identifier or keyword) is recognized. When doing it after distinguishing between keywords and various kinds of declarations, there would be more places to add the checks.
You of course know the compiler guts and I don't. But what you are saying implies that the mechanisms for capitalization checking and scope checking are separated. In that sense the compiler appears not to be efficient.
Frank Heckenbach wrote:
In other words, the warning code could (should!) use the code already used for tracking the scope.
As I said, I prefer to check the capitalization of identifiers. If you prefer to check the capitalization of declarations, go ahead and implement it (optionally). It's free software. But please contact Waldek before about the details to avoid conflicts with his qualified identifier changes.
It's a matter of doing it right. I am not the best person to do this as I have not written the code in the first place.
Frank Heckenbach wrote:
Because I don't see a "case sensitive interpretation" of Pascal code.
There may be other Pascal compilers which are case-sensitive, which is obviously non-standard. Our general attitude to non-standard compilers is to be at best backwards-compatible, which this warning surely makes it.
If there were (theoretically) a compiler that understood case with scope correctly, then this current design would not be compatible.
Frank Heckenbach wrote:
I recently recompiled a project containing some seven thousand lines of Pascal code distributed over some twenty units. A couple of years ago when it was last compiled, gpc didn't have the identifier-case checking feature. During the recent recompile, I was overwhelmed with casing complaints. Some of them were legitimate: same variable referred to by different casings.
Again, Pascal isn't concerned with the capitalization of identifiers referring to the same declaration any more than it is with that of identifiers referring to different declarations. If you consider one thing more "legitimate" than the other, that's your opinion, but please don't state it as a fact.
No, he means that some of the warnings were appropriate and some were not, according to the scope of the variable. The inappropriate ones get in the way of finding the legitimate ones.
Frank Heckenbach wrote:
Well, I was about to write "these comparisons go too far" before I read this paragraph, but now it's really too much! Emotions in a Pascal program, eh? ;-)
EMOTIONS ARE USED IN TEXT! (SorryForYelling.) Haven't you ever embedded something like that into your code?
Frank Heckenbach wrote:
Let's stick to Pascal based arguments, perhaps?
This is NOT a "Pascal" issue. It is a GPC issue. I think that we all agree that in the language Pascal itself, case is not relevant. The issue here is about programmers and what happens when they attempt to use capitalization to help them understand AND DOCUMENT the code better. If the capitalization does not follow the scope rules of the identifiers, then the programmer will OFTEN have trouble.
Frank Heckenbach wrote:
You forgot:
- NamesConsistingOfMultipleWords (most frequent in my experience)
True. But using different concatenations resulting in the same identifier (such as `FooBar' and `FoobAr' -- I know there are real examples of this, I just don't have them in mind right now) is
There's the case I mentioned above:
http://www.gnu-pascal.de/crystal/gpc/en/mail7664.html
Frank Heckenbach wrote:
asking for trouble in a case-insensitive language. In C, you can do this and get away with it. In Pascal, sooner or later, you might use those names in the same scope and get a conflict.
No. If they are within the same scope, then there SHOULD be a conflict.
Frank Heckenbach wrote:
So even here, I prefer the current behaviour which warns me as soon as possible and lets me choose another name for one of the things.
No, that forces one to have a uniform capitalization scheme across the program. Then, for me and my students, we must have one across my lab. If I send you procedures, then you must follow OUR scheme. So ultimateLy, to avoId conflicTs, yOu muSt follOw oUr capitalizatiOn scheMe, whiCh, In thIs sentenCe, Is To capitaliZe tHe secoNd To laSt charactEr Of eveRy woRd ;-0.
Frank Heckenbach wrote:
- conventions like: globals capitalized, locals lowercase
or aggregates capitalized, scalars lowercase
IMHO these are archaic conventions, coming from languages such as C (where there's only one global and one local level, in contrast to Pascal which has arbitrarily many levels, and a local variable of level 1 can behave like a global variable, seen from a routine at level 2, etc.), and assembler (where there's a general difference between aggregates and scalars). So I don't care much for such conventions.
That misses the point. Within a chunk of code it may be appropriate to use a particular convention. The code that I was working on that started this discussion uses a series of X and Y variables to create a color density plotting program:
http://www.lecb.ncifcrf.gov/~toms/delila/denplo.html
using:
cat denplo.p |\
wordlist |\
egrep 'X|Y' |\
sort |\
uniq |\
fmt |\
cat
I got the variables:
SymbolSizeX SymbolSizeY
X XDisplayIntervals XDisplaySubIntervals XaxisLabel Xbin Xbinmax Xbinmin Xcolumn Xcorner Xhi Xin XinCount Xintervals Xlo XmaxValue XminValue Xminvalue XoutCount Xrange Xsize Xvalue
Y YDisplayIntervals YDisplaySubIntervals YaxisLabel Ybin Ybinmax Ybinmin Ycolumn Ycorner Yhi Yin YinCount Yintervals Ylo YmaxValue YminValue Yminvalue YoutCount Yrange Ysize Yvalue
graphXintervals graphYintervals jumpX jumpY keyX keyXsize keyY keyYsize nextX nextY shiftX shiftY shrinkfactorX shrinkfactorY startX startY tryX tryY
As you can see from looking over this list, the X and Y capitalization really helps to distinguish the variables.
It was very good that the compiler detected these (THANK YOU) as it allowed me to straighten out some incorrect ones. The convention helps me to tell which are the X and Y variables.
The collision was with a lower case x and y nicely scoped into local variables in a procedure. To avoid other users of the code getting the inappropriate warning (and sending me email about it!), I was forced to change them to capitals.
Frank Heckenbach wrote:
And we should divide programmers into two groups, one that does not care about capitalization and the second that cares. The first group produces a lot of meaningless variation, but they are irrelevant for the discussion since they wish no warning. The second group IMHO is much more likely to associate meaning with capitalization.
So do I, yet I prefer to avoid different capitalizations of the same identifier (see above).
That's fine, but it does not suit everyone, and you apparently haven't run into a condition where there would be a conflict.
Frank Heckenbach wrote:
So is the lack of `endif', and a few other things IMHO. We can't change the fundamentals of Pascal, I'm afraid.
Agreed! But this discussion is not about changing the fundamentals of Pascal. It's about changing how GPC reacts to case inconsistency when it is requested to check for case consistency. Programmers will expect the case usage to match the scope of the variable.
Regards,
Dr. Thomas D. Schneider
National Cancer Institute
Laboratory of Experimental and Computational Biology
Molecular Information Theory Group
Frederick, Maryland 21702-1201
toms@ncifcrf.gov
permanent email: toms@alum.mit.edu (use only if first address fails)
http://www.lecb.ncifcrf.gov/~toms/
ps: Sorry if I messed up some of the attributions!
Dear all
Having skimmed through the growing capitalisation debate, may I just suggest that a switch be added to turn off all warnings relating to capitalisation (if there isn't one already). Unless I am misunderstanding something, this should resolve the issues raised in this debate. Better still, the warnings should be off by default (although I have never seen one before, and my capitalisation is not consistent - so perhaps they are already off by default).
And, if Frank is not willing to implement what some people seem to want him to implement, then, as he says, the sources are freely available, and others can see to it. I think Frank will be the first to say that all assistance with developing the compiler would be gratefully received.
Best regards, The Chief
--------
Prof. Abimbola A. Olowofoyeku (The African Chief)
web: http://www.greatchief.plus.com/
Prof A Olowofoyeku (The African Chief) wrote:
Having skimmed through the growing capitalisation debate, may I just suggest that a switch be added to turn off all warnings relating to capitalisation (if there isn't one already).
It's there (`-Wno-identifier-case').
Unless I am misunderstanding something, this should resolve the issues raised in this debate. Better still, the warnings should be off by default (although I have never seen one before, and my capitalisation is not consistent - so perhaps they are already off by default).
Yes, it's off by default, but turned on by `-Wall'. Perhaps I should change this indeed.
Then until someone (perhaps) implements their kind of warning, nobody will be bothered by the warning, unless they turn it on explicitly.
I'd have to change my dozens of Makefiles etc., but I might rather do that than continue a pointless debate ...
Frank
Tom Schneider wrote:
Frank Heckenbach wrote:
BTW, calling it a bug is a bit strange, anyway. In Pascal, even occurrences of an identifier referring to the same declaration can have different capitalizations; the same applies to different declarations (e.g., local vs. global) as well. Therefore, all this is an optional warning, not an error.
It is not a Pascal bug in the sense that the program will still compile correctly since Pascal accepts any capitalization. However, it is a bug in GPC in the sense that the compiler treats the variables inconsistently.
It's one thing to disagree in a discussion, but if you keep offending me (by calling an intentional decision which is not contrary to any standard a "bug" or not "legitimate"), I'll quit this discussion and might killfile you. Last warning (definitely no pun intended)!
(Again, it is a warning about *identifiers*, not about *declarations* (variables etc.).)
Especially in German it means that the word is either the beginning of a sentence or a noun, right? Are there cases in German - I bet you could find or make some! - where the meaning of the sentence changes dramatically if you were to change the case of a noun?
Yes, and there are cases (in German and probably in other languages) where even the same sentence with the same capitalization can have different meanings. Natural languages are (slightly) ambiguous. Formal languages are meant to be an improvement (in this aspect), so such comparisons are void.
FOR EXAMPLE. I am a molecular biologist, so the letters A, C, G and T have great significance. In some programs I might want to capitalize these to fit the convention in molecular biology. But in another piece of the code I may decide that 'c' is to be the name of a character read in.
In your situation, I *really* wouldn't *ever* use `c' for such a variable, in order to avoid confusion, cf. the `GotHere' example below.
Irrelevant warnings that I can't get rid of make it MUCH more difficult to locate the important warnings!
In general I agree with this statement. But it doesn't apply here.
Worse, if one of my students decides to write code, I have to worry about what they have done in modules that I'm NOT working on. Suppose I ask them to write a procedure for me and they pick a "non-standard" capitalization that doesn't match something I'm doing?
Use `-Widentifier-case-local'.
Saying that I should use external files is NOT A SOLUTION because there is no standard that all compilers use for reading such files and if I used that approach,
If you use "external files" to mean modules, and "modules" (above) to mean include files or similar, it may help to use more standard terminology.
This is why the GPC convention of capitalizing the first letter of an identifier when giving errors or warnings is painful.
Where does GPC still do so?
Frank Heckenbach wrote:
that the GPC code is not efficient.
Not really. This way, the capitalization can be checked right after a word (identifier or keyword) is recognized. When doing it after distinguishing between keywords and various kinds of declarations, there would be more places to add the checks.
You of course know the compiler guts and I don't. But what you are saying implies that the mechanisms for capitalization checking and scope checking are separated. In that sense the compiler appears not to be efficient.
Lexing and parsing are separated, as in every other compiler and interpreter in the world. Or what do you mean???
Frank Heckenbach wrote:
Because I don't see a "case sensitive interpretation" of Pascal code.
There may be other Pascal compilers which are case-sensitive, which is obviously non-standard. Our general attitude to non-standard compilers is to be at best backwards-compatible, which this warning surely makes it.
If there were (theoretically) a compiler that understood case with scope correctly, then this current design would not be compatible.
There are such compilers, e.g., every C compiler in the world. I don't recall that we plan to be compatible to them. Such a compiler obviously wouldn't be a Pascal compiler.
Frank Heckenbach wrote:
I recently recompiled a project containing some seven thousand lines of Pascal code distributed over some twenty units. A couple of years ago when it was last compiled, gpc didn't have the identifier-case checking feature. During the recent recompile, I was overwhelmed with casing complaints. Some of them were legitimate: same variable referred to by different casings.
Again, Pascal isn't concerned with the capitalization of identifiers referring to the same declaration any more than it is with that of identifiers referring to different declarations. If you consider one thing more "legitimate" than the other, that's your opinion, but please don't state it as a fact.
No, he means that some of the warnings were appropriate and some were not, according to the scope of the variable. The inappropriate ones get in the way of finding the legitimate ones.
So you substitute "legitimate" for "appropriate". Playing with words, nothing more. Next time, you might want to add "IMHO" or something, otherwise I see no reason to reply to unfounded claims.
Frank Heckenbach wrote:
Well, I was about to write "these comparisons go too far" before I read this paragraph, but now it's really too much! Emotions in a Pascal program, eh? ;-)
EMOTIONS ARE USED IN TEXT! (SorryForYelling.) Haven't you ever embedded something like that into your code?
In comments perhaps. If you do this in your program code regularly, you might want to get professional help, sorry.
Frank Heckenbach wrote:
Let's stick to Pascal based arguments, perhaps?
This is NOT a "Pascal" issue. It is a GPC issue.
Incidentally the P in GPC stands for Pascal, in case you didn't know.
I think that we all agree that in the language Pascal itself, case is not relevant. The issue here is about programmers and what happens when they attempt to use capitalization to help them understand AND DOCUMENT the code better. If the capitalization does not follow the scope rules of the identifiers, then the programmer will OFTEN have trouble.
If the assumptions of the programmer do not match the behaviour of the compiler, then the programmer will also have trouble (perhaps not as often, but sometimes in more tricky ways, see the example below).
My preferred solution is to avoid both, by using capitalization (for declarations if you will) consistently *and* avoiding different declarations having the same identifier with different capitalization. IMHO that's the only way to pretend to be case-sensitive without getting bitten. It's a little restriction, compared to a real case-sensitive language, but one I'd rather accept than getting surprising bugs.
Frank Heckenbach wrote:
You forgot:
- NamesConsistingOfMultipleWords (most frequent in my experience)
True. But using different concatenations resulting in the same identifier (such as `FooBar' and `FoobAr' -- I know there are real examples of this, I just don't have them in mind right now) is
There's the case I mentioned above:
http://www.gnu-pascal.de/crystal/gpc/en/mail7664.html
[...]
Got Here is quite different from Go There!
Yes, thanks for digging it out.
Frank Heckenbach wrote:
asking for trouble in a case-insensitive language. In C, you can do this and get away with it. In Pascal, sooner or later, you might use those names in the same scope and get a conflict.
No. If they are within the same scope, then there SHOULD be a conflict.
So to repeat my argument with this concrete example, here's a (made up) code fragment which shows the problems I'm referring to. Adding the `with' statement (because it makes it more comfortable to access the `Target' field twice) shadows, in the statement between, `GoThere' by `Field[Position].GotHere'. This is a tricky bug, hard to find if you don't know it.
Of course, such unintended shadowing can happen with the same capitalization as well. But in this case, the programmer can notice it, whereas a programmer who thinks case-sensitively, as you suggest, will not even understand the problem.
You might argue that a declaration-capitalization-warning will catch this case. But only if compiled on such a compiler (you mentioned using other compilers) and if enabled (it's optional as you know). Whereas the identifier-capitalization-warning can catch it even before the actual conflict arises, as soon as the two identical identifiers with different capitalization are introduced, if it's compiled just *once* with GPC and the warning enabled. So you can notice it earlier and prevent the actual conflict by renaming things (there are always choices to rename things).
program Foo;

const
  Max = 20;

type
  t = record
    { Target of link from this position }
    Target: Integer;
    { Position was already visited }
    GotHere: Boolean
  end;

var
  Position: Integer;
  Field: array [1 .. Max] of t;
  PositionsToTry: set of 1 .. Max;

procedure p;
var
  NewPosition: Integer;

  { Set the current position to NewPosition.
    Return True if successful. }
  function GoThere: Boolean;
  begin
    if NewPosition in [1 .. Max] then
      begin
        Position := NewPosition;
        GoThere := True
      end
    else
      GoThere := False
  end;

begin
  { Try next position }
  NewPosition := Position + 1;
  if GoThere then Exit; { OK }

  { Try other positions on the list }
  for NewPosition in PositionsToTry do
    if GoThere then Exit; { OK }

  { Try target of link from current position }
  with Field[Position] do
    begin
      NewPosition := Target;
      if GoThere then Exit; { Ouch! }
      Target := 0 { invalidate link }
    end;
end;

begin
end.
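A cut-down sketch of the renaming remedy mentioned above (not part of the original post). It assumes a hypothetical field name `Visited` in place of `GotHere`, and a program name `RenameSketch` chosen only for this illustration. With the field renamed, no field of the record matches `GoThere`, so inside the `with` statement the name can only refer to the function.

```pascal
program RenameSketch;

const
  Max = 20;

type
  t = record
    { Target of link from this position }
    Target: Integer;
    { Position was already visited (hypothetical new name for GotHere) }
    Visited: Boolean
  end;

var
  Position, NewPosition: Integer;
  Field: array [1 .. Max] of t;

{ Set the current position to NewPosition. Return True if successful. }
function GoThere: Boolean;
begin
  if (NewPosition >= 1) and (NewPosition <= Max) then
    begin
      Position := NewPosition;
      GoThere := True
    end
  else
    GoThere := False
end;

begin
  Position := 1;
  Field[1].Target := 5;
  Field[1].Visited := False;
  with Field[Position] do
    begin
      NewPosition := Target;
      { No field is called GoThere any more, so this calls the function }
      if GoThere then
        WriteLn('moved to position ', Position:1)
    end
end.
```

This only shows that a rename removes the ambiguity; which of the two names to change is a matter of taste.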
Frank Heckenbach wrote:
And we should divide programmers into two groups, one that does not care about capitalization and the second that cares. The first group produces a lot of meaningless variation, but they are irrelevant for the discussion since they wish no warning. The second group IMHO is much more likely to associate meaning with capitalization.
So do I, yet I prefer to avoid different capitalizations of the same identifier (see above).
That's fine, but it does not suit everyone, and you apparently haven't run into a condition where there would be a conflict.
I have, and I liked to get the warning. See the example above. Even if there was no direct conflict such as the one shown above yet, you are inviting one if you use the same identifier with different capitalization for different purposes. Pretending to be case-sensitive is nice, but you have to respect the limits.
Frank Heckenbach wrote:
So is the lack of `endif', and a few other things IMHO. We can't change the fundamentals of Pascal, I'm afraid.
Agreed! But this discussion is not about changing the fundamentals of Pascal. It's about changing how GPC reacts to case inconsistency when it is requested to check for case consistency. Programmers will expect the case usage to match the scope of the variable.
Since I'm no programmer (according to your definition), I can't help you here, sorry.
Frank