CBFalconer wrote:
Why not start the thinking from what is as compatible as possible with standard Pascal techniques?
Since you bring up the issue of ISO 7185 ("classic" Pascal) vs. 10206 (Extended Pascal) again, would you like to address the following limitations of classic Pascal? (Excuse me if I sound a little like Brian Kernighan, but I think these points are valid criticisms of classic Pascal. Most of them have been addressed in Extended Pascal -- and, I have to admit, most of them also in BP, though often differently.)
- No variables of dynamic size. You can't, e.g., get a number of elements at runtime and do something with them. (*) Without using standard compliance level 1 (conformant array parameters), you can't even have routines that work on arrays of different sizes, so common routines have to be copied for each possible size.
(*) The usually suggested work-arounds include:
- Set a maximum size at compile time. This is suitable for demo and learning programs, not for the real world (unless perhaps the size is made unreasonably large, but this will waste too much memory).
- Use linked lists. This works in some cases, but often it adds an O(n) factor to the complexity and is therefore unacceptable.
- Use more complicated structures such as balanced trees. This will add a lot of unnecessary programming overhead.
- Strings padded with spaces. (EP shares this defect in some respects, e.g. string comparisons.) Treating `foo' and `foo ' as equivalent might work sometimes, but will already fail when writing them, followed by other stuff in the same line.
- No modularization. Frequently used routines have to be copied (since there's also no include) into each program that uses them, which becomes a maintenance nightmare when modifying them.
- No file names. Classic Pascal programs can only access Input, Output and files declared in the program header. It's not possible to choose a file name during runtime.
- No way to access routines written in other languages or system routines, except through the few predefined routines.
- No `otherwise' in `case'.
- No defined order of evaluation, in particular for Boolean operators. This makes loops in particular quite a bit more complicated to write (examples can be found in Kernighan's paper).
- No way to escape type-checking, and no "untyped pointers". Something like your nmalloc cannot be written in classic Pascal.
- No bitwise operators (and, or, xor). It's possible, but very cumbersome and inefficient, to implement them using arithmetic.
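A minimal sketch of the evaluation-order point (not taken from Kernighan's paper; MaxItems, Find and Items are names invented for illustration). Since classic Pascal does not promise that the right operand of `and' is skipped, a linear search usually ends up with an extra Boolean flag:

```pascal
program SearchDemo(output);
{ Sketch only: all identifiers are made up for this example. }
const
  MaxItems = 5;
type
  ItemIndex = 1..MaxItems;
  ItemArray = array[ItemIndex] of integer;
var
  Items: ItemArray;
  k: integer;

{ Returns the index of Wanted in A, or 0 if it is absent. }
function Find(var A: ItemArray; Wanted: integer): integer;
var
  i: integer;
  Found: Boolean;
begin
  { The tempting form
      while (i <= MaxItems) and (A[i] <> Wanted) do i := i + 1
    may evaluate A[MaxItems + 1], because both operands of "and"
    may be evaluated. Hence the extra flag below. }
  i := 1;
  Found := false;
  while (i <= MaxItems) and not Found do
    if A[i] = Wanted then
      Found := true
    else
      i := i + 1;
  if Found then
    Find := i
  else
    Find := 0
end;

begin
  for k := 1 to MaxItems do
    Items[k] := 10 * k;
  writeln(Find(Items, 30));   { value 30 is stored at index 3 }
  writeln(Find(Items, 99))    { 99 is absent, so Find yields 0 }
end.
```

With short-circuit operators (such as `and_then' in Extended Pascal) the flag and the inner `if' would not be needed.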
Unless you can explain how to overcome those restrictions or how they are irrelevant, I'm afraid I can't take your cause for classic Pascal for real-world programs too seriously.
Do you actually stick to these limitations of classic Pascal, or do you use some extensions after all? (I'm reminded of a Usenet discussion I had with someone about type escapes. First, he claimed he can do anything by purely standard means, and when we finally got to the critical points, he admitted to just abusing variant records (which is in no way guaranteed to work by the standard) and using assembler code to do the type conversions. This is not very convincing to me.)
Actually I'm wondering, what's your point in bringing this up again?
Is it compiler-portability? That's a valid point(*) -- but then, of course, you must not use any extensions or rely on any implementation-dependent or implementation-defined features.
(*) For classic Pascal; for EP there are very few compilers at all, and BP is too imprecisely defined, so the several existing "BP compatible" compilers, including GPC, differ on details which are undocumented in BP, so much that you don't usually get portable programs without using many compiler conditionals.
Or do you want to prove that any program can be written in classic Pascal? Sure, I'll give you that. Then again, any program can be written on a Turing machine.
The difference is that one is a little more comfortable than the other. And that's my reason for extensions such as this one. With existing means, you have to modify two places to add one entry. It's more comfortable to have to change only one place.
There one would create an array of messages with a variation on:
CONST
  maxmsglgh = 30;   (* kept short to minimize my typing *)
  maxmsgcnt = 3;    (* and here also *)

TYPE
  msgid = 1..maxmsgcnt;   (* or an enumeration *)
  amsg  = PACKED ARRAY[1..maxmsglgh] OF char;
  msgs  = ARRAY[1..maxmsgcnt] OF amsg;

VAR
  resultmsgs : msgs;

(* 1--------------1 *)

PROCEDURE initresultmsgs(VAR msggrp : msgs);

  (* 2--------------2 *)

  PROCEDURE initonemsg(ix : msgid; themsg : amsg);

  BEGIN (* initonemsg *)
    msggrp[ix] := themsg;
  END; (* initonemsg *)

  (* 2--------------2 *)

BEGIN (* initresultmsgs *)
               (* 123456789-123456789-123456789- *)
  initonemsg( 1, 'did not terminate with status ');
  initonemsg( 2, 'terminated with status        ');
  initonemsg( 3, 'was terminated by signal      ');
END; (* initresultmsgs *)

(* 1--------------1 *)

.....

initresultmsgs(resultmsgs);
Which, once set up, is easily modified, allows the bulky initialization code to be segregated and gotten out of the way on suitable platforms after use, etc. The nuisance is that each message has a fixed length. We can fix this by modifying the type of amsg to include a length field (say lgh), and doing eventual writes with this as a parameter, such as:
  write(fp, resultmsgs[ix].body : resultmsgs[ix].lgh);
which can in turn be encapsulated in a convenient:
PROCEDURE writeresult(VAR fp : text; ix : msgid; VAR msgset : msgs);
At least that is the way I would attack the job :-)
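A minimal sketch of the length-field variant described above. Only the field name lgh and the heading of writeresult come from the text; the record layout, the helper type msgtext and the procedure setmsg are assumptions made for illustration. Here lgh is filled in by scanning back over trailing blanks, and the body is written character by character, standing in for the `:' width form so that a zero length needs no special case:

```pascal
program MsgDemo(output);
{ Sketch only: msgtext, setmsg and the record layout are assumptions. }
const
  maxmsglgh = 30;
  maxmsgcnt = 3;
type
  msgid = 1..maxmsgcnt;
  msgtext = packed array[1..maxmsglgh] of char;
  amsg = record
    lgh: 0..maxmsglgh;   { number of significant characters }
    body: msgtext
  end;
  msgs = array[msgid] of amsg;
var
  resultmsgs: msgs;

{ Stores themsg and sets lgh to its length without trailing blanks. }
procedure setmsg(var m: amsg; themsg: msgtext);
var
  n: 0..maxmsglgh;
  scanning: Boolean;
begin
  m.body := themsg;
  n := maxmsglgh;
  scanning := true;
  while scanning do
    if n = 0 then
      scanning := false
    else if themsg[n] <> ' ' then
      scanning := false
    else
      n := n - 1;
  m.lgh := n
end;

{ Writes the first lgh characters of message ix to fp. }
procedure writeresult(var fp: text; ix: msgid; var msgset: msgs);
var
  i: integer;
begin
  for i := 1 to msgset[ix].lgh do
    write(fp, msgset[ix].body[i])
end;

begin
  setmsg(resultmsgs[1], 'did not terminate with status ');
  setmsg(resultmsgs[2], 'terminated with status        ');
  setmsg(resultmsgs[3], 'was terminated by signal      ');
  write('The process ');
  writeresult(output, 2, resultmsgs);
  writeln(' 0.')
end.
```

Every literal still has to be typed with exactly maxmsglgh characters, and a message that is meant to end in a blank loses it when lgh is computed this way.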
It won't surprise you that I have a lot of objections to your method:
- You add two procedures to initialize one variable. In a program with many initialized variables, this becomes quite a lot of procedures. (Besides, in classic Pascal they have to be separated in location from the respective variables due to the fixed order of declarations.)
I prefer to write compact code (which is IMHO much more readable). This may be influenced by the fact that I'm not paid by LOC. ;-)
- To add one entry, you still have to change 2 places. If it's not the last one, you have to change N indices additionally.
You wrote: "Notice that all the above is easily converted to use an enumerated set of msgid to avoid the use of magic constants." That's true, but then you might need some comments to easily match the enum IDs to the initialization lines (which is again extra effort). Also, it adds another set of (global!) identifiers, just for one variable -- not very convenient (unless you need the identifiers, anyway).
There are more identifiers added (types, constants, etc.). Your use of urdabrvids (unreadable, abbreviated identifiers, IMHO) suggests to me that this is not a good idea. Using fewer identifiers allows you to use more expressive names that don't get ridiculously long ...
- You have to count the characters (as indicated by your `123...' ruler). Sure, also with an initialized array of EP strings, there's a limit, but the compiler will complain when one value is too large, and when it's too small it will only waste a little memory, so it's no real problem to declare a reasonable size and be told by the compiler if it gets too small.
- The handling of the length with `:' is specific to `Write'. Other usages have to be specially coded.
Also for `Write', the suggested encapsulation doesn't help all that much. Instead of
WriteLn ('The process ', ResultMessage[WaitPIDResult], Status, '.');
you'd have to write:
Write ('The process '); WriteResult (Output, WaitPIDResult, ResultMessage); WriteLn (Status, '.');
Quite clumsy, isn't it? (Apart from the fact that having to add `Output' in `WriteResult' is less than intuitive ...)
(And, of course, WriteResult is bound to an array of fixed size, so you can't use it for different lists of messages, unless they happen to have the same number of messages, or you're willing to fill them up with dummy entries (strings, not chars this time). This is the same problem on the outer dimension.)
- You wrote: "Even the generation of the suggested lgh field can be automated in the initialization procedure, by having it measure the number of trailing blanks present. That leaves the only real 7185 nuisance the necessity of typing in those blanks in the first place, and possibly the added storage space used." -- And added run time, and most of all, the impossibility to define strings with trailing blanks.
There might be more problems. My experience with such code is that it draws so much attention to the formalities (both when writing and later when reading/modifying the code) that it makes it harder to focus on the "real" code.
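On the fixed outer dimension mentioned above: with a compiler that complies at level 1 of ISO 7185, a conformant array parameter lets one routine serve arrays of different lengths. A minimal sketch, in which every identifier (shortlist, longlist, total) is invented for illustration; it will not compile on a level 0 processor:

```pascal
program ConfDemo(output);
{ Sketch only; needs a processor complying at level 1 of ISO 7185. }
type
  shortlist = array[1..2] of integer;
  longlist = array[1..4] of integer;
var
  a: shortlist;
  b: longlist;
  k: integer;

{ Sums all components; lo and hi are bound to the actual index range. }
function total(var v: array[lo..hi: integer] of integer): integer;
var
  i, sum: integer;
begin
  sum := 0;
  for i := lo to hi do
    sum := sum + v[i];
  total := sum
end;

begin
  for k := 1 to 2 do
    a[k] := k;
  for k := 1 to 4 do
    b[k] := k;
  writeln(total(a));   { 1 + 2 }
  writeln(total(b))    { 1 + 2 + 3 + 4 }
end.
```

Without level 1, total would have to be duplicated for each array type, which is the copying problem described earlier.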
The point of any extended syntax is to ease that work, but NOT to change the eventual calls that use it. The tools you have in 10206 include strings with capacities and actual lengths. There is no need to make the end code look anything like C, but there is a need to make it clear.
I don't quite understand these remarks. The discussed extensions would not change any calls (or make it more like C in any way -- what gave you this idea???). It's just about simplifying the declaration by allowing the programmer to omit a value which the compiler can easily determine itself.
Extensions are like optimization. The first rule is don't do it. The second rule is don't do it yet. The third rule is think it over again.
I agree for pointless extensions. But extensions which make programming more comfortable and don't entail serious problems are not pointless IMHO.
(Likewise for optimizations, BTW. Much as I generally dislike low-level tricks, I'm all for algorithmic optimizations when reasonable. In a project I worked on recently, I easily achieved a 100-fold speedup, just by storing things in suitably linked data structures, rather than looking them up again and again. Without any dirty tricks.)
Adriaan van Os wrote:
What I don't like is that another identifier is introduced within some other declarations. Enum types do the same, and this has caused some extra work in the compiler. Since it's possible to get the upper bound as `High (s)' or `High (a)' (BP compatible feature), I think it's not necessary to introduce a new identifier in the declaration.
The constant identifier could be optional:
var
  s: String ( const ) = 'Hi';
  a: array [1 .. const ] of Integer = (42, 17, 99);
Sorry, but this is exactly the opposite. This way, we'd not only have the complexities of handling the identifier (in case it's there), but also two syntactic variations. So this is an example of an unwarranted extension IMHO. One way seems useful, two ways are redundant. And the extension should be kept as simple as possible (i.e., *no* additional identifier, since it's already possible to declare it with existing features).
Besides, using `const' as a placeholder looks quite strange. To me, it would suggest something like the variable is constant, not the upper bound is omitted.
Frank