Scott Moore wrote:
Some brief notes on your presentation "Quo vadis, GPC" (Where is GPC going).
I'd have sent it directly, but I note your email is not working.
I'd like to keep this discussion on the list, since it affects all GPC users. (Same for technical discussions in the past, according to the old Usenet principle, ask in public, answer in public; that's why I generally didn't like personal replies to mailing list posts. Though actually I disabled my autoresponder recently, so my mail should be "working" currently.)
First, I also considered, and was strongly advised, to use GCC as a backend for my compiler. I rejected this idea primarily because of the very tight coupling between the front and back ends of GCC. This is in contrast to the well developed intermediate code coupled systems outside of GCC.
The other issue is the, frankly, hostile attitude of the GCC folks towards Pascal. Fortran and other compilers are distributed with GCC and get better support from the GCC group. I can't help but think that this comes from the C group perception of Pascal as a rival, which it used to be but is no longer.
I can't disagree here. Maybe the hostility has faded somewhat recently, but the mere fact that GPC is several backend versions behind makes real cooperation and integration basically impossible.
As for making GPC a front end for another compiler, I can't think of a better way to ensure that GPC dies quietly. Virtually all of the "cascade" compilers, including Cfront, had severe difficulties, including becoming unnaturally coupled to one particular backend, and angering users by giving errors from the underlying compiler that the front end didn't catch. C++ became a success because Stroustrup moved quickly away from Cfront and onto a true compiler for C++.
Well, this idea was basically my last resort. I was already skeptical about it and your comment further discourages it.
GPC started out with a big advantage due to having a high quality back end with wide implementation. However, you detail well the unseen costs of that, including a moving target back end specification, inability to write the compiler in its own language, etc.
I think in particular the former (moving target) was much underestimated (though Jukka can hardly be blamed for it when he started GPC in 1988; without clairvoyance, he couldn't have foreseen how GCC would develop). The latter is even partially dependent on the former -- if the backend interface were stable, it would have been worthwhile to write a Pascal interface to it and write the frontend in Pascal. But moving as it has been, this would probably have added even more complications and headaches.
Pascal scores low as a common implementation language, but as you detail, most of this is because of a movement to interpreters and away from true compilers. Although this is stated to be "because of ease of development", I believe in large part this is a reaction to the difficulties of C/C++ development, which is directly traceable to the complexity of the language and the difficulties with debugging a language totally lacking in basic type security. I think nothing underlines this more than Anders' move to Microsoft and C#: Anders went from taking a language designed around type security (Pascal) and tearing away its type security, to taking a language designed around a lack of type security (and proud of it), C, and adding type security to that.
I'd rather leave C out of the discussion, as in my article, but C++ is actually rather type-safe if used "correctly" (e.g., STL strings, lists and other containers are type-safe (and memory-safe), "new" returns a typed pointer unlike "malloc" etc.). In my experience with C++, I didn't need more type-escapes than I did with (GNU | Borland) Pascal. (The main disadvantage is probably the confusion between "char" and an integer type -- quite annoying to me often, but not as dangerous as untyped pointers etc.)
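As an illustration of these points, here is a minimal C++ sketch. It is not taken from any real code base; `Point`, `make_names`, `typed_allocation_demo` and `char_as_integer` are made-up names. It shows a typed container rejecting a wrong element type at compile time, `new` returning a typed pointer where `malloc` needs an explicit cast, and `char` silently taking part in integer arithmetic.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct Point {
    int x;
    int y;
};

// An STL container carries its element type, so mixing types is a compile-time error.
std::vector<std::string> make_names() {
    std::vector<std::string> names;
    names.push_back("GPC");
    names.push_back(std::string("GCC"));
    // names.push_back(42);  // does not compile: no conversion from int to std::string
    return names;
}

// "new" yields a typed pointer; malloc yields void* and needs an explicit cast in C++.
int typed_allocation_demo() {
    Point* p = new Point{};                   // typed pointer, fields zero-initialized
    p->x = 3;

    void* raw = std::malloc(sizeof(Point));   // untyped memory
    assert(raw != nullptr);
    Point* q = static_cast<Point*>(raw);      // the type escape has to be spelled out
    q->x = 4;
    q->y = 0;

    int sum = p->x + q->x;
    std::free(raw);
    delete p;
    return sum;
}

// The char/integer confusion: char is promoted to int without any cast.
int char_as_integer() {
    char c = 'A';
    return c + 1;
}
```

The only place where a cast appears is the `malloc` path, which is the part one can avoid entirely by sticking to `new` and the containers.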
As a side note, many of today's popular languages aren't really type-safe. They're not strongly typed, and though type errors don't usually cause program crashes, they result in runtime errors (when they trap runtime type checks) or wrong runtime behaviour (when types are silently mixed up, e.g. numbers automatically converted to strings). Though this may not look as ugly as a segfault, in effect it's basically the same -- the program doesn't do what's intended. I'm not convinced of weak typing in most situations.
Likewise, "ease of development" sometimes just means that it's easier to get a program running at all (though wrongly) because there are not as many checks done before execution starts (i.e., the whole compile-time errors are reduced to a few parsing and other pre-interpretation checks, sometimes none at all). So one gets a sense of success more quickly, but the path to a correct program is not easier at all (in fact, IMHO, more difficult, since runtime errors have to be searched and debugged, while compile-time errors are found automatically). But it's no news that an initially flat learning curve is often confused with ease of use or even power (a certain big software company's business model depends on this fallacy).
I completely agree with your diatribe on automatic destructors. I chose automatic (and anonymous) constructors and destructors for the language Pascaline (a highly extended version of Pascal) because I believed it was necessary for the compiler to control both when, and in what order, the constructors and destructors are called.
And for the class designer to control what must be done during con-/destruction (which, the more I think of it, I consider the worst problem of the BP object model, where con-/destructors are basically just a hint to users of the class to call them, without any enforcement).
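For comparison, this is roughly what enforced construction and destruction looks like in C++. It is a minimal sketch with made-up names (`Resource`, `event_log`, `use_resources`), and it shows only the C++ behaviour, not how Pascaline or any Pascal dialect does it. The compiler calls the constructor at the declaration and the destructor at scope exit, in reverse order of construction, and the user of the class cannot forget either call.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log, used only to make the call order visible.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

class Resource {
public:
    // The class designer decides what construction must do...
    explicit Resource(std::string name) : name_(std::move(name)) {
        assert(!name_.empty());
        event_log().push_back("construct " + name_);
    }
    // ...and what destruction must do; the compiler guarantees both calls.
    ~Resource() { event_log().push_back("destroy " + name_); }

    // Copying is disabled so every object is constructed and destroyed exactly once.
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

private:
    std::string name_;
};

void use_resources() {
    Resource a("a");
    Resource b("b");
    event_log().push_back("work");
}   // b is destroyed first, then a (reverse order of construction)
```

After `use_resources()` runs on an empty log, the log holds "construct a", "construct b", "work", "destroy b", "destroy a", in that order.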
I also believe (as shown in Java and C#) that classes should be treated as code structuring constructs and NOT as "extended records", and that both static and dynamic objects have their uses.
Here I disagree with your first point. In C++ I do in fact use some classes as "extended records". Sometimes they're basically data records with a constructor (e.g. because initialization would usually specify just one or two of their fields, while all the other fields are always initialized to the same value, so a constructor can do it in one place, instead of having to do it in each initialization). Other classes are full-blown "active" objects in class hierarchies; and then I have almost every shade in between. (FWIW, I also don't believe in bureaucratic "encapsulation" for the sake of it, i.e. if my class has a field that's validly accessed from outside, I make it public, and not private with public accessor methods.) It's said that C++ is a multi-paradigm language, and though I usually don't care much about such phrases, in this regard I would agree.
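A minimal sketch of such an "extended record", with a made-up type `Job` and made-up default values: the fields are public and accessed directly, and the constructor exists only so that the common defaults are written in one place.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical "extended record": plain public data plus one constructor.
struct Job {
    std::string name;
    int priority;
    int retries;
    bool enabled;

    // Callers usually specify one or two fields; the rest get fixed defaults here.
    explicit Job(std::string n, int prio = 5)
        : name(std::move(n)), priority(prio), retries(3), enabled(true) {
        assert(priority >= 0);
    }
};

// Fields are read directly; there are no accessor methods.
bool should_run(const Job& job) {
    return job.enabled && job.retries > 0;
}
```

With this, `Job("backup")` and `Job("build", 9)` both end up with `retries == 3` and `enabled == true` without repeating those values at each initialization.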
In Pascaline, classes are expressed as modules which can be instantiated, and fit within modules in program structure (programs are a series of modules which may contain classes).
If a module roughly equals a source file here, I'd also disagree. Especially (but not only) because of my "extended record" classes, this would mean very many source files, which tend to decrease readability for me, so I prefer to be able to put several classes, along with other global declarations, in a single "module" as C++ as well as BP and GPC allow. (C++ also allows splitting a class implementation between "modules" (compilation units, i.e., basically source files), which I don't like and never do, so I agree with the Pascal models here.)
For more on that, I invite you to look over the Pascaline specification:
I did look over it, of course, given its length, not in every detail. Some of the substance looks interesting, if you excuse the pun, though I also noted some ambiguities and other possible problems -- but that's off-topic in this thread, we might discuss this privately.
I looked at templates for Pascaline, and I used your exact example to evaluate it, that is, general handlers for list structures. However, I came to the conclusion that this was also the paramount example for classes, and in fact classes are an example of a highly structured system of typing that reduces the need for the highly unstructured method offered by templates
I'm not sure how classes replace templates. How do you define a class that is a list of foo where foo is any given type? Even if we leave out basic types such as Integer (which I wouldn't leave out in practice, but just for the sake of argument), so that foo is an object type, and I want only objects of type foo (and possibly its descendants) in the list, not any other object type, and of course with compile-time checking, AFAIK there is no way to do this in any object model I know (Pascal or C++).
To be clear, of course it's possible to write such a class for a single type T, and it's also possible to write an untyped list (using run-time type-checks and type-casts) or a totally polymorphic list (i.e., enforcing a single parent of all object types and making a list of those, but again, it would require run-time checks if you want to make sure the actual elements are of type foo or descendants).
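The "totally polymorphic list" variant can be sketched as follows. All names (`Object`, `Foo`, `Bar`, `ObjectList`, `sum_foos`) are made up, and `std::vector` merely stands in for the list storage; the point is that the element type is the common parent, so a wrong element is only detected at run time.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical single parent of all object types.
struct Object {
    virtual ~Object() = default;
};

struct Foo : Object {
    int value;
    explicit Foo(int v) : value(v) {}
};

struct Bar : Object {};

// A list "of anything": the compiler cannot know it is meant to hold only Foo.
using ObjectList = std::vector<std::unique_ptr<Object>>;

ObjectList make_mixed_list() {
    ObjectList list;
    list.push_back(std::make_unique<Foo>(1));
    list.push_back(std::make_unique<Bar>());   // compiles, although the list is meant for Foo
    list.push_back(std::make_unique<Foo>(2));
    return list;
}

// Sums the Foo elements; every element needs a run-time type check.
int sum_foos(const ObjectList& list, int& rejected) {
    int sum = 0;
    rejected = 0;
    for (const auto& item : list) {
        assert(item != nullptr);
        if (const Foo* foo = dynamic_cast<const Foo*>(item.get())) {
            sum += foo->value;
        } else {
            ++rejected;                        // wrong type, found only while running
        }
    }
    return sum;
}
```

Inserting the `Bar` is accepted by the compiler; the mistake only shows up as a rejected element when `sum_foos` walks the list.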
What templates allow is to write a list implementation once and use it for any type T, with strong, compile-time type-checking. If you can do this with classes, let me know, because as I wrote, templates are one of the most important missing features in Pascal for me currently.
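A minimal sketch of the template approach in C++, with a made-up class `List` and made-up helper functions (in practice one would use `std::list` or `std::vector`): the list is implemented once and instantiated for any element type, and inserting a wrong type is a compile-time error.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical singly linked list, written once for any element type T.
template <typename T>
class List {
public:
    void push_front(T value) {
        head_ = std::make_unique<Node>(std::move(value), std::move(head_));
        ++size_;
    }

    std::size_t size() const { return size_; }

    const T& front() const {
        assert(head_ != nullptr);   // precondition: list is not empty
        return head_->value;
    }

    // Calls f on each element, front to back.
    template <typename F>
    void for_each(F f) const {
        for (const Node* n = head_.get(); n != nullptr; n = n->next.get()) {
            f(n->value);
        }
    }

private:
    struct Node {
        T value;
        std::unique_ptr<Node> next;
        Node(T v, std::unique_ptr<Node> n) : value(std::move(v)), next(std::move(n)) {}
    };

    std::unique_ptr<Node> head_;
    std::size_t size_ = 0;
};

// The same implementation used for two unrelated element types.
int sum_ints() {
    List<int> xs;
    xs.push_front(1);
    xs.push_front(2);
    xs.push_front(3);
    int sum = 0;
    xs.for_each([&sum](int x) { sum += x; });
    return sum;
}

std::string first_name() {
    List<std::string> names;
    names.push_front("Pascal");
    // names.push_front(42);  // rejected at compile time: int is not a std::string
    return names.front();
}
```

A list restricted to foo and its descendants would be an instantiation with a pointer to foo as the element type; a pointer to an unrelated object type is then rejected by the compiler rather than by a run-time check.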
BTW, I also wonder what you mean by "highly unstructured". Apart from the new syntax one has to get used to like any new syntax, I see templates as very structured and type-safe.
(which by the way, Wirth is on record as saying was not such a hot idea).
If I was mean, I'd counter that Wirth apparently didn't think variable-sized arrays were a hot idea either, in the design of the original language. And this omission, IMHO, more than anything else, has contributed to the notion that Pascal was suitable only for teaching and not for real-world programs (and IMHO quite rightfully so, when talking strictly about the original language without even conformant arrays, even though they're just a partial solution). WRT "generic types" (in whatever form, templates or other), the limitation is less immediate, but honestly, having to "reimplement the list" is IMHO just not up-to-date for a modern language. Sure, in academic programs that need a list type, you'd just insert an ad-hoc implementation, but in real-world programs that need perhaps lists of 10 different types (not uncommon in my programs, since I'm an adherent of strong typing, see above, and I won't put everything in a single "list of object" type), having to include 10 such list implementations is just annoying. (Of course, if you have a preprocessor, you can let it do some of the work. I tried this in my "gp" program; if interested look at list.inc. In short, it's ugly.)
BTW, do you have a reference for Wirth's quote about templates? (I'd like to know what alternatives he suggests.)
Although I am not sure I see the sense of your basic argument (interest in GPC is falling off, so let's rewrite it).
My interest in the current way of developing GPC, i.e. GCC based, has declined, so the alternatives are to either rewrite GPC without GCC, or to rewrite my code in another language.
I would say it makes equal sense to simply start moving parts of GPC into its own language and away from C. If you were to write a code generator to replace the back end, in Pascal, you would both be independent of GCC and much farther along on your goal of a GPC rewritten in its own language.
That's not actually my main goal, it would be a possible side-benefit to me. My main goal would be to have a maintainable compiler, which for me would be equally well possible if written in C++ or Pascal. Though written in Pascal, it would probably attract more co-developers.
However, the backend is much more than just a code generator. For a start, it's various code generators for different platforms (and I'm not interested in writing a new, non-portable compiler; to me that would be a big step backward, in return for a large effort), including debug info generators (also quite non-trivial), and of course optimizers (and again, I'm not interested in writing a non-optimizing compiler, since it's not an academic exercise for me, but I have actual Pascal programs, some of them performance-intensive, and I wouldn't like them to run much slower). AFAIK, many man-years (or decades) have been spent on gcc's various optimizers, and I think there's no way we can even begin to match that. That was basically my motivation for suggesting to use a high-level language such as C++ as the target, so its compiler could do the code generation and optimization for us.
Besides, if we were to replace the backend, we'd either have to make the new one API compatible, which doesn't help making the frontend more maintainable, see my comments about the TREE_NODEs, or rewrite those parts of the front end that deal with tree nodes (which is basically the same we'd have to rewrite for a C++ target, i.e., everything not listed under "Reusable parts").
Frank