John L. Ries wrote:
In my humble opinion, GPC should continue to be a native Pascal compiler, not a translator. It should be updated to use the current GCC back ends (Waldek has done some of this in the past and I accept his characterization of the process; but I think it's necessary if GPC is to continue to be viable); and the goal should still be to get that aspect of the code completely up to date so that GPC can be added to GCC. And I think it highly important that it be made easy to create binary packages (I can probably devote some time to this, starting with a SlackBuild script, but it will go quicker if more than one person is doing it, especially since for me it would be a learning experience) and do whatever else can be reasonably done to get GPC into people's hands (including making Linux, Windows, and OSX builds available for download from the website), remembering that it appears to have completely disappeared from Linux distros and extant collections of UNIX toys.
I assume that the codebase is maintained in some sort of revision control system, such as Git. If it isn't, then it should be.
Current version is at:
https://github.com/hebisch/gpc
I'll also accept Frank Heckenbach's characterization of the state of the codebase (a mess), as I doubt things have improved since he last posted on the subject. A rewrite is probably in order, but that is best determined by those who have some familiarity with said codebase. Frank suggested that GPC translate to C behind the scenes, but I fail to see what that would buy us, except the elimination of the need to interface directly with the GCC back end, which is apparently a pain (but perhaps that is a good enough reason by itself). In any case, a rewrite would of necessity be a long term task and should probably be done gradually.
I do not think a rewrite is a good idea. In many cases code is messy for a reason: the task it has to do is messy. When planning a rewrite, one looks at the big picture and gets the impression that things can be organized neatly. But the messiness comes from little details that are not visible in the big picture. Rather, the correct approach is constant restructuring.
One problem is that the GCC backend puts considerable constraints on the frontend. So looser coupling between the frontend and the backend could bring better structure. But then we would have to reimplement several parts that we currently take from the backend. For example, the backend constant folder is several thousand lines -- using our own data structures we would have to duplicate it.
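To make the constant folder point concrete, here is a toy sketch in Python of what folding over a frontend's own expression tree looks like. It is not GCC's or GPC's code: the node types (`Const`, `Var`, `BinOp`) and the `fold` function are hypothetical names invented for the illustration, and it handles only three integer operators.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical frontend-owned expression nodes (not GCC tree nodes).
@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, BinOp]

# Only three integer operators; a real folder covers far more cases.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def fold(expr: Expr) -> Expr:
    """Replace operator nodes whose operands are both constants by the result."""
    if isinstance(expr, BinOp):
        left = fold(expr.left)
        right = fold(expr.right)
        if isinstance(left, Const) and isinstance(right, Const) and expr.op in OPS:
            return Const(OPS[expr.op](left.value, right.value))
        # Not fully constant: keep the node, but with folded children.
        return BinOp(expr.op, left, right)
    # Constants and variables are already in their simplest form.
    return expr


if __name__ == "__main__":
    # x + 3 * 4  becomes  x + 12
    print(fold(BinOp("+", Var("x"), BinOp("*", Const(3), Const(4)))))
```

The toy fits in a few dozen lines only because it ignores types, overflow, division, floating point and everything else a production folder must get right for every target, which is presumably where the thousands of lines go.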
I've never written a compiler (though I have worked on interpreters from time to time starting in college), but if I were to write a Pascal front end to GCC from scratch, my plan (until I knew better) would probably be to use bison and flex to specify as much of the syntax (and meaning thereof) as possible, adding such C (or C++) code as is necessary to communicate with the back end and the user.
Yes, syntax is mostly handled by flex and bison: the corresponding source files make up about 7% of the frontend sources.
It would be more elegant to write it in Pascal (as GNAT is written in Ada), but it would require a Pascal compiler to compile it and there aren't many of those in common use anymore (Free Pascal being the most visible one of late). The runtime library, however, would be written completely in Pascal and compiled with GPC, though it would probably call standard C functions to handle low level I/O and memory management (at least);
The runtime is mostly in Pascal, but part of the OS interface is in C.
such would be facilitated by a utility to translate C header files into Borland-style Pascal units (or Extended Pascal modules). Since this would be a front-end to GCC, it would have to interface at least indirectly with the C runtime anyway. I'm *guessing* that most of the compiler maintenance would involve updates to the GCC interface and bug fixes; and that most real development would be on the runtime (which would be written in Pascal), but again, chances are excellent that I have no idea what I'm talking about.
Actually, large parts of the real development were on the compiler proper. One thing is implementing desirable language extensions. First, there are still unimplemented corners of Extended Pascal (mainly set schemas). Much of Object Pascal is done, but some nice features (views) are missing. ATM there is no exception support (I have non-working code -- it handles the syntax, but causes miscompilation in several cases). It is not clear whether overloading is desirable -- ATM it is available for operators, but not for functions and procedures.
Another thing is error checking. First, catching errors requires nontrivial effort. Second, even more effort goes into generating sensible error messages.