Bastiaan Veelo wrote:
We have an extensive code base of Extended Pascal that is in active use commercially, and we will continue development for many years to come. We are using the Prospero compiler, which is still serving our needs. But, as Prospero is Windows-only, gpc has always appeared as the most promising escape route to portability, should we need it. Therefore I am hoping that gpc will find its way into the future.
I have a small contribution to the discussion. Regarding the suggestion of turning gpc into a translator to some other language, I would suggest considering the D programming language as the target language, instead of C++. I have been following the development of D for many years, and have always seen similarities between Extended Pascal and D. Off the top of my head there are nested functions, modules, function argument storage classes, better type safety than C++, dynamic arrays with size information, and fast compilation (especially compared to C++). There may be more similarities. D's template design is also much better than C++'s, as is its approach to const-correctness and its alternative to multiple inheritance. D is designed to be a better language than C++, and I think it is. I think that of modern languages D is most compatible with the way Pascal programmers like to think (pardon the generalization), and although its syntax is C++-inspired, D might be a good candidate to "take over" from Pascal. If gpc did a good job of generating readable D code, Pascal programmers could choose to continue writing Pascal or make the switch to D completely and be happy with it.
Some pointers: http://en.wikipedia.org/wiki/D_%28programming_language%29 http://www.digitalmars.com/d/
As for the main features as listed in the Wikipedia article:
- some are mainly library issues or can be addressed as such, including associative and dynamic arrays (e.g. map and vector in STL) or array slicing (not sure if STL currently provides this, but it could),
- some of the mentioned improvements actually exist in C++ such as inner classes and the previously mentioned STL types,
- some things will be addressed in the planned C++0x standard (http://en.wikipedia.org/wiki/C%2B%2B0x), e.g. foreach-like loops, better metaprogramming including compile-time function execution, and anonymous functions, including nested ones and closures,
- some things I don't necessarily see as advantages for our purposes, e.g. automatic garbage collection (on a (Pascal) language level, I think we should leave the choice to the user), or limited multiple inheritance (though I hardly need the full version myself; if we choose to implement just the limited form, we can do it equally well in C++ and D, since it's a strict subset of the full version; but if we, now or later, decide we want the full form in some object model, C++ has it),
- for some things the description there is too vague, e.g. "reengineered template syntax" (the example doesn't tell me much -- however, if it's only about syntax, it doesn't matter much for a target language anyway) or the inline assembler (g++ has one, I have no idea if D's is better or worse -- most likely it won't be (CPU-)portable (quite hard for assembler code ;-), and apart from that, I see no major drawbacks in the g++ inline assembler (same as that in gcc and gpc), except maybe for syntax (irrelevant, see above); in particular it cooperates well with the optimizer, which is not true of many other inline assemblers I've seen, in particular BP's, even for that tiny bit of optimization that BP has),
- so only a few things are left that are clearly advantages in my view, e.g. true modules and first class arrays. (Though modules are not such a big issue for a compiler-converter either; converting to C++ style "compilation units" and headers is not much more difficult, and the more tricky part -- processing modules in the correct order WRT dependencies -- has to be done in the frontend anyway; this is what "gp" currently does which could partly be reused as well.)
But most importantly of all (IMHO), D doesn't seem very widespread. Maybe this will change. However, since one root of our problems is really that Pascal isn't so widespread (anymore), I'm reluctant to base its renewal on another language whose long-time future is also in doubt AFAICS. Honestly, if I had to bet on the future of D vs. C++0x, I'd take the latter, even if it's still in the standardization process, just because of the mass of existing C++ code and compilers. (And even if C++0x fails, the fallback to the existing C++ standard is soft, whereas a failure of D a few years from now when we have based our new compiler on it would be fatal.)
Of course, for manually (or semi-automatically) converting your Pascal programs to another language if GPC has no future, D might be a good choice.
Frank
On Thu, 2010-07-29 at 08:17 +0200, Frank Heckenbach wrote: [...]
But most importantly of all (IMHO), D doesn't seem very widespread. Maybe this will change.
This also raises an issue WRT cross-platform development. If you wish to compile your GPC code on another system, you would run into a dead end if that system does not have a D compiler. I would imagine that one would be much more likely to find a C++ compiler than a D compiler.
Thanks for your extensive consideration. I agree with your conclusion. Nevertheless:
Frank Heckenbach wrote:
- some things I don't necessarily see as advantages for our purposes, e.g. automatic garbage collection (on a (Pascal) language level, I think we should leave the choice to the user),
D programmers can choose between automatic garbage collection or manual finalization.
or limited multiple inheritance (though I hardly need the full version myself, but if we choose to implement just the limited form, we can do it equally well in C++ and D, since it's a strict subset of the full version; but if we, now or later, decide we want the full form in some object model, C++ has it),
Limited multiple inheritance is seen as a feature rather than a deficiency. It excludes the mess that multiple inheritance can give, while interfaces and mixins provide the flexibility for which you otherwise would want to use multiple inheritance.
Cheers, Bastiaan.
__________ Information from ESET NOD32 Antivirus, version of virus signature database 5322 (20100729) __________
The message was checked by ESET NOD32 Antivirus.
Prof Abimbola Olowofoyeku (The African Chief) wrote:
On Thu, 2010-07-29 at 08:17 +0200, Frank Heckenbach wrote: [...]
But most importantly of all (IMHO), D doesn't seem very widespread. Maybe this will change.
This also raises an issue WRT cross-platform development. If you wish to compile your GPC code on another system, you would run into a dead end if that system does not have a D compiler. I would imagine that one would be much more likely to find a C++ compiler than a D compiler.
True, although DMD is already supported on 4 different platforms, and gdc probably supports even more. But we should not count the latter, as it has to cope with the same problems as gpc.
Bastiaan.
Bastiaan Veelo wrote:
Frank Heckenbach wrote:
- some things I don't necessarily see as advantages for our purposes, e.g. automatic garbage collection (on a (Pascal) language level, I think we should leave the choice to the user),
D programmers can choose between automatic garbage collection or manual finalization.
As I understood it, automatic garbage collection is always active, just that objects that are manually disposed of don't take part in it. Or is there a way to completely disable GC (e.g., for real-time purposes)?
or limited multiple inheritance (though I hardly need the full version myself, but if we choose to implement just the limited form, we can do it equally well in C++ and D, since it's a strict subset of the full version; but if we, now or later, decide we want the full form in some object model, C++ has it),
Limited multiple inheritance is seen as a feature rather than a deficiency. It excludes the mess that multiple inheritance can give, while interfaces and mixins provide the flexibility for which you otherwise would want to use multiple inheritance.
I understand that, and if I'd design an object model, I'd likely go for it as well. However, my point was that as a target language, it is no advantage, even if the source language has such a model, since that can be mapped to both; it's a slight disadvantage in case the source language ever wants the full model. BTW, C++ full multiple inheritance is not such a mess -- there are rules and ways for the programmer to resolve the problematic cases, maybe not always easy, but at least well-defined. I read about it in the FAQ, though I've never needed it myself.
Frank
Frank Heckenbach wrote:
Bastiaan Veelo wrote:
Frank Heckenbach wrote:
- some things I don't necessarily see as advantages for our purposes, e.g. automatic gargabe collection (on a (Pascal) language level, I think we should leave the choice to the user),
D programmers can choose between automatic garbage collection or manual finalization.
As I understood it, automatic garbage collection is always active, just that objects that are manually disposed of don't take part in it. Or is there a way to completely disable GC (e.g., for real-time purposes)?
"It is also possible to disable garbage collection for individual objects, or even for the entire program if more control over memory management is desired." [http://en.wikibooks.org/wiki/D_Programming]
Bastiaan.
Bastiaan Veelo wrote:
Prof Abimbola Olowofoyeku (The African Chief) wrote:
On Thu, 2010-07-29 at 08:17 +0200, Frank Heckenbach wrote: [...]
But most importantly of all (IMHO), D doesn't seem very widespread. Maybe this will change.
This also raises an issue WRT cross-platform development. If you wish to compile your GPC code on another system, you would run into a dead end if that system does not have a D compiler. I would imagine that one would be much more likely to find a C++ compiler than a D compiler.
True, although DMD is already supported on 4 different platforms, and gdc probably supports even more. But we should not count the latter, as it has to cope with the same problems as gpc.
And the former is not free software (only the frontend).
Frank
Hi,
On 7/29/10, Prof Abimbola Olowofoyeku (The African Chief) chief@greatchief.plus.com wrote:
On Thu, 2010-07-29 at 08:17 +0200, Frank Heckenbach wrote: [...]
But most importantly of all (IMHO), D doesn't seem very widespread. Maybe this will change.
This also raises an issue WRT cross-platform development. If you wish to compile your GPC code on another system, you would run into a dead end if that system does not have a D compiler. I would imagine that one would be much more likely to find a C++ compiler than a D compiler.
Right. DJGPP has no D compiler (LDC? GDC? I don't know what it's called. Maybe there's two, I forget.) but does have C++ (now G++ 4.4.4). This has allowed several big projects to (at one time) be ported to DOS: p7zip, Dungeon Crawl: Stone Soup, paq8o8. And DOS is a weak platform (esp. in developer and user support). So if C++ is acceptable even there, that should give you hope! And OpenWatcom has improved C++ a lot too lately. ;-)
And speaking of D, a Befunge-98 interpreter (CCBI, by Mycology author) was written in it, but he only provides compiles of Linux and (grudgingly) Win32 and lacks something on one port due to a compiler bug still not fixed after several years. (Ah, no Unefunge or Trefunge support on Win32, boo freakin' hoo, heh.) http://users.tkk.fi/~mniemenm/befunge/ccbi.html
So I can't run (even under HX) the Win32 compile in DOS, but I can compile FBBI (C99) even though it's not technically as good (but original/"official", heh). And note that this is the only D app I knowingly use (and only barely, esp. since I have not bothered writing much B98 code yet).
Frank, I would be curious what you think of Ada or Modula-2, esp. since both of those have (official or unofficial) GCC support. And there are already (weak) converters from Pascal to Ada, Oberon, Modula-2, etc. (But no GNU Oberon, only OO2C.)
Rugxulo wrote:
Frank, I would be curious what you think of Ada or Modula-2, esp. since both of those have (official or unofficial) GCC support. And there are already (weak) converters from Pascal to Ada, Oberon, Modula-2, etc. (But no GNU Oberon, only OO2C.)
Since you ask me specifically: I've never programmed in any of those languages myself (outside of University), and I haven't followed their development in recent years.
Since most of these languages are not widely used, I wouldn't consider them attractive targets for a converter. Ada may be an exception, but it's still less widely used than C++, and at least I know C++ much better than Ada -- though if there's a number of Ada programmers here (whom I don't know yet :-) it might make it interesting to consider as a target.
Weak converters are IMHO not of much use, since turning them into full converters is probably as difficult, if not more so, than writing one from scratch. As the saying goes, the first 90% take the first 90% of the time, the remaining 10% take the other 90%. In this case it's even worse, since e.g. parsing 90% of a language or implementing 90% of its semantics can be drastically easier than (almost) 100%, if the problematic cases are simply left out (or mishandled). (Furthermore, I suspect they each support only one Pascal dialect, in contrast to GPC.)
As an example, the GPC frontend was originally derived from the C frontend (note, I'm not talking about the backend issues here, but the actual frontend responsible for handling Pascal). Initially, this provided some quick success, as many Pascal features map more or less roughly to C features (e.g. arrays to arrays, [variant] records to structs/unions, for/while/repeat loops to for/while/do loops, Integer to int, mod to %, procedures/functions to functions, etc.).

But the devil is in the details, and that's a damn big devil here (e.g., C arrays are 0-based, Pascal arrays don't have to be; C arrays convert to pointers automatically, Pascal arrays don't; variant records have an invariant part (unlike unions) and special semantics (though many Pascal compilers, including BP, ignore them); Pascal for-loops evaluate their bounds once before execution; mod behaves differently in Pascal and C for negative arguments; function parameters in (esp. K&R) C are not checked; etc. etc.). Each of those points (and many more) has caused us much work in the frontend later on in order to fix them, and for that it didn't help, but actually hurt, that the frontend handled them somehow, but wrong (i.e., in the C way), and without any warning or indication where to look for errors. E.g., simply calling a routine with a large number for an integer parameter would give no compile-time error, but wreak havoc at runtime with early GPC versions (until about 1997), because it would be passed as "LongInt" or so, the routine expected "Integer", no checks were done, and as a result, all "following" parameters (i.e., left of, and depending on endianness and stack direction, possibly including, the actual one) were read completely wrong in the routine.

Also, for a long time the frontend carried a lot of ballast necessary in C, but not used in Pascal, until I spent quite some time cleaning (most of) it up. In retrospect, I suppose writing the frontend from scratch would have been less work in total.
Frank
Frank Heckenbach wrote:
some time cleaning (most of) it up. In retrospect, I suppose writing the frontend from scratch would have been less work in total.
Wouldn't this apply also to converting to C++ rather than to e.g. LLVM assembly ? C++ does seem attractive with all that's already there -- but you know beforehand that you will need tricks -- and you also know that tricks in the end "corrupt" software.
Regards,
Adriaan van Os
Hi,
On 7/30/10, Frank Heckenbach ih8mj@fjf.gnu.de wrote:
Rugxulo wrote:
Frank, I would be curious what you think of Ada or Modula-2, esp. since both of those have (official or unofficial) GCC support. And there are already (weak) converters from Pascal to Ada, Oberon, Modula-2, etc. (But no GNU Oberon, only OO2C.)
Since you ask me specifically: I've never programmed in any of those languages myself (outside of University), and I haven't followed their development in recent years.
Well, I only ask you because you brought all this up! ;-) Note that I don't know any of those languages myself, but they are mostly Wirth-ian like Pascal.
Since most of these languages are not widely used, I wouldn't consider them attractive targets for a converter.
Maybe not individually, but taken as a whole, all the Wirth-ian languages and their users can add up. And it seems many users use (or have used) more than one (except me, so far).
Ada may be an exception, but it's still less widely used than C++, and at least I know C++ much better than Ada -- though if there's a number of Ada programmers here (whom I don't know yet :-) it might make it interesting to consider as a target.
Well, Ada has been folded into the GCC tree since 3.2 or so. Even DJGPP supports it. No, it's not got huge support, but people do indeed use it. Honestly, I'm surprised (but glad) GCC still supports it. Perhaps they have lots of DoD volunteers, I dunno. ;-) Originally the DoD paid a lot to have it written, so I guess that didn't hurt either.
Note that GNU Modula-2 (EDIT: I've never tried) partially requires C++ (I think) to build due to exceptions. They are allegedly very mature now but development does focus on Linux (surprise surprise) with some lesser testing on Cygwin and Mac OS X.
I'm not sure what to think, I have mixed feelings. I'm glad for them, but I wonder why all these efforts have to take place separately and not cooperate more. Wirth's children should stick together. ;-)
Weak converters are IMHO of not much use, since turning them into full converters is probably as difficult, if not more so, than writing one from scratch. (Furthermore, I suspect they each support only one Pascal dialect, in contrast to GPC.)
Correct, and e.g. P2Ada didn't work for my simple program. I'm not saying a rewrite is bad (or even feasible), just saying that some effort had already been done. Maybe there's some reusable code or good ideas at least, I dunno. But this seems more of something for you (expert) to decide than me (noob). ;-)
As an example, the GPC frontend was originally derived from the C frontend (note, I'm not talking about the backend issues here, but the actual frontend responsible for handling Pascal). Initially, this provided some success quickly, as many Pascal features map more or less roughly to C features (e.g. arrays to arrays, [variant] records to structs/unions, for/while/repeat loops to for/while/do loops, Integer to int, mod to %, procedures/functions to functions etc.). But the devil is in the details, and that's a damn big devil here
I'm not knocking your decision, it worked quite well! GPC is a nice product, nicer than most software. I don't personally believe software is throwaway, so I do think it should be reused or preserved (if possible). However, I do understand hindsight. Even RMS complains that choosing Mach for Hurd was meant to save them time but probably hurt in the long run. Yes, ironically, sometimes it is easier to rewrite from scratch. But you need experience, and that usually only comes from modifying a pre-existing project.
On Friday, July 30, 2010 at 8:17, Frank Heckenbach wrote:
Since most of these languages are not widely used, I wouldn't consider them attractive targets for a converter. Ada may be an exception, but it's still less widely used than C++, and at least I know C++ much better than Ada....
This raises a point of interest to me. When GPC development stalled, I had to decide whether to continue with GPC as it was, switch to another Pascal compiler, or switch languages. I have programmed in Pascal for thirty years; it was my fifth language, after BASIC, Algol 60, FORTRAN IV, and SNOBOL, and of those, it is the only one in which I have continued to develop new programs. This was due primarily to the existence of GPC, and especially its good support for ISO 10206.
For a number of reasons, I elected to switch to Ada, as implemented by the GNU Ada compiler (GNAT). I had studied the language shortly after it was introduced in 1983 but did not use it seriously until a GCC version became available. I found the transition from Pascal to Ada to be quite easy, because many of the implementation concepts are similar. For several years, I've done all of my new work in Ada instead of Pascal.
I would think that translating from Pascal to Ada would have some advantages over C++. For example, Ada inherently supports scalar subrange types and range checking. I'm not that familiar with C++, but I don't believe that subranges are supported. So a Pascal to C++ translator would have to emit explicit code for range checks, whereas a Pascal to Ada translator could emit an Ada subrange type and let the Ada compiler handle range checking. The same consideration would pertain to array index checking and NIL-pointer dereferencing.
Other examples would be nested procedures, packed arrays and records, and string slice access, all of which are inherently supported by Ada. In short, I would expect that the closer mapping of Pascal to Ada constructs would result in a simpler translator than if the target language differed more significantly from Pascal. In many cases, the emitted Ada code could be nearly identical to the Pascal code.
GNAT is well supported by AdaCore Technologies, whose large customers (e.g., Boeing, Lockheed) drive compiler improvements and bug fixes back into the GCC code base. So the compiler is in good shape, and it maintains compatibility with the current GCC release. (One metric that I considered when switching from GPC to GNAT was that a packed array access that generated about 175 instructions in GPC generated only 15 for the equivalent code in GNAT.) GNAT is also well-supported by GDB.
Finally, I would think that a user base of Pascal programmers could more easily help to examine the Ada output of the translator for correctness than if the output were in C++.
On the other hand, as you say, if you know C++ much better than Ada, then you're likely to be more productive initially if the translator targets C++. Whether that would be offset by the additional work required, as noted above, is unknown.
Another point is that GNAT is pretty much the only freely-available Ada compiler, although it runs on a large variety of platforms. There appear to be several C++ compilers available, so perhaps targeting C++ would allow broader host-platform coverage.
Overall, I've been happy with the switch. I've used GPC for about 12 years, both for in-house programming and for embedded-microprocessor (M68K) work, and it's been very useful for both. Regardless of GPC's future, I appreciate the work that you, Waldek, Jan-Jaap, Peter, and Jukka have done, as well as that of the rest of the GPC contributors.
-- Dave
J. David Bryan wrote:
This raises a point of interest to me. When GPC development stalled, I had to decide whether to continue with GPC as it was, switch to another Pascal compiler, or switch languages. I have programmed in Pascal for thirty years; it was my fifth language, after BASIC, Algol 60, FORTRAN IV, and SNOBOL, and of those, it is the only one in which I have continued to develop new programs. This was due primarily to the existence of GPC, and especially its good support for ISO 10206.
For a number of reasons, I elected to switch to Ada, as implemented by the GNU Ada compiler (GNAT). I had studied the language shortly after it was introduced in 1983 but did not use it seriously until a GCC version became available. I found the transition from Pascal to Ada to be quite easy, because many of the implementation concepts are similar. For several years, I've done all of my new work in Ada instead of Pascal.
I would think that translating from Pascal to Ada would have some advantages over C++. For example, Ada inherently supports scalar subrange types and range checking. I'm not that familiar with C++, but I don't believe that subranges are supported.
No, not directly. However, range-checks are a rather minor issue -- for comparison, it took me a few days to implement them in GPC, while the object models took weeks or months (I'm not sure exactly how much work Peter and Waldek spent on them), exceptions implemented from scratch would take weeks, and templates probably much longer. So it's these "big things" that matter most. You might think checked subranges are more important because they're more fundamental (which they are), but they're just not so difficult to implement. For the 3 big things I mentioned, I know C++ has them already. I don't know if Ada does; perhaps you can give us some information here, e.g., is its object model comparable to those GPC supports or that of C++; does Ada support exception handling and how; does it support something like templates (IOW, how can one implement, say, a generic list, applicable to any given type, with strong type-checking).
Other examples would be nested procedures, packed arrays and records, and string slice access, all of which are inherently supported by Ada.
Nested routines are the only serious lack in C++ I see so far. Packed arrays/records and array slices can be implemented in C++ code with moderate effort (much easier than the current TREE_NODE implementation in GPC, which is messy and inefficient, as you note below).
In short, I would expect that the closer mapping of Pascal to Ada constructs would result in a simpler translator than if the target language differed more significantly from Pascal. In many cases, the emitted Ada code could be nearly identical to the Pascal code.
I fear this might only be true for more traditional Pascal, while I stated that I'm particularly interested in more "modern" features that I know from C++. Of course, if Ada also supports them, it would make it more interesting.
GNAT is well supported by AdaCore Technologies, whose large customers (e.g., Boeing, Lockheed) drive compiler improvements and bug fixes back into the GCC code base. So the compiler is in good shape, and it maintains compatibility with the current GCC release. (One metric that I considered when switching from GPC to GNAT was that a packed array access that generated about 175 instructions in GPC generated only 15 for the equivalent code in GNAT.) GNAT is also well-supported by GDB.
I don't doubt any of that, and for manually porting your code, Ada was probably the better choice than C++ for you, and might be for many other users.
Kevan Hashemi wrote:
I agree that a Pascal to C++ translator would be just fine. But on the other hand, that means that we need people who enjoy programming in both Pascal and C++ to support the project. Anyone who fits that description is not going to have much use for the product, because they might as well program in C++ in the first place.
That may be the case. So far it looks I might be almost the only one, and in this case it's probably indeed easiest if I just port my programs to C++.
The people who are dedicated to GPC are people who greatly prefer to program in Pascal. Most such people dislike programming in C++.
I'd speculate that most of them actually don't know C++ too well (neither did I until a few years ago). C has a bad name here, partly for good reasons, and partly for historical competition. C++ seems to inherit this bad name mostly because of its similar name, indeed. Yes, it contains all of C's misfeatures, but as I said before, so does GPC in the form of "BP extensions". Let's do a quick check (note: not all of the listed points I consider pure misfeatures, some of them have their use, though I think they're often overused, in both languages, when better alternatives are available):
- Untyped pointers: check
- Untyped memory operations (e.g., memory copy, fill): check
- Untyped files: check
- Arbitrary type casts: check
- Hidden type casts by abusing unions (variant records): check
(Plus hidden type casts by "absolute" variables, a BP misfeature that even C doesn't have.)
- Zero-based, manually memory-managed strings with clumsy support functions: check ("PChar" and the "Strings" unit)
- Mixup between arrays and pointers: check (if only for "PChar")
- Always-0-based arrays: check ("open array" parameters)
- Unchecked array ranges: check (by turning off range-checking, which BP allows locally, and which is sometimes used intentionally for "dirty stuff")
- goto: check (even in standard Pascal)
So, yes, one can write dirty C-style programs in C++, but so can one do in BP and therefore GPC. However, high-level C++ looks quite different. E.g., there's a string type (which is internally a template in C++, while it's a compiler built-in in GPC, a difference mostly invisible to the programmer) with similar string operations as in Pascal (from "+" to "substr" to various find operations), range-safe and with dynamic memory management (so you can have strings of unbounded length, which GPC can't do yet -- at best you can have "^String", but then you have to manage its allocation yourself).
The C++ object model is a little higher-level than the various Pascal models (I explained the automatic con-/destructors which are easier to use and less error-prone than Pascal's explicit calls).
And of course, a simple list (using the STL template) is no comparison to a hand-made list in Pascal (or a general list abusing type-casts or other dirty tricks).
The only major drawback I see is the lack of modules.
Of course, the syntax is quite different from Pascal (and quite similar to C), but for me that's just not an issue -- it just looks different, and I can't even say GPC's syntax is much cleaner in total (with all the BP mess mixed in).
So who is going to write this translator? Not me. Is Frank going to write it on his own?
I'd like to, but someone would have to pay my bills for the next several years. ;-)
It may be ten times as much work to re-write the compiler in Pascal, but we may find that we have a hundred times as much developer time available.
One way to proceed is for Frank to estimate how many developer hours are required for a Pascal compiler in Pascal, and for a Pascal to C++ translator, then we poll the list to see how many hours people are prepared to dedicate to each project. If only one of them gets enough hours, then we have only one practical solution.
What exactly do you mean by "Pascal compiler in Pascal"? What would its output be?
John L. Ries wrote:
The more I think about it, the clearer it is to me that the Pascal to C++ translator is the option discussed that I like the least. I actually would prefer a Pascal to Ada translator more than Pascal to C++, because if I had to abandon Pascal, I think I'd be much more comfortable programming in Ada than in C/C++
That's not quite what I was talking about. I meant a converter as a compiler replacement, i.e. you would write your source in Pascal, and to build it, convert to C++ and compile it (so C++ is just a mostly invisible intermediate step, like assembler code is now).
You seem to mean a permanent conversion to another language, so you'd maintain your code in the new language afterwards. For this purpose, Ada or Modula-2 might be more suitable for many Pascal programmers. (For me it would probably still be C++, if only because I know it better now, and such code could integrate better with my other recently-written C++ code. But for this purpose, I wouldn't write a complete converter -- it's not worth the effort -- maybe a small tool for the most boring parts of the conversion, perhaps just a sed script for some obvious syntax changes.)
Frank
Rugxulo wrote:
Since most of these languages are not widely used, I wouldn't consider them attractive targets for a converter.
Maybe not individually, but taken as a whole, all the Wirth-ian languages and their users can add up. And it seems many users use (or have used) more than one (except me, so far).
I don't understand your point here. A converter would target one language, not all of them combined somehow. So we'd have to decide on one of them and depend on its future. (BTW, Eike's suggestion of writing a collection of converters for different languages is interesting in a theoretical sense, but practically, I don't see how it would help, rather than cause even more work.)
I'm not knocking your decision, it worked quite well! GPC is a nice product, nicer than most software. I don't personally believe software is throwaway, so I do think it should be reused or preserved (if possible).
Neither do I, and in fact I have programs that are decades old that I've ported between several languages and target systems (and might now have to port once more if GPC actually dies).
But the fact is that GPC needs a serious amount of maintenance right now and in the foreseeable future, and this work isn't going to do itself (until we develop a good AI ;-).
However, I do understand hindsight. Even RMS complains that choosing Mach for Hurd was meant to save them time but probably hurt in the long run. Yes, ironically, sometimes it is easier to rewrite from scratch. But you need experience, and that usually only comes from modifying a pre-existing project.
Yes, that's my point. Again, I don't blame Jukka, neither for choosing the gcc backend, nor for starting from the C frontend, though in hindsight both decisions are now causing us problems, and these problems are real.
Kevan Hashemi wrote:
Dear Frank,
Yes, it's possible, but it's IMHO not comfortable.
Okay, I think I understand the extent of the problems with re-writing the GPC C-code in Pascal, and I appreciate that there may not be enough development hours available.
I am going to make sure that I have a multi-platform Pascal compiler, even if I have to re-write the entire thing myself. I agree with Pascal Viandier: GPC is by far the best Pascal compiler I have ever used, and I have no intention of taking a step backwards to C++ or another Pascal compiler.
If we create a self-compiling compiler, do we need our own assembler written in Pascal as well? Or can we produce GCC-style objects and use any GCC linker and assembler to produce our executable?
I think one could produce GCC-style assembler and use the GNU assembler and linker. But I'd strongly advise against that. Assembler code is, of course, CPU dependent, so you'd need a code generator for each target. What's more, the assembler doesn't optimize, so you'd need to do all the optimization yourself (which is a huge task, probably larger than the compiler itself, if you want good optimization by modern standards). So if you're going to output assembler code, I'd suggest LLVM (though I don't know it myself; AIUI it's machine-independent, and LLVM tools can do optimization on it). Though I still think a high-level language is a better target (e.g., C++, because it already provides objects, exceptions and templates, which you'd otherwise have to reimplement, which is extra work and probably wouldn't be compatible).
What is it in the GCC interface with Pascal that keeps changing?
The interface between the backend and the frontend. Not so much the changes in function parameters etc. (these are easy because the compiler catches them), but e.g. the expected data structures changed drastically, internal systems like memory management changed which both required larger changes in GPC, and the behaviour of the backend changed in subtle ways which is probably the reason for some of the non-trivial issues I've had with gcc-4 based GPC with my code. (Perhaps the backend just did more aggressive optimizations which the GPC frontend isn't prepared for, resulting in miscompiled code.)
Is it the object format?
The backend and frontend are linked together in the compiler executable (gpc1, called by gpc). No files are exchanged between them, but data structures in memory (which, of course, allow more flexibility and therefore more incompatible changes, as we've seen).
The backend outputs assembler files, which the assembler translates to object files from which the linker produces executables and libraries.
You give one example to do with restricted language flags in your report. Perhaps you could give us some more examples.
If you're interested, I suggest you read the "Internals" chapter in GPC's documentation. However, I'm not sure these kinds of things are important WRT a rewrite since they would all be replaced by clean data structures, e.g., an abstract "declaration" object class, with derived classes for variable, routine etc. declarations, each having exactly those fields that it needs, with no need for general-purpose "language flags", or for runtime type checks which GCC/GPC often does, e.g. to ensure that a tree node is in fact, say, a type declaration when it should be one (whereas a rewritten compiler would, of course, just take a formal parameter of type "TTypeDeclaration" or whatever, thus ensuring type-correctness at compiler-build-time).
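The kind of "clean data structures" described above might be sketched as follows. This is only an illustration of the idea, not actual GPC code; the class names (following the "TTypeDeclaration" naming mentioned in the text) and fields are hypothetical:

```cpp
#include <string>

// Sketch: each declaration kind carries exactly the fields it needs;
// no general-purpose "language flags", no runtime tree-code checks.
class TDeclaration {
public:
    explicit TDeclaration(std::string name) : name_(std::move(name))
    {
    }
    virtual ~TDeclaration() = default;
    const std::string& Name() const { return name_; }
private:
    std::string name_;
};

class TTypeDeclaration : public TDeclaration {
public:
    using TDeclaration::TDeclaration;
    // fields describing the denoted type would go here
};

class TVariableDeclaration : public TDeclaration {
public:
    TVariableDeclaration(std::string name, const TTypeDeclaration* type)
        : TDeclaration(std::move(name)), type_(type)
    {
    }
    const TTypeDeclaration* Type() const { return type_; }
private:
    const TTypeDeclaration* type_;  // a variable always has a type
};

// A routine that needs a type declaration takes TTypeDeclaration
// directly; passing any other declaration kind is rejected at
// compiler-build time, with no runtime check needed.
std::string DescribeType(const TTypeDeclaration& type)
{
    return "type " + type.Name();
}
```

This is what "ensuring type-correctness at compiler-build-time" means in practice: the mistake that GCC's runtime tree checks catch at compiler run time simply fails to compile here.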
Rugxulo wrote:
... so much effort is required which there's no one there to do. :-(
In other words, bugs that you yourself can't fix? I don't have any huge complaints with GCC 3.4.4,
Me neither so far, but I do with gcc-4.x based GPC.
that's why I'm wondering why it deathly matters to have latest and greatest GCC support (besides the obligatory "it would be nice", e.g. Atom support or plugins in 4.5.0).
Some platforms (in particular Mac) regularly require newer backend versions because their support in older ones was too buggy or nonexistent.
Bugs found in gcc-3.x won't be fixed by the GCC developers, so we'd have to maintain that backend ourselves in the long run (which I certainly wouldn't want to). Also, of course, newer developments (such as better optimization) in gcc-4 wouldn't be available to us then.
big runtimes and thus big .EXEs (so?),
Not a problem for most cases, sometimes a problem for embedded systems etc. and for people with a 1980s mindset ("oh shock, a few megabytes executable"). ;-)
It's more of when you write a very simple tool that takes 300 KiB. It just feels wrong, esp. from an "assembly" mindset or if you're spoiled by smartlinkers in other Pascal compilers. I'll admit, Frank, to be frank, ;-) in real life it doesn't matter in 99% of cases, but I do think libgpc.a could be modularized a bit better. (No, I haven't looked closely yet, my bad! And BTW, I strongly suspect GNU ld's --gc-sections doesn't work with COFF, blech.) I mean, no offense, but when a combined DOS 8086 (TP55) + Win32 .EXE (VP21) takes 24 KiB uncompressed .... ;-)
As I said. Sure, one could do something about it (smart linking in general), but, well, someone would have to do it. Like with many other issues -- I don't really lack ideas what could be improved in GPC in many ways, what we're lacking is developers who will actually do it.
I don't want to get too philosophical, but I think a reason for its decline was that in several ways it just was too strict. E.g., while I dislike "goto" as much as Dijkstra did, I'm not so much opposed to "Exit" (which some dismiss as a disguised goto, and ISO Pascal doesn't have).
"Exit" as in "break" out of loop? I don't even think TP had it until v6 or (more likely) v7. Also note that some languages (Oberon, Java) don't support goto at all !! And FPC only handles local gotos.
Yes. AFAIR, TP had it earlier. As for non-local gotos, I use them in very few places in my code, and these cases would be better served with exceptions (e.g., a la C++).
(In fact, it might have been better to leave I/O completely out of the language, like C did, and let library authors develop it
Didn't Modula-2 do that, much to many people's chagrin??
I don't know about Modula-2 (why didn't Modula-2 programmers just write a "standard" I/O library?), but I can hardly imagine the situation is worse than what we have in Pascal now -- two mostly incompatible models (standard and BP), both not powerful enough for many things.
Same with the lack of an official, and therefore standardized way, to interface with foreign-language libraries, a common necessity in real-world programs. (Sure, you can reinvent every wheel, i.e. reimplement every library in Pascal, but that's not productive.)
Ada has Interface (or whatever it's called). Doesn't matter anyways as there are so many competing formats (ELF, COFF, Mach-o) and linkers that work in varying degrees and different ABIs. (Agner's ObjConv potentially helps here, but I've never heavily used it.)
If it depends on the object format, it's not standardized IMHO. I mean something like GPC's "external name", together with guaranteed C-compatible types (as a minimum, so any C function declaration can be translated manually), or of course, as a maximum, a fully-automatic way to get foreign-language declarations (such as C++'s 'extern "C"', which is, of course, easy since C++ almost contains C as a subset).
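For readers unfamiliar with it, the 'extern "C"' mechanism mentioned above can be sketched in a few lines. The function below is a made-up example; the point is that C linkage suppresses C++ name mangling, so the symbol can be referenced from C code or from a Pascal compiler emitting an "external name" declaration:

```cpp
#include <cstddef>

// 'extern "C"' gives this function C linkage: no C++ name mangling,
// and only C-compatible types in the signature, so a foreign-language
// declaration (e.g. GPC's "external name 'c_library_sum'") can bind
// to it directly. The function itself is a hypothetical example.
extern "C" int c_library_sum(const int* data, std::size_t count)
{
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}
```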
So that's perhaps why BP was popular under Dos, because it was one fixed dialect (so diverging extensions, though massively present, and their long-time consequences, were not known to the majority of programmers),
Except it also extended itself several times! So code that works for TP55 (objects, units) won't work in TP3, nor code in TP6 (inline asm).
But it was backward-compatible between its versions.
Plus bugs and heavy 16-bitisms.
That's what I meant by "were not known to the majority of programmers". The most famous bug (Delay runtime error 200) occurred on faster CPUs that didn't become popular until BP was mostly abandoned, and Dos programmers didn't care about 16-bitisms and worse (direct access to fixed memory addresses of Dos etc.). Portability was basically unheard of.
Doesn't mean lots of good stuff wasn't written in TP/BP (e.g. Chasm: The Rift), but most of that old code is pretty unmaintainable without the exact same compiler version (ahem, TPU incompatibilities).
I won't blame Borland for changes in binary intermediate files (TPU). GPC changed the GPI format several times as well, and as a free software supporter, I care more about source compatibility.
I/O was extended to be at least suitable for Dos (though it maps less well to other systems which were ignored by most Dos programmers)
Since TP didn't run on anything else (CP/M dropped after v3 and only two TPW releases), that's no surprise, esp. since they never fully supported ISO 7185 or 10206. It's hard to be portable when you ignore standards. And yet most compilers nowadays (even on "modern" OSes, heh) try to emulate BP-style, oddly enough.
You echo my point. Back in the Dos days, it was an advantage for BP because BP programmers obviously didn't need anything but Dos I/O (and generally didn't care much what existed elsewhere). Nowadays, it's a liability.
and supported modules (units) which allowed some other needed facilities (such as CRT) to be supplied. But all of this was too short-sighted:
Everything in computers is short-sighted. We're constantly being bit by it.
But some things more than others ...
We now have a mess of dialects;
No worse than all those silly Romance languages. ;-)
How good that English has only one, universally accepted dialect. :-)
Dos-style I/O is too limited on modern systems; even CRT (one of the least badly designed BP units IMHO) wasn't as lasting as its rough C counterpart, curses.
C didn't have curses built-in anyways.
I didn't mean to imply this, neither is CRT built-in in the BP language.
In fact, it left a lot out, hence POSIX (which I guess has its own dialects, e.g. 2008). Nobody bothers with pure ANSI C anymore (sadly), which is more painful when using non-GCC compilers (like OpenWatcom).
I don't know about everybody, but I try to use only standard C/C++ features. Of course, I compile with gcc only, but I enable a lot of warnings, and fix what they complain about. Apart from that, assuming you're talking about free software projects, if you want them to work with other compilers and they don't (without good reason), do something about it. That's the point of free software -- find out what doesn't compile, change it, and send a patch to the developers. Just complaining "I want this and that" won't help. (As a gross example, consider that Linus Torvalds, in the early days of Linux, claimed it would never run on anything but x86 and never be modular. Today it is modular and runs on many platforms. How did this happen? Not because people complained to him and pleaded with him to do it, but because some who actually had the need wrote the code. After it was written and in good quality it was integrated, and now it's part of the official kernel.)
Heck, I know I'm on unsympathetic ears here, but I often think even Linux is too much of a moving target sometimes.
Depends on what features you use. Most of my GPC programs compile unchanged on an old libc5(!) based system (from 1996) and modern systems, including graphics (X11 and most of OpenGL -- though I did the latter only in C++). Of course, GPC's autoconf for the run time system covers some small differences, but there have been relatively few incompatible changes in this regard (e.g. pty handling -- and I even get both forms of it working in parallel with some manual setup). I think the bigger issues are things like directory structure (file hierarchy standard), but those are more issues for sysadmins (including personal users) than for programmers.
the current GPC users seem to be rather diverse (WRT dialects, platforms and features used etc.), which also doesn't bode well for a new project.
Like I said, your favorite dialect seems to be ISO 10206. (I saw you praising it in some old mail in the list archive.)
Partly. It still carries the limited I/O system and doesn't provide low-level features (I'm not saying you should use them often, but sometimes you have to), or external library interfaces. So my favourite dialect is actually some subset of the GPC dialect ...
So you should probably just use that as a testbed for your C++ idea.
Well, no. As I said, I feel the lack of features like templates and automatic destructors, so if I was to work on something new, it should include those features.
Frank
Adriaan van Os wrote:
Frank Heckenbach wrote:
some time cleaning (most of) it up. In retrospect, I suppose writing the frontend from scratch would have been less work in total.
Wouldn't this apply also to converting to C++ rather than to e.g. LLVM assembly ? C++ does seem attractive with all that's already there -- but you know beforehand that you will need tricks -- and you also know that tricks in the end "corrupt" software.
My point was about "tricks" (I'd rather say kludges) in the compiler/converter itself. Tricks in the output code, while surely not nice, are IMHO less problematic, since the converter code that outputs them can still be written cleanly (and changed when the tricks become unnecessary).
So far, the only trick I know we'd need is for the trampoline case (and AFAIK, C++0x would eliminate the need for a trick here).
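To illustrate why C++0x removes the need for the trampoline trick: a Pascal nested function that reads its enclosing routine's locals would classically need a runtime-generated stub to be passed as a callback, whereas a capturing lambda expresses the same thing directly. A minimal sketch (the function names are made up for illustration):

```cpp
#include <algorithm>
#include <vector>

// Pascal's nested functions can access the enclosing routine's locals;
// passing such a function as a callback classically needs a "trampoline"
// (a runtime-generated stub carrying the environment pointer). A C++0x/
// C++11 lambda capturing by reference needs no trick: the closure object
// itself carries the environment.
int CountAbove(const std::vector<int>& values, int threshold)
{
    // 'threshold' plays the role of the enclosing routine's local
    // variable that the "nested function" (the lambda) reads.
    return static_cast<int>(
        std::count_if(values.begin(), values.end(),
                      [&threshold](int v) { return v > threshold; }));
}
```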
Frank
On Saturday, July 31, 2010 at 6:01, Frank Heckenbach wrote:
For the 3 big things I mentioned, I know C++ has them already. I don't know if Ada does, perhaps you can give us some information here, e.g., is its object model comparable to those GPC supports or that of C++;
I am not familiar with objects in GPC, and only tangentially familiar with C++ (i.e., have not done any serious programming in it). However, the short answer is yes; objects are called "tagged types" in Ada.
There is a guide that discusses comparable Ada and C++ features here:
http://www.adahome.com/Ammo/Cplpl2Ada.html
or more nicely formatted here:
http://home.agh.edu.pl/~jpi/download/ada/guide-c2ada.pdf
It has a section devoted to the Ada object model:
http://www.adahome.com/Ammo/Cplpl2Ada.html#3
(Note that this guide discusses Ada 95, which superseded the original Ada 83, and has itself been superseded by Ada 2005. As such, it is a bit out of date, although language revisions have been kept backward-compatible. Ada undergoes periodic improvements; the next language revision is scheduled for 2012.)
does Ada support exception handling, and how;
Yes; see:
http://www.adahome.com/Ammo/Cplpl2Ada.html#1.2.7
does it support something like templates (IOW, how can one implement, say, a generic list, applicable to any given type, with strong type-checking).
Yes; templates are called "generics" in Ada. See:
http://www.adahome.com/Ammo/Cplpl2Ada.html#4
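For concreteness, here is the C++ template side of the comparison Frank asked for: a generic list applicable to any given type, with the type checking done entirely at compile time (a List<int> cannot receive a string). Ada generics achieve the same, but require explicit instantiation. This is a minimal sketch, not production code:

```cpp
// A generic singly-linked list (LIFO), type-checked at compile time.
template <typename T>
class List {
public:
    List() = default;
    ~List()
    {
        while (head_) {
            Node* n = head_;
            head_ = n->next;
            delete n;
        }
    }
    List(const List&) = delete;             // keep the sketch simple:
    List& operator=(const List&) = delete;  // no copying

    void Push(const T& value) { head_ = new Node{value, head_}; }
    bool Empty() const { return head_ == nullptr; }
    T Pop()  // precondition: !Empty()
    {
        Node* n = head_;
        T value = n->value;
        head_ = n->next;
        delete n;
        return value;
    }

private:
    struct Node {
        T value;
        Node* next;
    };
    Node* head_ = nullptr;
};
```

Instantiating `List<int>` or `List<std::string>` happens automatically at the point of use; the equivalent Ada generic package must be instantiated explicitly with a `new` declaration.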
Packed arrays/records and array slices can be implemented in C++ code with moderate effort....
Indeed, any language may be used as the intermediate for GPC. However, I would expect that the effort required to write a translator would vary, depending on the closeness of the mapping between GPC and that language. My impression is that C++ doesn't map particularly well to Pascal's type structure (subranges, sets, packed structures), and while that may be overcome with moderate effort, as you say, that effort may be less with a different choice of intermediate language. Granted, though, that this mapping is just one of several variables influencing that choice.
I fear this might only be true for more traditional Pascal, while I stated that I'm particularly interested in more "modern" features that I know from C++. Of course, if Ada also supports them, it would make it more interesting.
"Modern" features, such as templates and exception handling, have been present in Ada from the beginning (1983). :-)
-- Dave
J. David Bryan wrote:
On Saturday, July 31, 2010 at 6:01, Frank Heckenbach wrote:
For the 3 big things I mentioned, I know C++ has them already. I don't know if Ada does, perhaps you can give us some information here, e.g., is its object model comparable to those GPC supports or that of C++;
I am not familiar with objects in GPC, and only tangentially familiar with C++ (i.e., have not done any serious programming in it). However, the short answer is yes; objects are called "tagged types" in Ada.
There is a guide that discusses comparable Ada and C++ features here:
http://www.adahome.com/Ammo/Cplpl2Ada.html
or more nicely formatted here:
http://home.agh.edu.pl/~jpi/download/ada/guide-c2ada.pdf
It has a section devoted to the Ada object model:
http://www.adahome.com/Ammo/Cplpl2Ada.html#3
(Note that this guide discusses Ada 95, which superseded the original Ada 83, and has itself been superseded by Ada 2005. As such, it is a bit out of date, although language revisions have been kept backward-compatible. Ada undergoes periodic improvements; the next language revision is scheduled for 2012.)
does Ada support exception handling, and how;
Yes; see:
http://www.adahome.com/Ammo/Cplpl2Ada.html#1.2.7
does it support something like templates (IOW, how can one implement, say, a generic list, applicable to any given type, with strong type-checking).
Yes; templates are called "generics" in Ada. See:
OK, looks quite good actually. (Though their example and comparison WRT exceptions is misleading -- they emulate Ada's name-based selection in C++ and complain that it's less "safe" (???), whereas the common thing to do in C++ is type-based selection (which I prefer, since it allows you to add extra information to exceptions in the form of data fields), which is just as safe, and apparently not available in Ada, though it probably can be emulated somehow.)
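The type-based selection with extra data fields described above might look like this in C++ (the exception class and error scenario are made-up examples):

```cpp
#include <stdexcept>
#include <string>

// Type-based exception selection: the handler is chosen by the type of
// what it catches (not by comparing names at runtime), and the exception
// object carries extra information as data fields.
class FileError : public std::runtime_error {
public:
    FileError(const std::string& path, int code)
        : std::runtime_error("error on " + path), path_(path), code_(code)
    {
    }
    const std::string& Path() const { return path_; }
    int Code() const { return code_; }
private:
    std::string path_;
    int code_;
};

int HandleOpen(const std::string& path)
{
    try {
        throw FileError(path, 2);       // simulate a failing open
    } catch (const FileError& e) {      // selected by type; data available
        return e.Code();
    } catch (const std::exception&) {   // any other standard exception
        return -1;
    }
}
```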
Also, I notice that Ada is quite a bit more verbose. While I'm no fan of "line noise like" languages such as Perl, I also dislike excessive verbosity (which I roughly define as the number of tokens that add no new information -- either because they repeat previous information, or they're just syntactically required for the sake of it; a Pascal example would be "to begin do", 3 keywords for a single thing). Also, Ada requires explicit instantiation of generics, while C++ instantiates templates automatically, etc. Of course, for a converter target these are not major issues.
Indeed, any language may be used as the intermediate for GPC. However, I would expect that the effort required to write a translator would vary, depending on the closeness of the mapping between GPC and that language. My impression is that C++ doesn't map particularly well to Pascal's type structure (subranges, sets, packed structures), and while that may be overcome with moderate effort, as you say, that effort may be less with a different choice of intermediate language. Granted, though, that this mapping is just one of several variables influencing that choice.
That's the point. I think in the big picture, it just doesn't matter much (especially given that most of the support could be written on the C++ level, probably even on the Pascal level later, without performance loss due to inlining, and doesn't have to be built into the compiler as it's now).
But while reading the above links, I also noticed that Ada is very strict about types etc. (no surprise given its main fields of use). While Pascal itself is so, too, GPC actually has to support some dialects and extensions that are not. As I wrote elsewhere, it basically covers everything that's in C WRT low-level features. Though you may consider this dirty, the fact is that a converter which is to be compatible with the existing GPC must support them. Is this possible in Ada?
Frank
On Tuesday, August 3, 2010 at 3:42, Frank Heckenbach wrote:
But while reading the above links, I also noticed that Ada is very strict about types etc.
Ada allows conversion between otherwise incompatible types with the generic function "Unchecked_Conversion", which is effectively a type cast. See:
http://www.adahome.com/Ammo/Cplpl2Ada.html#1.3.3
(The name "Unchecked_Conversion" is verbose -- intentionally so -- as it's intended to call the reader's attention to the circumvention of strict type-checking.)
-- Dave
But there is no such thing like Objective-Ada, is there ? That would be a clear disadvantage of Ada as an intermediate language.
Regards,
Adriaan van Os
J. David Bryan wrote:
On Tuesday, August 3, 2010 at 3:42, Frank Heckenbach wrote:
But while reading the above links, I also noticed that Ada is very strict about types etc.
Ada allows conversion between otherwise incompatible types with the generic function "Unchecked_Conversion", which is effectively a type cast. See:
The section is a bit brief. Can it also be used, say, to convert any data type to an "array of char" (in C/Pascal notation) and back, to do "untyped memory block" operations?
(The name "Unchecked_Conversion" is verbose -- intentionally so -- as it's intended to call the reader's attention to the circumvention of strict type-checking.)
For the same reason C++ has the longish "reinterpret_cast" (though, of course, for backward compatibility it also has to support the more dangerous C form).
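The "untyped memory block" operation asked about above can be sketched in C++ like this. Note that std::memcpy, rather than a pointer cast, is the portable way to view an object's representation as an array of char and back, since direct casts risk aliasing problems:

```cpp
#include <cstring>

// Copy an object's representation into an array of char and back:
// the C++ equivalent of the "untyped memory block" round trip.
double RoundTrip(double value)
{
    unsigned char bytes[sizeof(double)];
    std::memcpy(bytes, &value, sizeof value);        // object -> char array
    double restored;
    std::memcpy(&restored, bytes, sizeof restored);  // char array -> object
    return restored;
}
```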
Adriaan van Os wrote:
But there is no such thing like Objective-Ada, is there ? That would be a clear disadvantage of Ada as an intermediate language.
As he mentioned before, Ada's "tagged records" seem to correspond to objects (though I haven't compared in detail).
Frank
On Thursday, August 5, 2010 at 9:51, Frank Heckenbach wrote:
The section is a bit brief.
A more complete discussion of Ada for programmers experienced with other languages is here:
http://www.adaic.org/docs/distilled/adadistilled.pdf
This may prove helpful in understanding features, although again it is based on the older 1995 Ada revision.
The current (2005) Ada Language Reference Manual is available here:
http://www.adaic.com/standards/ada05.html
...although the LRM is a precise description of the language and is less accessible than a tutorial (though it is far more readable than ISO 10206!).
Can it also be used, say, to convert any data type to an "array of char" (in C/Pascal notation) and back, to do "untyped memory block" operations?
There are no untyped values in Ada :-), but it can convert to an equivalent array of bytes or words, for example. Basically, Unchecked_Conversion can convert idempotently between any two objects of the same size, and will convert with implementation-defined results between objects of differing sizes.
Adriaan van Os wrote:
But there is no such thing like Objective-Ada, is there ? That would be a clear disadvantage of Ada as an intermediate language.
There's no such thing, because Ada has had OOP features (inheritance, polymorphism, abstract procedures and functions, dynamic dispatching) since the 1995 revision.
As he mentioned before, Ada's "tagged records" seem to correspond to objects (though I haven't compared in detail).
The examples in Section 9 of "Ada Distilled" may help to clarify this.
-- Dave
Adriaan van Os wrote:
But there is no such thing like Objective-Ada, is there ? That would be a clear disadvantage of Ada as an intermediate language.
There's no such thing, because Ada has had OOP features (inheritance, polymorphism, abstract procedures and functions, dynamic dispatching) since the 1995 revision.
Are there e.g. Ada GNUStep and Cocoa bindings ? I can't find them on the web.
Regards,
Adriaan van Os
On 06 Aug 2010, at 06:57, J. David Bryan wrote:
Adriaan van Os wrote:
But there is no such thing like Objective-Ada, is there ? That would be a clear disadvantage of Ada as an intermediate language.
There's no such thing, because Ada has had OOP features (inheritance, polymorphism, abstract procedures and functions, dynamic dispatching) since the 1995 revision.
I think Adriaan was referring to an extension of the Ada language that enables direct interfacing with Objective-C code (including defining subclasses, categories, ...), similar to Objective-C++, Objective-Modula-2 and Objective-Pascal.
Jonas
Jonas Maebe wrote:
On 06 Aug 2010, at 06:57, J. David Bryan wrote:
Adriaan van Os wrote:
But there is no such thing like Objective-Ada, is there ? That would be a clear disadvantage of Ada as an intermediate language.
There's no such thing, because Ada has had OOP features (inheritance, polymorphism, abstract procedures and functions, dynamic dispatching) since the 1995 revision.
I think Adriaan was referring to an extension of the Ada language that enables direct interfacing with Objective-C code (including defining subclasses, categories, ...), similar to Objective-C++, Objective-Modula-2 and Objective-Pascal.
See e.g. http://wiki.freepascal.org/FPC_PasCocoa.
Regards,
Adriaan van Os
On Friday, August 6, 2010 at 9:16, Adriaan van Os wrote:
Are there e.g. Ada GNUStep and Cocoa bindings ? I can't find them on the web.
I know absolutely nothing about the Mac or its interfacing requirements, but there appears to be a project here:
http://code.google.com/p/cocoa-gnat/
...that might be applicable.
On Friday, August 6, 2010 at 9:40, Jonas Maebe wrote:
I think Adriaan was referring to an extension of the Ada language that enables direct interfacing with Objective-C code (including defining subclasses, categories, ...), similar Objective-C++, Objective-Modula2 and Objective-Pascal.
OK, thanks, I did misunderstand that. I doubt that an extended version of Ada would exist that added language features to facilitate interfacing to a specific platform API. Ada does have support for interfacing to other languages (C, C++, COBOL, and Fortran), although this is in the form of pragmas that control data layout, e.g., row-major vs. column-major array order, and packages that provide types and procedures/functions compatible with the target language. The above project appears to be an interface package (called a "binding" in Ada) of this form.
See, for example:
http://www.adaic.org/standards/05rm/html/RM-B-1.html
and:
http://www.adaic.org/standards/05rm/html/RM-B-3.html
-- Dave
J. David Bryan wrote:
I know absolutely nothing about the Mac
I assume you genuinely don't know the background and history of the *Step/Cocoa API. It goes back to NeXT and was called NextStep. Later NeXT and SUN worked together to turn it into an open specification called OpenStep and this was implemented on all major Unix OSes. At that point the FSF started GNUstep as an open source implementation. Eventually Apple acquired NeXT and evolved the NextStep code base into Cocoa. This is now the primary (but not the only) GUI API on Mac OS X. In addition, it is the only GUI API on iPhone/iPad. There is also a Windows implementation called Cocotron.
In terms of installed base, its primary platform is the iPhone.
or its interfacing requirements, but there appears to be a project here:
http://code.google.com/p/cocoa-gnat/
...that might be applicable.
Ada does have support for interfacing to other languages (C, C++, COBOL, and Fortran), although this is in the form of pragmas that control data layout, e.g., row-major vs. column-major array order, and packages that provide types and procedures/functions compatible with the target language. The above project appears to be an interface package (called a "binding" in Ada) of this form.
See, for example:
http://www.adaic.org/standards/05rm/html/RM-B-1.html http://www.adaic.org/standards/05rm/html/RM-B-3.html
The trouble is that Objective-C is nothing like C, C++, COBOL and Fortran. When it comes to APIs, Objective-C is like Smalltalk.
Most Cocoa/GNUstep interfaces appear to be designed with a mindset of "This is just another C API" whereas they probably should be designed with a mindset of "I am going to have to interface to Smalltalk".
To those unfamiliar with Smalltalk it may help to think "This is like interfacing to SQL".
When you interface to SQL, you either allow plain SQL to be used verbatim and then send it off to an SQL engine as is, or you build a native interface layer that generates plain SQL and then send that off to an SQL engine.
What you probably do not want to do is build an interface that hooks into the SQL engine via a low-level C API that happens to be the one that the SQL engine uses internally simply because its written in C.
Unfortunately, the latter is the situation you encounter when interfacing to Objective-C. Like SQL, it is a domain-specific language that operates on a very high abstraction level. But unlike SQL, the Objective-C runtime does not have any (official) textual interface that allows you to send plain Objective-C statements to it to have them executed on the fly.
This means an implementor of an interface to Objective-C not only has to bridge the interface gap between the calling language and Objective-C, but they also have to bridge the abstraction gap between the high level of abstraction provided by Objective-C and the low level at which they must hook into the Objective-C runtime.
Unfortunately, the "just another C API" mindset often leads to an interface that doesn't bridge the abstraction gap and using the Objective-C library in the interfacing language is nothing like using it directly from Objective-C.
Not only that, but all too often, using an Objective-C library via such an interface is also an inferior experience compared to using a native library of the interfacing language.
It is no wonder then that those who use Cocoa/GNUstep natively like it because they see it the way it was designed and those who do not use it natively often don't like it because they see it through a "crippled" interface.
Many Objective-C developers have been in the latter camp before they started using Objective-C. Many of those "accidental" Objective-C developers often do not like the C side of the language much. Some of them then move on to design their own Objective languages, such as Objective-J, Objective-Lua, Objective Modula-2 etc etc etc.
Those designs not only do the interfacing, but they also bridge the abstraction gap. The simplest way to do that is of course to adopt the Objective-C syntax for message passing verbatim or at least something that resembles it very closely. For example Objective Modula-2 provides both Objective-C and Smalltalk message passing syntax.
However, bridging the abstraction gap doesn't necessarily require the adoption of Smalltalk or Objective-C syntax. For example, Clozure Common Lisp has an Objective-C bridge with its own Lisp-based syntax that is every bit as highly abstracted as Smalltalk and Objective-C. Maybe this comes down to the flexibility of Lisp, and doing the same in other languages without adding special syntax is probably more difficult.
Still, when I look at cocoa-gnat, I am getting the impression that it leaves a lot to be desired in terms of bridging the abstraction gap, and that applies to both the thin and the thick binding.
J. David Bryan wrote:
OK, thanks, I did misunderstand that. I doubt that an extended version of Ada would exist that added language features to facilitate interfacing to a specific platform API.
This is not about "language features to facilitate interfacing to a specific platform API" but about interfacing to Objective-C. The interfaces of GNUstep for example are also written in Objective-C.
I detest Objective-C, but we have to cope with the fact that software is written in it. The same problem exists for headers written in C++.
A. van Os
On Monday, August 9, 2010 at 2:05, Objective Modula-2 wrote:
I assume you genuinely don't know the background and history of the *Step/Cocoa API.
That's correct. Thanks for the detailed explanation.
Those designs not only do the interfacing, but they also bridge the abstraction gap. The simplest way to do that is of course to adopt the Objective-C syntax for message passing verbatim or at least something that resembles it very closely. For example Objective Modula-2 provides both Objective-C and Smalltalk message passing syntax.
I'm still not clear on Adriaan's original point, though. Does GPC currently support Objective-C message passing syntax, such that having such support in any intermediate language utilized by a revised GPC compiler would be crucial?
(I've used GPC solely for ISO 10206 support. I've not used nor investigated any of the object features provided.)
-- Dave
J. David Bryan wrote:
I'm still not clear on Adriaan's original point, though. Does GPC currently support Objective-C message passing syntax,
I don't think it does.
such that having such support in any intermediate language utilized by a revised GPC compiler would be crucial?
I wouldn't say it is crucial in the sense that it is a requirement to support interfacing to Objective-C. However, if the target language you translate to supports Objective-C interfacing already and does so well, then you can save yourself the effort translating all the way down to the low-level C API of the Objective-C runtime. You can translate to the (hopefully) higher level statements of the target language and let the target language's compiler do the heavy work for you.
For example, when using Objective-C style message passing syntax in Objective Modula-2 ...
VAR foo : NSArray := [[NSArray alloc] init] ;
for the ObjC target, the above only needs to be translated to
NSArray *foo = [[NSArray alloc] init];
which is very straightforward. However, for a C target, there is a whole lot more to be done to translate just this one line.
First the translator needs to determine that the receiver NSArray is a class and that the message alloc represents instance allocation, then determine the allocation size, then call the ObjC runtime API
id class_createInstance(Class cls, size_t extraBytes)
Then the translator needs to obtain a so-called selector for "init", then determine which type of message sending function applies (there are four) and call the ObjC runtime API
id objc_msgSend(id theReceiver, SEL theSelector, ...)
all of which requires considerable boilerplate code to be generated. If there are messages that pass parameters, the boilerplate code gets still more elaborate.
Adriaan is probably concerned that if the target language doesn't already do this work, then whoever maintains GPC is less likely to want to do the work to add support for an Objective-C interface.
Objective Modula-2 wrote:
J. David Bryan wrote:
I'm still not clear on Adriaan's original point, though. Does GPC currently support Objective-C message passing syntax,
No, but we are now talking about the future of GPC.
I wouldn't say it is crucial in the sense that it is a requirement to support interfacing to Objective-C. However, if the target language you translate to supports Objective-C interfacing already and does so well, then you can save yourself the effort translating all the way down to the low-level C API of the Objective-C runtime. You can translate to the (hopefully) higher level statements of the target language and let the target language's compiler do the heavy work for you.
Yes, it is not a requirement, but it does make things easier to implement. And with more and more system (and other) software (for all kinds of platforms) written in C++ and Objective-C it becomes more and more important to be able to interface with C++ and Objective-C.
Regards,
Adriaan van Os
J. David Bryan wrote:
On Thursday, August 5, 2010 at 9:51, Frank Heckenbach wrote:
The section is a bit brief.
A more complete discussion of Ada for programmers experienced with other languages is here:
http://www.adaic.org/docs/distilled/adadistilled.pdf
This may prove helpful in understanding features, although again it is based on the older 1995 Ada revision.
The current (2005) Ada Language Reference Manual is available here:
http://www.adaic.com/standards/ada05.html
...although the LRM is a precise description of the language and is less accessible than a tutorial (though it is far more readable than ISO 10206!).
Thanks.
Can it also be used, say, to convert any data type to an "array of char" (in C/Pascal notation) and back, to do "untyped memory block" operations?
There are no untyped values in Ada :-), but it can convert to an equivalent array of bytes or words, for example. Basically, Unchecked_Conversion can convert idempotently between any two objects of the same size, and will convert with implementation-defined results between objects of differing sizes.
That's what I meant (therefore the quotes around "untyped" -- strictly speaking, also in C and BP there are no untyped objects, just e.g. arrays of char which are often treated as untyped blocks of memory; though C has "void", there are no objects of type void, it's used to indicate "functions that don't return anything", i.e. what we call procedures; untyped pointers "void *" ("Pointer" in BP) are type-cast to some other pointer type for operating on their targets).
OK, though I'm still no Ada expert, I'd guess that Ada would also be a suitable target language, so if we'd choose to go for a converter, the choice of target language might now depend on who's most familiar with it or prefers it personally. So far it seems to be only one for Ada (you) and only one for C++ (me).
Frank
Adriaan van Os wrote:
Objective Modula-2 wrote:
J. David Bryan wrote:
I'm still not clear on Adriaan's original point, though. Does GPC currently support Objective-C message passing syntax,
No, but we are now talking about the future of GPC.
I wouldn't say it is crucial in the sense that it is a requirement to support interfacing to Objective-C. However, if the target language you translate to supports Objective-C interfacing already and does so well, then you can save yourself the effort translating all the way down to the low-level C API of the Objective-C runtime. You can translate to the (hopefully) higher level statements of the target language and let the target language's compiler do the heavy work for you.
Yes, it is not a requirement, but it does make things easier to implement. And with more and more system (and other) software (for all kinds of platforms) written in C++ and Objective-C it becomes more and more important to be able to interface with C++ and Objective-C.
As for C++, that's exactly my original proposal. :-)
As for Objective-C, I'm not very familiar with it. Do you have a list of features that would need to be added to support these libraries? (I.e., not necessarily 1:1 compatibility, but the major obstacles that can't be easily worked around -- for comparison, in C++ this would be templates, automatic con-/destructors and exceptions, in this order.) -> see below
Objective Modula-2 wrote:
J. David Bryan wrote:
I'm still not clear on Adriaan's original point, though. Does GPC currently support Objective-C message passing syntax,
I don't think it does.
No, it doesn't.
For example, when using Objective-C style message passing syntax in Objective Modula-2 ...
VAR foo : NSArray := [[NSArray alloc] init] ;
for the ObjC target, the above only needs to be translated to
NSArray *foo = [[NSArray alloc] init];
which is very straightforward. However, for a C target, there is a whole lot more to be done to translate just this one line.
C is, of course, as often, a bad comparison. In C++, given a suitable (one-time written) runtime support, it would become quite a bit easier.
After reading a bit about Objective-C, ISTM some of the major differences to C++ are:
- Classes are also objects (so you can, e.g., pass a class to a function that uses this class to allocate any number of objects of this class). While the C++ object model doesn't support this, it can be emulated with templates, almost completely automatically, as long as the set of operations on a general class object is known (which I suppose it is -- in the examples I saw, it was mostly instantiation, but there may be a few more).
- All classes have a common ancestor. (That's not a new thing, we've had this discussion WRT Pascal object models.) Many Objective-C proponents seem to tout this as a big advantage, I see it as a restriction. E.g., that an "untyped" object pointer exists and is still useful (in contrast to a real untyped pointer). Of course, you can do the same thing in C++ if, by convention, you use a common ancestor. Your "untyped" pointer then becomes the pointer to that ancestor class. Of course, an Objective-C -> C++ mapping could do just that.
As a side note, I think this philosophy has to do with "big objects", i.e. objects that represent applications, documents, views etc. (MVC is often mentioned in this context). While even there I'm skeptical (do you really ever need a reference to an object of which you don't even know whether it's a model or, say, a button?), the C++ idea is rather that almost everything (apart from simple types and arrays) can be an object, including strings, lists, or complex numbers. And I don't think I need (or want) polymorphism between those very different things.
This relates to the previously raised (and unanswered!) question of how to declare, e.g., a list of "TFoo" objects. Objective-C, like all pure OOP suggestions I've seen so far, allows for a list of any object. But what if you want, like I do almost every time, a list that contains only objects of a certain type (with compile-time checking)? So far, only templates provide a generic solution AFAICS.
- Message passing. This doesn't translate directly to C++, but it could be done with some extra effort (which, of course, a converter tool could do automatically), such as adding a dispatcher method (virtual in C++ terms) to the "ancestor object", and registering all methods with it (by name, or a hash of the name).
So, AFAICS, it would be possible to implement the language features of Objective-C in a C++ output through a converter tool with moderate effort.
However, the original question seems to be more about binary compatibility to existing Objective-C code, which would probably not be achieved this way.
Frank
On 10 Aug 2010, at 10:59, Frank Heckenbach wrote:
- All classes have a common ancestor.
Most classes have a common ancestor, but not all of them do. There are at least two root classes in the Cocoa framework: NSObject (the most commonly used one) and NSProxy (mostly used for language bridges and for remote message passing).
You can define as many root classes as you want, with the only restriction that they have to conform to the NSObject protocol (a protocol is similar to an interface in Java/Delphi-style Pascal -- and yes, protocols and classes can have the same name).
(That's not a new thing, we've had this discussion WRT Pascal object models.) Many Objective-C proponents seem to tout this as a big advantage, I see it as a restriction. E.g., that an "untyped" object pointer exists and is still useful (in contrast to a real untyped pointer).
There is an "untyped object pointer" in Objective-C, but it's unrelated to any root class. This type is called "id", and it is actually more like universally typed: without type casting, you can
a) assign an id-typed variable to any variable of another class type (or of the id type), and vice versa
b) send any message to it without type casting it to a particular class type (with of course the possibility of getting an exception at run time if the actual type does not respond to that message's selector)
However, the original question seems to be more about binary compatibility to existing Objective-C code, which would probably not be achieved this way.
GCC (and LLVM's Clang) support Objective-C++, which allows mixing Objective-C and C++ in the same source file. You would have to translate Objective-Pascal to Objective-C, not to C++.
Jonas
Jonas Maebe wrote:
On 10 Aug 2010, at 10:59, Frank Heckenbach wrote:
- All classes have a common ancestor.
Most classes have a common ancestor, but not all of them do. There are at least two root classes in the Cocoa framework: NSObject (the most commonly used one) and NSProxy (mostly used for language bridges and for remote message passing).
You can define as many root classes as you want, with the only restriction that they have to conform to the NSObject protocol (a protocol is similar to an interface in Java/Delphi-style Pascal -- and yes, protocols and classes can have the same name).
Then I probably got confused by that. But still trying to understand the differences coming from C++, one could say: protocols (like Java/Delphi interfaces) correspond to abstract classes with only abstract virtual methods, implementing interfaces corresponds to (multiple) inheritance, so the NSObject protocol would be the common ancestor, all classes implement (inherit from) it, and "id" would then be a pointer/reference to "NSObject_protocol".
(That's not a new thing, we've had this discussion WRT Pascal object models.) Many Objective-C proponents seem to tout this as a big advantage, I see it as a restriction. E.g., that an "untyped" object pointer exists and is still useful (in contrast to a real untyped pointer).
There is an "untyped object pointer" in Objective-C, but it's unrelated to any root class. This type is called "id", and it is actually more like universally typed: without type casting, you can a) assign an id-typed variable to any variable of another class type (or of the id type),
And if the object that the id-variable references is not of the other class type? Do you get a runtime exception, or does the assignment still work? The latter would seem wrong to me, the former corresponds to C++ dynamic_cast.
Frank
On 10 Aug 2010, at 13:39, Frank Heckenbach wrote:
Jonas Maebe wrote:
You can define as many root classes as you want, with the only restriction that they have to conform to the NSObject protocol (a protocol is similar to an interface in Java/Delphi-style Pascal -- and yes, protocols and classes can have the same name).
Then I probably got confused by that. But still trying to understand the differences coming from C++ one could say, protocols (like Java/Delphi interfaces) correspond to abstract classes with only abstact virtual methods, implementing interfaces corresponds to (multiple) inheritance, so the NSObject protocol would be the common ancestor,
Correct, I think.
all classes implement (inherit from) it, and "id" would then be a pointer/reference to "NSObject_protocol".
First some terminology:
* message =~ method
* selector (or message selector) =~ message name
(in Objective-C, conceptually all dispatching happens based on the message's selector rather than based on vtable indices; in practice, several optimizations are used to minimise the overhead)
Anyway, id is more than just a reference to NSObject_protocol, since you can (without any type casts) send any message to "id". Following the C tradition, that message's selector doesn't even have to be known to the compiler (although, just like with most C compilers, you'll get a warning in case the selector isn't declared in the current scope).
E.g., in any Objective-C program, you can write something like this:
id obj, res; ...
res = [ obj nonExistantMessageWithIntPara: 1 andPara: 4.0 ];
This gets translated by the compiler into an equivalent of:
type
  tempmsgtyp = function(obj: pointer; sel: pselector; arg1: cint; arg2: double): id; cdecl;
...
res := tempmsgtyp(objc_msgSend)(obj, sel_getUid('nonExistantMessageWithIntPara:andPara:'), 1, 4.0);
You can also declare a variable as pointing to something conforming to the NSObject protocol, but that would be "id<NSObject> obj", and in that case the compiler only allows sending messages from the NSObject protocol to obj.
In case you are interested in something that cannot be expressed by C++'s object model, then have a look at Objective-C categories. A category extends a class without inheritance. It can only declare messages, and for all messages declared in that category the following holds:
a) if they don't exist in the extended class, then they are added to the extended class as if they appeared in its "real" declaration (i.e., they are also automatically inherited by all descendants; all messages in Objective-C behave functionally equivalent to "virtual" methods)
b) if they do exist in the class, then the category's message *replaces* the original message definition in the class (so if a derived class calls the inherited selector, it will end up in the category's version of the message; and calling the inherited selector from the category version will end you up in the message of the parent class of the extended class: the replaced message itself is gone)
If multiple categories add/replace the same message in the same class, then the result is undefined (which message "wins" depends on the order in which the categories are loaded by the Objective-C runtime). Using this particular functionality of categories is also strongly discouraged.
Categories obviously allow for really ugly code and really cool hacks, but in the Cocoa framework they are mostly used to simply decompose class definitions into functionally independent parts (so no replacing of messages). The advantages are:
a) they reduce the fragile base class problem (you can add messages even to base classes defined in the system frameworks, e.g., some string processing method specific to your application can be added to the base NSString class)
b) smaller header files in case you don't need all functionality (speeding up parsing because of the C model of including headers)
To avoid name clashes, it is recommended to begin all of the selectors with some company/person-specific prefix. It can obviously lead to ugly situations, but in practice seems to work quite well.
And if the object that the id-variable references is not of the other class type? Do you get a runtime exception, or does the assignment still work?
The assignment works, no check occurs (although I guess llvm's Clang may warn if it can statically determine that the assignment is unsafe).
The latter would seem wrong to me, the former corresponds to C++ dynamic_cast.
There are runtime functions you can call to determine inheritance compatibility, so in principle you could implement such a cast expression yourself via the preprocessor (or the compiler could insert checks). But it's not part of the language and indeed dangerous, and I believe that this causes a lot more problems than the category functionality.
Jonas
Frank Heckenbach wrote:
As for Objective-C, I'm not very familiar with it. Do you have a list of features that would need to be added to support these libraries?
The minimum required would be:
* declare a class
* declare a protocol
* add a method to a class
* send a message
Depending on implementation you may also need to expose:
* defining a selector
* testing of selectors for equality
Note that in ObjC certain operations that are basic operations elsewhere are just messages; for example, instantiation of a class is a message. However, at the runtime API level, instantiation of a class is a separate API call.
In a language that already has its own object model and syntax to declare classes and methods, the first design decision to be made is whether or not to use the same syntax for declaring ObjC classes. If native syntax is to be reused then the next question is how to distinguish between native classes and ObjC classes.
One possible route would be to use a qualifier in the header of the compilation unit and use the exact same syntax within the compilation unit. For example ...
Unit FooLib;
type FooClass = class ... end; (* native class *)
end FooLib.

Unit BarLib Foreign "ObjC";
type BarClass = class ... end; (* ObjC class *)
end BarLib.
Another possibility would be to add a qualifier to the declaration itself ...
type FooClass = class ... end;      (* native class *)
type BarClass = objc class ... end; (* ObjC class *)
Objective-C has four visibility modes for instance variables: public, package, protected and private. One of those can be made default, for the other three some kind of qualifier will be needed. Possibly such qualifiers already exist in the native syntax.
Although it may seem convenient to reuse native syntax, it may lead to inconveniences further down the road.
A very important feature in Objective-C is the ability to add a method to a class outside of the scope of the compilation unit where the class is declared. In Smalltalk and Objective-C this is called a category.
The consequence of this is that if your syntax embeds your method declarations inside your class declarations then you need yet another syntax for out-of-class method declarations in order to support this important feature of Objective-C.
When we designed the Objective Modula-2 extensions this was probably the most difficult thing. We tried several approaches and it always seemed like opening a can of worms.
We tried the class-is-a-record approach with methods declared inside a class ...
TYPE FooClass = CLASS ( NSObject )
  a, b, c : INTEGER; (* instance variables *)
  PROCEDURE methodFoo ...
  PROCEDURE methodBar ...
END; (* FooClass *)
... but this posed a problem when trying to support the category feature because there didn't seem to be any good way to declare a method outside of the class and link that method to a class declared elsewhere.
We then tried the class-is-a-module approach. This solved the problem of how to support the category feature, as categories became modules, but now we had program modules, library definition modules, library implementation modules, class definition modules, class implementation modules, category definition modules and category implementation modules. And all of these had to have different rules as to what could be declared in them and what the import and export rules were. This wasn't really satisfactory.
At a time when we had almost given up looking for a better approach, we found the inspiration in Oberon and Oberon-2.
Oberon had replaced variant records with extensible record types. A record declaration can be based on another record type which it then extends.
TYPE FooClass = RECORD ( BaseClass ) ... END;
Oberon-2 added a feature called type bound procedures. This turns extensible record types into classes with out-of-class method declarations.
TYPE FooClass = RECORD ( BaseClass )
  foo, bar : INTEGER; (* instance variables *)
END; (* FooClass *)
PROCEDURE (self : FooClass) setFoo (foo : INTEGER);
PROCEDURE (self : FooClass) setBar (bar : INTEGER);
Adopting both these concepts in Objective Modula-2 allowed us to get rid of all those extra module types and provide the ability to add methods to a class outside the compilation unit in which it is declared (aka category).
Now a category is simply a library module that imports a class from another library module and then declares one or more methods targeting that class.
DEFINITION MODULE FooCategory;
FROM FooLib IMPORT FooClass;
PROCEDURE (self : FooClass) multBar (by : INTEGER);
END FooCategory.
(Note: we actually use CLASS and METHOD in Objective Modula-2)
Now, I am not saying that GPC should follow this approach, but this example illustrates that one design decision that looks convenient at first could lead to an inconvenience elsewhere. It is important to look at the features required to support Objective-C as a whole, not in isolation.
Another tricky Objective-C feature is the messaging. Again, at first it may seem convenient to simply map this to whatever the native function or procedure call syntax is, but this too can lead to unwanted and inconvenient consequences further down the road.
A simple method invocation in Objective-C looks like this:
[obj message: param];
which could be mapped to by Pascal syntax like this:
obj^.message(param);
However, method invocations in Objective-C can look like this:
[foobar setFoo: 123 andBar: 456];
Here the method's name is actually setFoo:andBar:, including the colons. The Pascal compiler would have to know how to reconstruct this name from whatever the Pascal name is. Since Pascal doesn't allow colons in identifiers, this gets a little tricky.
The Python Objective-C bridge uses underscores to stand in for the colons ...
foobar.setFoo_andBar_(123, 456)
Luckily, the Objective-C style guidelines recommend camel case identifiers and underscores are almost never seen in Objective-C code. Still, the method names can get rather long and their readability suffers under the PyObjC scheme.
Moreover, it is very common in Objective-C to chain and to nest messages ...
[[[[foobar setFoo: 123] rotateLeft: 456] splitAt: 789] shuffle];
[[foobar setFoo: [foo count]] rotateLeft: [barbaz [bam boo: 123]]];
in PyObjC style mappings these become
foobar.setFoo_(123).rotateLeft_(456).splitAt_(789).shuffle;
foobar.setFoo_(foo.count).rotateLeft_(barbaz(bam.boo_(123)));
in schemes that simply map [obj message] to Send(obj, message) such as cocoa-gnat does, this becomes extremely bad ...
Send(Send(Send(Send(foobar, "setFoo:", 123), "rotateLeft:", 456), "splitAt:", 789), "shuffle");
Send(Send(foobar, "setFoo:", Send(foo, "count")), "rotateLeft:", Send(barbaz, Send(bam, "boo:", 123)));
The many levels of square brackets in the Objective-C message passing syntax can be annoying, too, but the mappings to other notations usually become unreadable very quickly.
Many Objective-C developers like the original Smalltalk syntax better than the Objective-C syntax, where
[[foobar setFoo: 123] shuffle];
is
foobar setFoo: 123 shuffle;
in cases where the message chain may be ambiguous, parentheses are used in Smalltalk. The above example could also be written as
(foobar setFoo: 123) shuffle;
It might make more sense to build an interface that uses the Smalltalk syntax in some shape or form, either by way of an escape syntax, for example like Objective Modula-2 does:
(* backquote escapes to Smalltalk syntax *) `foobar setFoo: 123 shuffle;
or by using a variadic Send function that takes message strings:
Send("foobar setFoo:", 123, "shuffle");
there are probably other ways, but in my experience it is a good idea to seek inspiration in the original Smalltalk message passing syntax.
C++ this would be templates,
Objective-C doesn't have templates nor does it require them because you can choose between static and dynamic typing. Either way, you don't have to implement any template syntax translations.
automatic con-/destructors
In Objective-C constructors and destructors are just messages, so there is no need for special syntax and translations.
foo := [[FooClass alloc] initWithFoo: 123];
exceptions
Again, in Objective-C, exceptions are also just messages
[fooException raise];
There is, however, special syntax for critical sections and try-catch-finally blocks. In Cocoa the exception system is based on the class NSException, so you should be able to simply plug into that.
After reading a bit about Objective-C, ISTM some of the major differences to C++ are:
The most important difference is that Objective-C is late bound. Everything happens at runtime. Classes are added at runtime. Protocols are added at runtime. Methods are added at runtime. It's all dynamic. You can send a message to a class that doesn't implement the message and the class can forward that message to a delegate. You can test at runtime if a class implements a message ("respondsTo:"). You can override the methods of the superclass, etc etc etc. All this happens at runtime.
So, AFAICS, it would be possible to implement the language features of Objective-C in a C++ output through a converter tool with moderate effort.
However, the original question seems to be more about binary compatibility to existing Objective-C code, which would probably not be achieved this way.
I don't think you would want to merely reimplement the ObjC runtime library and hook into it with your own syntax. Sure, this might be an interesting thing to do and it might actually turn out useful.
However, the primary reason for providing an Objective-C interface is usually to be able to use the Cocoa or GNUstep APIs. For that, you will need to interface with the Objective-C runtime library.
Whichever way you do that, the Objective-C runtime reference will probably be required reading:
http://developer.apple.com/mac/library/documentation/Cocoa/Reference/ObjCRun...
Frank Heckenbach wrote:
protocols (like Java/Delphi interfaces) correspond to abstract classes with only abstact virtual methods, implementing interfaces corresponds to (multiple) inheritance
In his book "Programming Language Pragmatics", Michael L. Scott distinguishes between multiple inheritance of specification and multiple inheritance of implementation.
Protocols provide multiple inheritance of specification but not multiple inheritance of implementation.
so the NSObject protocol would be the common ancestor, all classes implement (inherit from) it, and "id" would then be a pointer/reference to "NSObject_protocol".
The terminology for an ancestor is superclass (or root class at the top level), and protocols are not classes, so they are not considered ancestors.
The terminology for implementing a protocol is "conform to" and you can send a "conformsTo:" message to a class asking it whether it conforms to a given protocol. Protocols can also conform to other protocols.
On Monday, August 9, 2010 at 13:52, Objective Modula-2 wrote:
For example, when using Objective-C style message passing syntax in Objective Modula-2 ...
On Monday, August 9, 2010 at 8:09, Adriaan van Os wrote:
No, but we are now talking about the future of GPC.
Thank you both for your explanations.
-- Dave
On Tuesday, August 10, 2010 at 10:50, Frank Heckenbach wrote:
...the choice of target language might now depend on who's most familiar with it or prefers it personally. So far it seems to be only one for Ada (you) and only one for C++ (me).
I have no vested interest in what language you use. I merely wished to suggest that Ada was perhaps a richer language than you recalled from University.
-- Dave
Hi,
On 8/10/10, J. David Bryan jdbryan@acm.org wrote:
On Tuesday, August 10, 2010 at 10:50, Frank Heckenbach wrote:
...the choice of target language might now depend on who's most familiar with it or prefers it personally. So far it seems to be only one for Ada (you) and only one for C++ (me).
I have no vested interest in what language you use. I merely wished to suggest that Ada was perhaps a richer language than you recalled from University.
While I don't know either, I suppose (barely educated guess) that C++ and Ada are both good languages. However, it's clear that C++ is much more popular (so?). Also, I read the Joachim contest-winner-dude complaining that NetBSD's GNAT was behind the times. So you'd still have issues with various platforms not being updated as much, no matter what you do. (Even with C++ there are G++ 3.2 or 4.1 issues for lesser platforms.)
I hate the idea that popularity means anything, but you can't usually ignore it. However, I really do think Ada is worth considering seriously. (I know Gautier would agree, and he's a [former?] Pascal nut.)
But if Frank is expected to do most of the work, it'll probably end up being C++ (which is fine, just saying ...). I guess I need to check out _C++ for Dummies_ from the library one of these days (or read Eckel's PDF book). Well, I'll never be much help, I suck too much. But I'm still curious and maybe can help in some other way. ;-)
I add 1 vote for Ada; 1 against C++.
HF
Frank Heckenbach wrote:
OK, though I'm still no Ada expert, I'd guess that Ada would also be a suitable target language, so if we'd choose to go for a converter, the choice of target language might now depend on who's most familiar with it or prefers it personally. So far it seems to be only one for Ada (you) and only one for C++ (me).
Frank
On 11 Aug 2010 at 12:28, Prof. Harley Flanders wrote:
I add 1 vote for Ada; 1 against C++.
Two votes?
For me, maximum cross-platform support is absolutely essential. Hence my vote is for C++.
Best regards, The Chief -------- Prof. Abimbola A. Olowofoyeku (The African Chief) web: http://www.greatchief.plus.com/
Prof. Harley Flanders wrote:
I add 1 vote for Ada; 1 against C++.
So if we go for Ada, you will actively contribute, and if we go for C++, you will actively obstruct the project? Or what exactly does the negative vote mean?
Prof. Harley Flanders wrote:
Sure, you can store a triangular matrix in a single array. Now write a procedure for multiplying two of them.
Really simple and transparent, right (or wrong)?
Sure, why not?
You didn't plan to hand-code the access formula every time it's needed, rather than once in a subroutine (or perhaps even an operator), did you?
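For concreteness, a minimal C++ sketch of what Frank describes (the storage layout and names are my own choices, not anything from the thread): the packed lower-triangular access formula is written exactly once, and the multiplication routine builds on it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A lower-triangular n x n matrix stored packed in a single array of
// n*(n+1)/2 elements, row by row.  The access formula appears once, here.
inline std::size_t tri_index(std::size_t i, std::size_t j)
{
    // valid only for j <= i (the stored triangle)
    return i * (i + 1) / 2 + j;
}

// The product of two lower-triangular matrices is again lower-triangular:
// C[i][j] = sum over k in [j..i] of A[i][k] * B[k][j].
std::vector<double> tri_multiply(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 std::size_t n)
{
    std::vector<double> c(n * (n + 1) / 2, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j <= i; ++j)
            for (std::size_t k = j; k <= i; ++k)
                c[tri_index(i, j)] += a[tri_index(i, k)] * b[tri_index(k, j)];
    return c;
}
```

The same routine could be written in any of the candidate languages; the point is only that the index arithmetic lives in one subroutine, not at every access site.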
Frank
On Thu, 12 Aug 2010 03:28:21 am Frank Heckenbach wrote:
Prof. Harley Flanders wrote:
I add 1 vote for Ada; 1 against C++.
So if we go for Ada, you will actively contribute, and if we go for C++, you will actively obstruct the project? Or what exactly does the negative vote mean?
I don't wish to speak for the Professor, but in the Python community it is very common to give informal votes on suggestions.
+1 is considered a vote in favour, -1 a vote against, and 0 means to abstain.
Sometimes people will give fractional or multiple votes, to express strength of feeling: +2 would be "strongly in favour", -0.5 would be "weakly against", +0 is "I don't really care either way, but lean very slightly in favour" and -0 similarly very slightly against.
A positive vote is not a promise that you will contribute to the proposal, only that you are in favour of it. Similarly a negative vote is not a threat to obstruct it.
Such votes are generally used to gauge community feeling. Its rare for anyone to tally the votes, and even when somebody does, the decision is still not made on the basis of democratic vote. (However, it would be very rare for the Python development team to implement something that had overwhelming community opposition.)
On Wed, 11 Aug 2010, Steven D'Aprano wrote:
On Thu, 12 Aug 2010 03:28:21 am Frank Heckenbach wrote:
Prof. Harley Flanders wrote:
I add 1 vote for Ada; 1 against C++.
So if we go for Ada, you will actively contribute, and if we go for C++, you will actively obstruct the project? Or what exactly does the negative vote mean?
I don't wish to speak for the Professor, but in the Python community it is very common to give informal votes on suggestions.
That's the way I took it. As long as it's understood that the developers (ie. Frank) will make the final decision and that anything we say is advisory, I see no harm in it.
My concern with generating Ada code is that GNAT needs an Ada compiler to compile from source, and that such compilers are not easy to find on all platforms (this is also a concern if GPC is rewritten in Pascal), whereas C++ compilers are a lot easier to find. I'd still like to see GPC rewritten in Pascal, but the cost would be in facilitating compilation on new platforms.
--
John L. Ries
Salford Systems
Phone: (619)543-8880 x107 or (435)867-8885
Objective Modula-2 wrote:
Frank Heckenbach wrote:
protocols (like Java/Delphi interfaces) correspond to abstract classes with only abstract virtual methods, implementing interfaces corresponds to (multiple) inheritance
In his book "Programming Language Pragmatics", Michael L. Scott distinguishes between multiple inheritance of specification and multiple inheritance of implementation. Protocols provide multiple inheritance of specification but not multiple inheritance of implementation.
As I said I was trying to understand the differences coming from C++, i.e. seeing if and how Objective-C features can be mapped to C++ features.
Yes, in C++ inheritance applies to specification and implementation, but when there is no implementation, i.e. an abstract class with only abstract virtual methods, inheriting from it is effectively of specification only.
so the NSObject protocol would be the common ancestor, all classes implement (inherit from) it, and "id" would then be a pointer/reference to "NSObject_protocol".
The terminology for ancestor is superclass and root class (top level) and protocols are not classes, so they are not considered ancestors.
Terminology aside, I still think that protocols can be mapped to "pure abstract" C++ classes (of course, resolving a name conflict between a protocol and a class with the same name), and then the NSObject protocol would be implemented in (Objective-C) or inherited from by (C++) any other protocol and class.
The terminology for implementing a protocol is "conform to" and you can send a "conformsTo:" message to a class asking it whether it conforms to a given protocol. Protocols can also conform to other protocols.
"conformsTo:" would then correspond to an inheritance test (like "is" in Pascal, or trying a dynamic_cast in C++), in the special case that the right operand is a (C++ class that corresponds to an Objective-C) protocol.
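A sketch of this mapping (all class names here are invented for illustration): a protocol becomes a C++ class with only pure virtual methods, conformance becomes virtual inheritance from it, and "conformsTo:" becomes a dynamic_cast test.

```cpp
#include <cassert>

// The root protocol, mapped to a pure abstract class.
struct NSObject_protocol {
    virtual ~NSObject_protocol() = default;
};

// A protocol conforming to another protocol: virtual inheritance avoids
// ambiguous bases when a class conforms via several paths.
struct Drawable_protocol : virtual NSObject_protocol {
    virtual void draw() = 0;
};

struct Shape : virtual NSObject_protocol {};     // conforms to NSObject only

struct Circle : Shape, Drawable_protocol {       // also conforms to Drawable
    void draw() override {}
};

// "conformsTo:" as a runtime inheritance test on an "id"-like pointer.
bool conforms_to_drawable(NSObject_protocol* id_like)
{
    return dynamic_cast<Drawable_protocol*>(id_like) != nullptr;
}
```

The dynamic_cast succeeds exactly when the object's class inherits from the protocol class somewhere in its ancestry, mirroring the conformance check.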
In a language that already has its own object model and syntax to declare classes and methods, the first design decision to be made is whether or not to use the same syntax for declaring ObjC classes. If native syntax is to be reused then the next question is how to distinguish between native classes and ObjC classes.
Some ideas:
- Keep the differences as small as possible (ideally: Objective-C class -> C++ class with an additional, auto-generated dispatch method).
- Use compiler switches, as GPC already does for the 4 supported object models.
- Actually support Objective-C syntax (if it doesn't cause too many syntax conflicts), whether or not one is trying to keep compatible objects.
Objective-C has four visibility modes for instance variables: public, package, protected and private. One of those can be made default, for the other three some kind of qualifier will be needed. Possibly such qualifiers already exist in the native syntax.
Except for package (though in BP, private actually means package (unit) wide visibility, not entirely private).
A very important feature in Objective-C is the ability to add a method to a class outside of the scope of the compilation unit where the class is declared. In Smalltalk and Objective-C this is called a category.
IIUC, this works retroactively, i.e. if a module A calls a method of (sends a message to) an object declared in a module B, and module C extends that object via a category, even if neither A nor B know about C, it affects this call/message, right?
Another tricky Objective-C feature is the messaging. Again, at first it may seem convenient to simply map this to whatever the native function or procedure call syntax is, but this too can lead to unwanted and inconvenient consequences further down the road.
A simple method invocation in Objective-C looks like this:
[obj message: param];
which could be mapped to by Pascal syntax like this:
obj^.message(param);
However, method invocations in Objective-C can look like this:
[foobar setFoo: 123 andBar: 456];
the method's name is actually setFoo:andBar:, including the colons. The Pascal compiler would have to know how to reconstruct this name from whatever the Pascal name is. Since Pascal doesn't allow colons in identifiers this gets a little tricky.
Sure, but this seems like a rather minor problem to me. We already need name mangling (as does e.g. C++) to map overloaded operator names and other stuff to names representable on the assembler/linker level, so we could probably employ this or something similar here.
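A minimal sketch of one such scheme (this particular convention mirrors the PyObjC-style mapping quoted below, where each ':' becomes '_'; a real mangler would need an escape for selectors that already contain underscores):

```cpp
#include <cassert>
#include <string>

// Toy selector mangling: ':' is not legal in Pascal or linker-level
// identifiers, so replace it with '_'.
std::string mangle_selector(const std::string& selector)
{
    std::string out = selector;
    for (char& c : out)
        if (c == ':') c = '_';
    return out;
}

// The inverse, so the compiler can reconstruct the Objective-C name.
// Note: lossy if the original selector contains '_' itself.
std::string demangle_selector(const std::string& mangled)
{
    std::string out = mangled;
    for (char& c : out)
        if (c == '_') c = ':';
    return out;
}
```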
Moreover, it is very common in Objective-C to chain and to nest messages ...
[[[[foobar setFoo: 123] rotateLeft: 456] splitAt: 789] shuffle];
[[foobar setFoo: [foo count]] rotateLeft: [barbaz [bam boo: 123]]];
in PyObjC style mappings these become
foobar.setFoo_(123).rotateLeft_(456).splitAt_(789).shuffle;
foobar.setFoo_(foo.count).rotateLeft_(barbaz(bam.boo_(123)));
in schemes that simply map [obj message] to Send(obj, message) such as cocoa-gnat does, this becomes extremely bad ...
Send(Send(Send(Send(foobar, "setFoo:", 123), "rotateLeft:", 456), "splitAt:", 789), "shuffle");
Send(Send(foobar, "setFoo:", Send(foo, "count")), "rotateLeft:", Send(barbaz, Send(bam, "boo:", 123)));
The many levels of square brackets in the Objective-C message passing syntax can be annoying, too, but the mappings to other notations usually become unreadable very quickly.
As long as it's about generated output code, I don't care at all.
For the user to write it, it's not nice, of course. So we could:
- Actually support Objective-C syntax (see above)
- Make Send a method (at least syntactically) that returns a reference to the object, so calls wouldn't nest as much:
foobar.Send("setFoo:", 123)
      .Send("rotateLeft:", 456)
      .Send("splitAt:", 789)
      .Send("shuffle");
foobar.Send("setFoo:", foo.Send("count"))
      .Send("rotateLeft:", barbaz.Send(bam.Send("boo:", 123)));
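A toy sketch of what such a chainable Send might look like in C++ (ObjCProxy and its logging body are invented stand-ins for the real runtime dispatch):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Send returns a reference to the object, so calls chain left-to-right
// instead of nesting ever deeper.
class ObjCProxy {
public:
    template<typename... Args>
    ObjCProxy& Send(const std::string& selector, Args&&...)
    {
        log.push_back(selector);   // stand-in for the real message dispatch
        return *this;
    }
    std::vector<std::string> log;  // selectors sent, observable for testing
};
```

Usage then reads top-to-bottom rather than inside-out:

```cpp
// foobar.Send("setFoo:", 123).Send("rotateLeft:", 456).Send("shuffle");
```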
In C++ this would be templates,
Objective-C doesn't have templates nor does it require them because you can choose between static and dynamic typing.
^[citation required]
We've had this claim in this thread before. I asked the question, which is still unanswered: How do you implement a generic list type with the following properties:
- Implemented only once in source code
- Applicable to any given type
- Statically type-safe (i.e., if you apply it to TApple, you can be sure at compile time that it contains only apples)
So far this question has been ignored. Until I see a valid response, I will ignore further claims that templates are not needed.
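For reference, this is what the requested list looks like with C++ templates; whether Objective-C can match all three properties is exactly the open question:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Written once, applicable to any type, statically type-safe.
template<typename T>
class List {
public:
    void add(const T& item) { items.push_back(item); }
    const T& get(std::size_t i) const { return items[i]; }
    std::size_t count() const { return items.size(); }
private:
    std::vector<T> items;
};

// Hypothetical illustration of the type-safety property:
//   List<TApple> apples;
//   apples.add(orange);   // rejected at compile time, not at runtime
```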
automatic con-/destructors
In Objective-C constructors and destructors are just messages, so there is no need for special syntax and translations.
foo := [[FooClass alloc] initWithFoo: 123];
The emphasis was on automatic. So can you, as a class designer:
- Enforce certain actions that are always done on instantiation (e.g. ensure that certain object fields are always initialized to certain values, which don't have to be constant, but might be a running index or something). I imagine this might be possible by implementing an instantiation method in the class that does this, but I'm not sure.
- Enforce certain actions that are always done on deletion of an object, whichever way this happens. E.g., if an object (not a pointer/reference) goes out of scope, its destructor is called; if a container (list, map, ...) is destroyed, for all of its contents their destructor is called, etc. A typical example is memory allocated internally -- you might point to garbage collection (which I don't consider a panacea), but to give another example, imagine that the object holds system resources (file locks, network connections) that you want to be released/closed when the object ceases to exist.
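In C++ terms this is the RAII idiom; a minimal sketch (the Resource class is invented, with a counter standing in for a real file lock or network connection):

```cpp
#include <cassert>

// The destructor runs on every path out of the scope -- normal exit,
// early return, or exception -- so cleanup cannot be forgotten.
struct Resource {
    static int live;                 // number of currently held resources
    Resource()  { ++live; }          // acquire on construction
    ~Resource() { --live; }          // release, enforced by the compiler
};
int Resource::live = 0;

void use_resource()
{
    Resource r;                      // acquired here
    // ... work, possibly with early returns or exceptions ...
}                                    // released here, however we leave
```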
exceptions
Again, in Objective-C, exceptions are also just messages
Well, in C++ exceptions are just values (of any type).
But that doesn't answer the main features of exceptions:
[fooException raise];
Does this do what raising exceptions in other languages do, i.e. "jump" to the nearest enclosing appropriate catch-block? Or does it just send the message and continue with the next statement?
However, the primary reason for providing an Objective-C interface is usually to be able to use the Cocoa or GNUstep APIs. For that, you will need to interface with the Objective-C runtime library.
It seems so. (Which also means, since I have no real interest in either of them, I should probably get out of this discussion and leave it to those who might actually implement it.)
Frank
Frank Heckenbach wrote:
Terminology aside, I still think that protocols can be mapped to "pure abstract" C++ classes (of course, resolving a name conflict between a protocol and a class with the same name), and then the NSObject protocol would be implemented in (Objective-C) or inherited from by (C++) any other protocol and class.
"conformsTo:" would then correspond to an inheritance test (like "is" in Pascal, or trying a dynamic_cast in C++), in the special case that the right operand is a (C++ class that corresponds to an Objective-C) protocol.
- Keep the differences as small as possible (ideally: Objective-C
class -> C++ class with an additional, auto-generated dispatch method).
As I had previously mentioned, it is my understanding that the aim of this part of the discussion was to explore how an interface for GPC to make use of ObjC libraries might be designed and implemented, what the caveats and challenges are etc.
For such an interface, whatever mapping takes place would have to be undertaken in the direction from Pascal *towards* ObjC and at a lower level from GPC's intermediate language *towards* ObjC.
Now, for the sole purpose of a person unfamiliar with ObjC but familiar with C++ to gain a better understanding of ObjC, I can see some value in articulating how certain ObjC semantics might conceivably be mapped in the opposite direction.
However, in order to design and build a useful and convenient interface for using ObjC libraries from within another language, the designers/implementors should be able to think in the ObjC paradigm space. Ideally they would have implemented something meaningful in ObjC/Cocoa/GNUstep.
At this stage of the discussion, I find it increasingly difficult to tell whether the articulation of conceivable mappings of ObjC to C++ is actually meant to be for the sole purpose of gaining a better understanding of ObjC itself.
More often than not, it would seem the scope has moved confusingly close to discussing how a translator from ObjC to C++ might be designed. In order not to add to this confusion I will keep my responses to only those items of the discussion that I can clearly identify as GPC=>ObjC mapping issues.
- Use compiler switches, as GPC already does for the 4 supported
object models.
I personally prefer the language design philosophy where pragmas do not change the meaning of source code, but yes, the use of pragmas is certainly a workable approach to switching between object models.
- Actually support Objective-C syntax (if it doesn't cause too many
syntax conflicts), whether or not one is trying to keep compatible objects.
There is an increasing number of languages that follow this approach. It has its advantages but some people argue that the source code is then no longer portable to compilers that do not support the syntax.
Yet, in order to retain source code compatibility with non-ObjC interfacing dialects of the host language, there must be a complete mapping of existing syntax for the native object model to the ObjC object model. If only one piece of additional syntax is added, compatibility is lost.
As it is often difficult to map existing syntax to all the required semantics of a foreign object model, one might hold the view, if we are not going to be compatible anyway, we might as well support ObjC syntax if it makes the task easier.
Of course there is always an element of what might be described as "language esthetics". People programming in Pascal will usually prefer a more Pascal looking syntax to target ObjC APIs over verbatim ObjC.
For example, ObjC distinguishes between class methods and instance methods ...
+foo; /* class method declaration */
-bar; /* instance method declaration */
In Pascal, you could of course support this syntax verbatim, but you might want to make it at least a little more Pascal-like ...
procedure +foo; (* class method declaration *)
procedure -bar; (* instance method declaration *)
or still more Pascal-like ...
class procedure foo;
instance procedure bar;
or even ...
class method foo;
instance method bar;
The key is to find the right balance. Where there is a significant benefit of using ObjC syntax or syntax that closely resembles ObjC, one should probably not shy away from doing so. But where there is no such benefit, one might want to use syntax that is as Pascal-like as possible.
In my experience, the area where it makes most sense to use Smalltalk or ObjC syntax (or something close) is the message passing.
Objective-C has four visibility modes for instance variables: public, package, protected and private. One of those can be made default, for the other three some kind of qualifier will be needed. Possibly such qualifiers already exist in the native syntax.
Except for package (though in BP, private actually means package (unit) wide visibility, not entirely private).
If you allow switching between object models, then it shouldn't matter that the semantics of "private" in one model do not exactly match the semantics of "private" in another model. Clearly in a multiple-object-models language, the programmer needs to be aware of the specifics of each object model he wishes to use in his code.
If there is no reserved word for "package" in GPC to mark the corresponding visibility mode of the ObjC object model, it could either be added (and only be available when the ObjC object model is active) or another reserved word might be reused in its place, for example "unit".
A very important feature in Objective-C is the ability to add a method to a class outside of the scope of the compilation unit where the class is declared. In Smalltalk and Objective-C this is called a category.
IIUC, this works retroactively, i.e. if a module A calls a method of (sends a message to) an object declared in a module B, and module C extends that object via a category, even if neither A nor B know about C, it affects this call/message, right?
Yes, at runtime all methods are equal, there is no difference between those declared within the scope where the class itself is declared and those declared outside that scope. Categories are *lexical* extensions only. They allow spreading methods *lexically* over multiple files. At runtime, the lexical separation becomes invisible.
- Make Send a method (at least syntactically) that returns a
reference to the object, so calls wouldn't nest as much:
foobar.Send("setFoo:", 123)
      .Send("rotateLeft:", 456)
      .Send("splitAt:", 789)
      .Send("shuffle");
foobar.Send("setFoo:", foo.Send("count"))
      .Send("rotateLeft:", barbaz.Send(bam.Send("boo:", 123)));
Like I said, there are conceivably many other ways to design syntax for message passing. At the end of the day it comes down to striking the right balance of what users will find esthetically acceptable and what is practical. In any event, the message passing deserves to be given more thought and effort than other aspects of interfacing to ObjC.
Objective-C doesn't have templates nor does it require them because you can choose between static and dynamic typing.
^[citation required]
...
We've had this claim in this thread before. I asked the question, which is still unanswered: How do you implement a generic list type with the following properties:
...
So far this question has been ignored. Until I see a valid response, I will ignore further claims that templates are not needed.
This all comes down to different paradigms.
When different paradigms are discussed, it is easy to mistake statements of the form "A is *different* from B" for "A is better than B", and this often leads a discussion of different paradigms to slide into an argument about personal preferences. I will refrain from debating preferences.
Suppose I said "When eating soup, the Japanese do not require a spoon because they drink their soup from a bowl". Would anybody pick that apart along the lines of "Unless somebody can tell me how the Japanese eat their soup if they don't have a bowl at hand, then I will simply ignore this statement and conclude that the Japanese do require a spoon to eat their soup just like I do."?
No matter what the pros and cons of eating soup one way or the other way may be, the fact remains that one way a spoon is required, and the other way a spoon is not required.
A more technical analogy would be algebraic versus reverse polish notation in electronic calculators. RPN calculators do not require parentheses because they use postfix operators that operate on a stack of intermediate results.
No matter what the pros and cons of algebraic versus reverse polish notation may be, the fact remains that RPN does not require parentheses.
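A tiny postfix evaluator makes the point concrete (a sketch, not from the thread): the stack encodes the grouping, so no parentheses ever appear in the input.

```cpp
#include <cassert>
#include <sstream>
#include <stack>
#include <string>

// Evaluate a whitespace-separated RPN expression such as "3 4 + 2 *".
double eval_rpn(const std::string& expr)
{
    std::istringstream in(expr);
    std::stack<double> s;
    std::string tok;
    while (in >> tok) {
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/") {
            double b = s.top(); s.pop();   // operators pop two operands...
            double a = s.top(); s.pop();
            if      (tok == "+") s.push(a + b);
            else if (tok == "-") s.push(a - b);
            else if (tok == "*") s.push(a * b);
            else                 s.push(a / b);
        } else {
            s.push(std::stod(tok));        // ...numbers are pushed
        }
    }
    return s.top();
}

// "(3 + 4) * 2" in algebraic notation is "3 4 + 2 *" in RPN.
```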
Objective-C is a late bound language, C++ is early bound. That is a major difference in paradigm and as a result certain problems are solved using entirely different approaches. No matter what the pros and cons of early versus late bound may be, the fact remains that late bound languages do not require templating for generics.
Besides, you were missing the point which was this: Because ObjC is late bound, you do not require any mappings for templating syntax because there is no such thing.
In Objective-C constructors and destructors are just messages, so there is no need for special syntax and translations.
foo := [[FooClass alloc] initWithFoo: 123];
The emphasis was on automatic. So can you, as a class designer:
- Enforce certain actions that are always done on instantiation
(e.g. ensure that certain object fields are always initialized to certain values, which don't have to be constant, but might be a running index or something). I imagine this might be possible by implementing an instantiation method in the class that does this, but I'm not sure.
Indeed.
By convention, ObjC classes have a method for allocation (called "alloc") and at least one method for initialisation (called "init"). These methods are by default inherited from the super-class. However, you can of course override them with class specific implementations. Typically, a class specific implementation of init would first invoke the super-class' init method and then carry out any class specific initialisations thereafter.
In any event, a class is always instantiated by sending an alloc message to it and then sending an init message to the newly allocated object. There is no special syntax.
Thus, when you see a class instantiation such as ...
foo = [[FooClass alloc] init];
if you did implement a FooClass-specific init method, then the init message will invoke that method;
if you did not implement a FooClass-specific init method, then the init message will invoke the init method inherited from FooClass' super-class.
Overriding an inherited method is as simple as declaring and implementing the method just like any other method.
- Enforce certain actions that are always done on deletion of an
object, whichever way this happens.
Same principle as with initialisation.
Again by convention, ObjC classes have a method for retaining an object (called "retain") and a method for releasing an object (called "release"). In addition there is also a method called "autorelease". Again, these methods are inherited from the super-class.
When you send a retain message to an object, its reference count is incremented; when you send a release message to it, its reference count is decremented. If, as a result of your release message, the reference count drops to zero, the object is deallocated. Thus, even if you send a release message, should there still be other objects that have retained it, the object will not be deallocated immediately, but only when the last retaining object sends its release message.
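The retain/release discipline described above can be sketched as a toy C++ model (this is not the actual Objective-C runtime; Counted and its counter are invented for illustration):

```cpp
#include <cassert>

// The object deallocates itself when the last owner releases it.
class Counted {
public:
    void retain()  { ++refs; }
    void release() { if (--refs == 0) { ++deallocations; delete this; } }
    static int deallocations;    // observable side effect, for testing
private:
    int refs = 1;                // the creator owns one reference
    ~Counted() = default;        // private: only release() may delete
};
int Counted::deallocations = 0;
```

The private destructor enforces that the object lives on the heap and dies only through release(), which loosely mirrors how ObjC objects are managed.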
Again, you can override the inherited methods with class specific implementations that perform additional tasks.
E.g., if an object (not a pointer/reference) goes out of scope, its destructor is called; if a container (list, map, ...) is destroyed, for all of its contents their destructor is called, etc.
You can instantiate a class and send it an autorelease message ...
foo = [[[FooClass alloc] init] autorelease];
in which case it will be managed by a so called autorelease pool that will deallocate the object when it goes out of scope.
In any event, anything that happens to an object during its lifetime from allocation to deallocation happens to it by sending messages to it, which invoke methods implemented in the object's class or any of its super-classes (up to the root class). Any of these methods can be overridden. Any of these messages can be intercepted and redirected. The extent of your control is only limited by your knowledge of the API. There is no hidden magic anywhere that only the compiler has access to. Everything is accessible and controllable from within your own ObjC code.
For the avoidance of confusion I should perhaps mention that when you see other messages being used to instantiate, for example ...
foo = [FooClass new];
or
foo = [FooClass newWithBar: 123 andBaz: 456];
or
foo = [FooClass newFromArray: bar];
these are messages implemented by methods on top of alloc and init. For example, the new method would invoke alloc and init internally. In other words, it is just a convenience method. It is customary for class implementors to provide several newWith... or newFrom... methods as they see fit.
[fooException raise];
Does this do what raising exceptions in other languages do, i.e. "jump" to the nearest enclosing appropriate catch-block? Or does it just send the message and continue with the next statement?
That depends on how your exception handling code handles the exception. It may resume, it may re-raise or raise another exception, or it may abort. Whatever you see fit.
However, the primary reason for providing an Objective-C interface is usually to be able to use the Cocoa or GNUstep APIs. For that, you will need to interface with the Objective-C runtime library.
It seems so. (Which also means, since I have no real interest in either of them, I should probably get out of this discussion and leave it to those who might actually implement it.)
Implementing a useful and convenient interface to ObjC is certainly not a minor effort, so nobody can blame you if you don't want to get involved. However, as the primary compiler maintainer, even if somebody else came along to implement such an interface, you probably would want to keep yourself in the loop simply because any such interface would/will have an impact on the compiler as a whole. After all, whatever its exact shape might be, it would/will add an additional object model to the language.
Objective Modula-2 wrote:
As I had previously mentioned, it is my understanding that the aim of this part of the discussion was to explore how an interface for GPC to make use of ObjC libraries might be designed and implemented, what the caveats and challenges are etc.
For such an interface, whatever mapping takes place would have to be undertaken in the direction from Pascal *towards* ObjC and at a lower level from GPC's intermediate language *towards* ObjC.
Now, for the sole purpose of a person unfamiliar with ObjC but familiar with C++ to gain a better understanding of ObjC, I can see some value in articulating how certain ObjC semantics might conceivably be mapped in the opposite direction.
However, in order to design and build a useful and convenient interface for using ObjC libraries from within another language, the designers/implementors should be able to think in the ObjC paradigm space. Ideally they would have implemented something meaningful in ObjC/Cocoa/GNUstep.
Sure, but I'll leave this part to those who might actually implement it.
At this stage of the discussion, I find it increasingly difficult to tell whether the articulation of conceivable mappings of ObjC to C++ is actually meant to be for the sole purpose of gaining a better understanding of ObjC itself.
I'm also trying to find out about the benefits of Objective-C as a language (not only as the de-facto interface of Cocoa and GNUstep).
- Use compiler switches, as GPC already does for the 4 supported
 object models.
I personally prefer the language design philosophy where pragmas do not change the meaning of source code, but yes, the use of pragmas is certainly a workable approach to switching between object models.
We're talking about GPC extensions, and GPC works this way massively already.
- Actually support Objective-C syntax (if it doesn't cause too many
 syntax conflicts), whether or not one is trying to keep compatible  objects.
There is an increasing number of languages that follow this approach. It has its advantages but some people argue that the source code is then no longer portable to compilers that do not support the syntax.
Yet, in order to retain source code compatibility with non-ObjC interfacing dialects of the host language, there must be a complete mapping of existing syntax for the native object model to the ObjC object model. If only one piece of additional syntax is added, compatibility is lost.
As it is often difficult to map existing syntax to all the required semantics of a foreign object model, one might hold the view, if we are not going to be compatible anyway, we might as well support ObjC syntax if it makes the task easier.
Agreed. The main question is if the syntax can easily be mixed in. But Bison will tell us so quickly (which is why, if someone goes this route, I'd suggest to add the grammar first and see if it works without conflicts -- if there are (unsolvable) conflicts, it will be early enough to redesign, and one knows specifically what the problematic areas are).
Objective-C has four visibility modes for instance variables: public, package, protected and private. One of those can be made default, for the other three some kind of qualifier will be needed. Possibly such qualifiers already exist in the native syntax.
Except for package (though in BP, private actually means package (unit) wide visibility, not entirely private).
If you allow switching between object models, then it shouldn't matter that the semantics of "private" in one model do not exactly match the semantics of "private" in another model. Clearly in a multiple-object-models language, the programmer needs to be aware of the specifics of each object model he wishes to use in his code.
If there is no reserved word for "package" in GPC to mark the corresponding visibility mode of the ObjC object model, it could either be added (and only be available when the ObjC object model is active) or another reserved word might be reused in its place, for example "unit".
This shouldn't be a big problem. Actually, what I meant is that semantically, GPC already has the four visibility modes (though I'm not sure if it implements them currently, but if not, it should). Adding a new (conditional) keyword is also not a big deal.
A very important feature in Objective-C is the ability to add a method to a class outside of the scope of the compilation unit where the class is declared. In Smalltalk and Objective-C this is called a category.
IIUC, this works retroactively, i.e. if a module A calls a method of (sends a message to) an object declared in a module B, and module C extends that object via a category, even if neither A nor B know about C, it affects this call/message, right?
Yes, at runtime all methods are equal, there is no difference between those declared within the scope where the class itself is declared and those declared outside that scope. Categories are *lexical* extensions only. They allow spreading methods *lexically* over multiple files. At runtime, the lexical separation becomes invisible.
OK. I'm mostly getting confused by the word "category", which in this sense seems to have no relation to either its common-language meaning or its mathematical one.
So IIUC, this eliminates many optimization opportunities, since even if all involved class declarations and implementations are fully known, the compiler can't know whether some of them will be modified later with a category.
Objective-C doesn't have templates nor does it require them because you can choose between static and dynamic typing.
^[citation required]
...
We've had this claim in this thread before. I asked the question, which is still unanswered: How do you implement a generic list type with the following properties:
...
So far this question has been ignored. Until I see a valid response, I will ignore further claims that templates are not needed.
This all comes down to different paradigms.
When different paradigms are discussed, it is easy to mistake statements of the form "A is *different* from B" for "A is better than B", and this often causes a discussion of different paradigms to slide into an argument about personal preferences. I will refrain from debating preferences.
Suppose I said "When eating soup, the Japanese do not require a spoon because they drink their soup from a bowl". Would anybody pick that apart along the lines of "Unless somebody can tell me how the Japanese eat their soup if they don't have a bowl at hand, I will simply ignore this statement and conclude that the Japanese require a spoon to eat their soup just like I do."?
Wrong comparison. If it's really the case that I'm missing something [-> drinking from a bowl], then my question would translate to: "Unless somebody can tell me how the Japanese eat their soup without using spoons, I doubt they can eat soup at all", and you could reply, "They drink it from a bowl", and I'd stand corrected.
I didn't make any arbitrary restrictions [-> no bowl]. I asked about a very concrete use-case (a list of any given type), and how to implement it. Of course, any features of Objective-C are permissible.
A more technical analogy would be algebraic versus reverse polish notation in electronic calculators. RPN calculators do not require parentheses because they use postfix operators that operate on a stack of intermediate results.
And it can be shown, in a mathematically rigorous way, that anything that can be expressed in one of those notations can be expressed in the other one as well. That's just what I'm asking here. So far you've just said one wouldn't want to do it. (To apply it to the analogy, with a grain of salt, it's as if I asked how you expressed logarithms in your notation, and you said you don't want to write logarithms because in practice you'd approximate them with the first 3 terms of their Taylor series, which requires only elementary arithmetic.)
Objective-C is a late bound language, C++ is early bound. That is a major difference in paradigm and as a result certain problems are solved using entirely different approaches.
Well, are they solved? Again, I restate my question: Suppose you use a list and want to be sure that it only contains objects of a certain type (*). How do you do it? (This is not a rhetorical question; it's the very question whose answer I'm missing.)
- You can add runtime checks (or the language does them automatically). But then, if you get a wrong type, the program will just raise an error (in whichever form). If you shipped the program, it's too late. So this is an unsatisfying solution.
- You can do extensive testing before shipping. But as we all know, testing can only find bugs and never prove the absence of bugs. So you still have no assurance.
- Formal program verification is hopeless for any somewhat bigger program.
- In contrast: With schema types, you declare your types and the compiler does all the checks before the program is even run.
(*) I'd be surprised if I had to give more concrete examples, as in my own code almost 100% of my lists (maps, trees, ...) are of this kind rather than a "list of anything".
In the end it boils down to compile-time vs. runtime checking. Needless to say, I'm a strong proponent of compile-time checking whenever possible. Since Objective-C is a typed language (at least to some degree -- not all variables are of type "id", but often of specific classes), it's not unreasonable to expect this to extend to containers as well.
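To make the compile-time side of this concrete, here is a minimal C++ sketch (TypedList and its method names are made up for illustration) of a list whose element type is checked entirely by the compiler:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A homogeneous list: the element type is part of the list's type, so
// inserting a value of the wrong type is rejected at compile time --
// before the program is ever run, let alone shipped.
template <typename T>
class TypedList {
public:
    void add(const T &item) { items.push_back(item); }
    const T &at(std::size_t i) const { return items.at(i); }
    std::size_t count() const { return items.size(); }
private:
    std::vector<T> items;
};

inline std::size_t demo() {
    TypedList<std::string> names;
    names.add("foo");
    names.add("bar");
    // names.add(42);  // would not compile: int is not a std::string
    return names.count();
}
```

The point is not the container itself but where the check happens: the commented-out line is a bug the compiler refuses to accept.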
Besides, you were missing the point which was this: Because ObjC is late bound, you do not require any mappings for templating syntax because there is no such thing.
As you wrote above, it's also about Objective-C as a target language. In this case, it is relevant how to map such features into it. (As far as I see so far, there is no direct mapping, so it would all have to be expanded in the frontend and mapped in a low-level, C-like way.)
In Objective-C constructors and destructors are just messages, so there is no need for special syntax and translations.
foo := [[FooClass alloc] initWithFoo: 123];
The emphasis was on automatic. So can you, as a class designer:
- Enforce certain actions that are always done on instantiation
 (e.g. ensure that certain object fields are always initialized to certain values, which don't have to be constant, but might be a running index or something). I imagine this might be possible by implementing an instantiation method in the class that does this, but I'm not sure.
Indeed.
By convention, ObjC classes have a method for allocation (called "alloc") and at least one method for initialisation (called "init"). These methods are by default inherited from the super-class. However, you can of course override them with class specific implementations. Typically, a class specific implementation of init would first invoke the super-class' init method and then carry out any class specific initialisations thereafter.
OK.
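For comparison, the same chaining discipline exists in C++ constructors, except that there the compiler enforces it: the base-class constructor always runs before any derived-class initialization. A hypothetical sketch (Base, Derived and the running index are invented for illustration):

```cpp
#include <string>
#include <utility>

// The base class guarantees its own initialization runs first -- the
// analogue of an init method that starts by invoking the super-class'
// init. The running index mirrors the example from the question above.
class Base {
public:
    Base() : id(nextId++) {}            // always runs before Derived's body
    int getId() const { return id; }
private:
    inline static int nextId = 0;       // running index, not a constant
    int id;
};

class Derived : public Base {
public:
    explicit Derived(std::string n) : name(std::move(n)) {}  // Base() implicit
    const std::string &getName() const { return name; }
private:
    std::string name;
};
```

In ObjC the chaining is a convention (call [super init] first); in C++ it cannot be skipped.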
In any event, a class is always instantiated by sending an alloc message to it and then sending an init message to the newly allocated object. There is no special syntax.
Thus, when you see a class instantiation such as ...
foo = [[FooClass alloc] init];
if you did implement a FooClass specific init method, then the init message will invoke that method;
if you did not implement a FooClass specific init method, then the init message will invoke the init method inherited from FooClass' super-class.
Overriding an inherited method is as simple as declaring and implementing the method just like any other method.
But if you just call [FooClass alloc], then init will not be called? This means for the class designer, if you want to enforce something to happen for all new objects, you put it in alloc, not init, right?
- Enforce certain actions that are always done on deletion of an object, whichever way this happens.
Same principle as with initialisation.
Again by convention, ObjC classes have a method for retaining an object (called "retain") and a method for releasing an object (called "release"). In addition there is also a method called "autorelease". Again, these methods are inherited from the super-class.
When you send a retain message to an object, its reference count is incremented; when you send a release message to it, its reference count is decremented. If as a result of your release message the reference count goes to zero, the object is deallocated. Thus, even if you send a release message, should there still be other objects that retained the object, it will not immediately be deallocated, but only when the last retaining object has sent a release message.
Again, you can override the inherited methods with class specific implementations that perform additional tasks.
OK.
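As an aside, this retain/release counting is essentially what C++'s std::shared_ptr automates (the Resource type below is just a placeholder):

```cpp
#include <memory>

// Each copy of a shared_ptr is a "retain", each destruction a "release";
// the object is deallocated only when the count drops to zero.
struct Resource { int value = 123; };

inline long ownersAfterSharing() {
    std::shared_ptr<Resource> a = std::make_shared<Resource>();
    std::shared_ptr<Resource> b = a;  // "retain": count is now 2
    return a.use_count();             // both owners still alive here
}

inline bool releasedWhenLastOwnerGone() {
    std::weak_ptr<Resource> w;        // observes without retaining
    {
        auto a = std::make_shared<Resource>();
        w = a;
    }                                 // last "release": object deallocated
    return w.expired();               // true once the count hit zero
}
```

The difference is that in C++ the counting is driven by scopes and copies rather than by explicit messages.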
In any event, anything that happens to an object during its lifetime from allocation to deallocation happens to it by sending messages to it, which invoke methods implemented in the object's class or any of its super-classes (up to the root class). Any of these methods can be overridden. Any of these messages can be intercepted and redirected. The extent of your control is only limited by your knowledge of the API. There is no hidden magic anywhere that only the compiler has access to. Everything is accessible and controllable from within your own ObjC code.
You may notice that I don't consider compiler magic a bad thing in general; IMHO it can be useful. (And I don't think I'm alone in this -- Pascal does much more "behind the scenes" than C, and most Pascal programmers seem to like it.)
So if you allocate an object, never "retain" or "release" it, and just let it go out of scope, no method will be called (or message be sent), so the class designer has no way to properly clean up?
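For comparison, in C++ the destructor covers exactly this case: cleanup runs automatically on scope exit, however the scope is left. A minimal sketch (Guard and the counter are made up):

```cpp
// RAII: the destructor runs automatically whenever the object leaves
// its scope, so the class designer can enforce cleanup unconditionally,
// without the user ever sending an explicit "release".
class Guard {
public:
    explicit Guard(int *counter) : cleanups(counter) {}
    ~Guard() { ++*cleanups; }   // guaranteed cleanup action
private:
    int *cleanups;
};

inline int cleanupCount() {
    int n = 0;
    {
        Guard g(&n);            // n is still 0 while g is alive
    }                           // scope exit: ~Guard runs automatically
    return n;                   // cleanup has happened exactly once
}
```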
For the avoidance of confusion I should perhaps mention that when you see other messages being used to instantiate, for example ...
foo = [FooClass new];
or
foo = [FooClass newWithBar: 123 andBaz: 456];
or
foo = [FooClass newFromArray: bar];
these are messages implemented by methods on top of alloc and init. For example, the new method would invoke alloc and init internally. In other words it is just a convenience method. It is customary for class implementors to provide several newWith... or newFrom... methods as they see fit.
Sure. (In C++ you do the same by implementing several constructors or allocation functions -- which would be global functions, not class-methods, though.)
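A sketch of that C++ counterpart, using static factory functions (Foo and the method names are invented to mirror the ObjC examples above):

```cpp
#include <vector>

class Foo {
public:
    // Named factories play the role of newWithBar:andBaz: / newFromArray:.
    static Foo withBarAndBaz(int bar, int baz) { return Foo(bar + baz); }
    static Foo fromArray(const std::vector<int> &a) {
        int sum = 0;
        for (int x : a) sum += x;
        return Foo(sum);
    }
    int total() const { return value; }
private:
    explicit Foo(int v) : value(v) {}  // like alloc+init: reachable only
    int value;                         // through the named factories
};
```

Making the constructor private is optional, but it gives the factories the same gate-keeping role that the newWith... convention plays in ObjC.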
[fooException raise];
Does this do what raising exceptions in other languages do, i.e. "jump" to the nearest enclosing appropriate catch-block? Or does it just send the message and continue with the next statement?
That depends on how your exception handling code handles the exception. It may resume, it may re-raise or raise another exception, or it may abort. Whatever you see fit.
So for the "normal" case (compared to other EH languages), the EH code would jump to the respective catch-block, but this is implemented in plain code? So I assume there's some sort of "nonlocal goto" primitive that's used to implement it?
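If I'm not mistaken, the traditional ObjC exception macros are built on exactly such a primitive: C's setjmp/longjmp. A minimal sketch of the mechanism itself (not of the ObjC macros; the names are made up):

```cpp
#include <csetjmp>

// The classic "nonlocal goto": longjmp unwinds directly back to the
// matching setjmp, skipping all frames in between -- the primitive a
// library-level exception mechanism can be built on.
static std::jmp_buf handler;

static void deeplyNested() {
    std::longjmp(handler, 1);   // "raise": jump back to the setjmp site
}

inline int runWithHandler() {
    if (setjmp(handler) == 0) { // "try": setjmp returns 0 when first called
        deeplyNested();
        return -1;              // never reached
    }
    return 42;                  // "handler": reached only via longjmp
}
```

(In C++ one has to be careful that no objects with nontrivial destructors are jumped over; here there are none.)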
Frank