May I put in a vote for short (UCSD Pascal) string support? I'm trying to revive several legacy programs based on the Think Pascal and CodeWarrior development environments. See:
http://rsb.info.nih.gov/nih-image/ for an example.
Based on my experience so far, I'd say that 80% of the conversion effort has been dealing with the lack of short string support. Concat is a prime culprit: its variable argument lists require major hand-coded effort. The legacy code base in these environments is legion in the academic arena.
Joseph Ayers
(BCC to "Jon" who asked basically the same in PM, perhaps related. BCC as I don't know if you want your address publicly known, but this way you won't get further replies, unless you subscribe to the list or read the archives.)
CBFalconer wrote:
Peter N Lewis wrote:
... snip ...
So basically, I'm just saying that the lack of short strings is seriously hindering GPC adoption on the Mac, and that that should probably be taken into account when pondering the priority of their implementation.
I see. When I have more than a few days to work at GPC (currently, I'm still busy with other work, you know, the paying kind ...), I'll see if I can do it. Unless Waldek beats me to it ...
On a tangent, what do you think is the future of short strings anyway? For one, there's the 255 chars limitation. While still enough for many purposes, it might not be for some, e.g. file names with several long directories etc. could get longer. Does Mac OS support those at all, with a different interface, or what? Secondly, charsets. When 16-bit charsets (i.e., Unicode) become common (or are already), will/should there be a 16-bit "short" string (i.e., max. 65535 characters)? Or else, if the interfaces will/do use UTF-8, we'll need conversions (automatic and/or manual) between Unicode and UTF-8 anyway? (As discussed before, UTF-8 doesn't make for a string type in Pascal.) Just to figure out if supporting short strings will only be a short-lived solution and more work in this area will be required soon anyway ...
You should be using the standard functions for access to the length parameter, after which you don't care how strings are implemented. The so-called short strings are implementable as records if compatibility with existing binaries or files is required.
One of the major points of Pascal is simply to eliminate machine dependence, and rely on abstractions. Thus integers are fully described by maxint, and smaller ranges should be typed as subranges, etc.
This is true on a high-level, when no external interfaces are a concern. When it comes to binary file formats (as you mentioned), or network protocols, or external binary interfaces (libraries or, like here, the OS), things such as storage size, alignment and byte order matter, and those are not described in Pascal.
BTW, in the case of short strings, they're better described as an array, starting from 0, with the Ord of the 0'th element representing the length, as e.g. BP does explicitly. A record would need an Integer field for the length, and, apart from alignment etc., it's not at all certain this is of the same size as a Char, even if declared as a subrange (e.g., in GPC it wouldn't be -- except for a packed record, which in turn wouldn't e.g. allow passing the elements by reference which e.g. BP short strings do allow).
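To make the layout point concrete, here is a minimal sketch in C, chosen only because it exposes the bytes directly. All names (ShortStr, RecordStr, short_len, short_from_c, upcase_char) are hypothetical, C's int merely stands in for a Pascal Integer length field, and truncating at 255 characters is an assumption of this sketch, not something any compiler is claimed to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical short string: 256 bytes, element 0 holds the length,
   elements 1..255 hold the characters. */
typedef unsigned char ShortStr[256];

/* For contrast: a record with an integer length field. On typical
   platforms this is larger than 256 bytes, so it does not match the
   binary layout of a short string. */
struct RecordStr {
    int length;
    unsigned char chars[255];
};

/* The length is simply the ordinal value of element 0. */
static size_t short_len(const unsigned char *s)
{
    return s[0];
}

/* Store a NUL-terminated C string as a short string.
   Assumption of this sketch: input longer than 255 is truncated. */
static void short_from_c(unsigned char *dst, const char *src)
{
    size_t n;
    assert(dst != NULL && src != NULL);
    n = strlen(src);
    if (n > 255)
        n = 255;
    dst[0] = (unsigned char)n;
    memcpy(dst + 1, src, n);
}

/* Every character is an ordinary, addressable array element, so a
   single element can be passed by reference and modified in place. */
static void upcase_char(unsigned char *c)
{
    if (*c >= 'a' && *c <= 'z')
        *c = (unsigned char)(*c - 'a' + 'A');
}
```

With the array form, short_len reads one byte and upcase_char(&s[1]) works on an element directly, which is the by-reference property mentioned above. The record form already fails on size before alignment is even considered.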
AIUI, the Mac Pascal interfaces make heavy use of such short strings. One option would be conversion to/from them for every input/output parameter (wouldn't work for true reference parameters -- don't know if that's required). This would generally require copying the whole string. Ironically, converting to C-Strings (input parameters only, as long as the strings do not contain Chr (0)) is easier, since one can keep the string in place and only has to add a Chr (0) if space is reserved in advance (as GPC does). That's why we have fewer such problems with C-string based OS interfaces (POSIX, Dos, Windows, ...), as most string parameters are input, and extra work is only required for the few other cases.
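The difference in cost can be sketched as follows, again in C. CountedStr is a simplified, hypothetical model of a length-counted string with one spare byte reserved after the characters; it is not GPC's actual representation, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { CAP = 255 };

/* Hypothetical length-counted string. One extra byte is reserved
   after the characters so a terminator always fits. */
typedef struct {
    size_t length;
    char chars[CAP + 1];
} CountedStr;

/* To a C string, in place: write the terminator into the reserved
   byte and hand out the existing buffer. Nothing is copied. Only
   meaningful for input parameters and for strings that contain no
   embedded zero character. */
static const char *counted_as_c(CountedStr *s)
{
    assert(s->length <= CAP);
    s->chars[s->length] = '\0';
    return s->chars;
}

/* To a short string: the length byte must sit directly in front of
   the characters, so the whole string is copied into a separate
   256-byte buffer. */
static void counted_to_short(const CountedStr *s, unsigned char *dst)
{
    assert(s->length <= CAP);
    dst[0] = (unsigned char)s->length;
    memcpy(dst + 1, s->chars, s->length);
}
```

counted_as_c returns a pointer into the original storage, while counted_to_short needs a temporary buffer per parameter, and for output parameters the result would have to be copied back as well.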
Frank
-- 
Frank Heckenbach, f.heckenbach@fh-soft.de, http://fjf.gnu.de/, 7977168E
GPC To-Do list, latest features, fixed bugs: http://www.gnu-pascal.de/todo.html
GPC download signing key: ACB3 79B2 7EB2 B7A7 EFDE D101 CD02 4C9D 0FE0 E5E8