Hi all,
I know short strings (255 character limit, 0 index containing the length) have been discussed to death, and I'm not going to discuss them further, but I did want to point out for those of you not on the MacPascal mailing list that everyone asking "Which should I use, FPC or GPC?" is basically ending up with FPC, primarily because GPC lacks short string support.
I have nothing against FPC, and indeed the developers have been great at supporting Mac compatibility and so forth.
But since I use GPC, and since GPC has lots of really cool other features (like long Strings for example ;-), I'd love to encourage people to use GPC. I personally believe the little extra work required to use wrappers when short strings are needed is no big deal, but it's hard to argue that "it's just a little more work" to a developer who is already faced with making a big change in their environment and code.
So basically, I'm just saying that the lack of short strings is seriously hindering GPC adoption on the Mac, and that that should probably be taken into account when pondering the priority of their implementation.
Thanks, Peter.
Peter N Lewis wrote:
... snip ...
> So basically, I'm just saying that the lack of short strings is seriously hindering GPC adoption on the Mac, and that that should probably be taken into account when pondering the priority of their implementation.
You should be using the standard functions for access to the length parameter, after which you don't care how strings are implemented. The so-called short strings are implementable as records if compatibility with existing binaries or files is required.
One of the major points of Pascal is simply to eliminate machine dependence, and rely on abstractions. Thus integers are fully described by maxint, and smaller ranges should be typed as subranges, etc.
(BCC to "Jon" who asked basically the same in PM, perhaps related. BCC as I don't know if you want your address publicly known, but this way you won't get further replies, unless you subscribe to the list or read the archives.)
CBFalconer wrote:
> Peter N Lewis wrote:
> ... snip ...
> > So basically, I'm just saying that the lack of short strings is seriously hindering GPC adoption on the Mac, and that that should probably be taken into account when pondering the priority of their implementation.
I see. When I have more than a few days to work at GPC (currently, I'm still busy with other work, you know, the paying kind ...), I'll see if I can do it. Unless Waldek beats me to it ...
On a tangent, what do you think is the future of short strings anyway? For one, there's the 255 chars limitation. While still enough for many purposes, it might not be for some, e.g. file names with several long directories etc. could get longer. Does Mac OS support those at all, with a different interface, or what? Secondly, charsets. When 16-bit charsets (i.e., Unicode) become common (or are already), will/should there be a 16-bit "short" string (i.e., max. 65535 characters)? Or else, if the interfaces will/do use UTF-8, we'll need conversions (automatic and/or manual) between Unicode and UTF-8 anyway? (As discussed before, UTF-8 doesn't make for a string type in Pascal.) Just to figure out if supporting short strings will only be a short-lived solution and more work in this area will be required soon anyway ...
> You should be using the standard functions for access to the length parameter, after which you don't care how strings are implemented. The so-called short strings are implementable as records if compatibility with existing binaries or files is required.
> One of the major points of Pascal is simply to eliminate machine dependence, and rely on abstractions. Thus integers are fully described by maxint, and smaller ranges should be typed as subranges, etc.
This is true at a high level, when no external interfaces are a concern. When it comes to binary file formats (as you mentioned), or network protocols, or external binary interfaces (libraries or, like here, the OS), things such as storage size, alignment and byte order matter, and those are not described in Pascal.
BTW, in the case of short strings, they're better described as an array, starting from 0, with the Ord of the 0'th element representing the length, as e.g. BP does explicitly. A record would need an Integer field for the length, and, apart from alignment etc., it's not at all certain this is of the same size as a Char, even if declared as a subrange (e.g., in GPC it wouldn't be -- except for a packed record, which in turn wouldn't e.g. allow passing the elements by reference which e.g. BP short strings do allow).
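As a rough illustration of the layout described above (not GPC or BP code), the following Python sketch models a short string as a byte array whose element 0 holds the length. Python is used only because it makes the bytes easy to inspect; the function names and the `capacity` parameter are assumptions made for this example.

```python
# Sketch of the classic short-string layout: element 0 holds the length,
# elements 1..capacity hold the characters. All names are illustrative.

def to_short_string(text: bytes, capacity: int = 255) -> bytearray:
    """Build a buffer of capacity + 1 bytes from raw character bytes."""
    if not 0 < capacity <= 255:
        raise ValueError("capacity must fit in one length byte (1..255)")
    if len(text) > capacity:
        raise ValueError("text does not fit in this short string")
    buf = bytearray(capacity + 1)    # element 0 plus `capacity` characters
    buf[0] = len(text)               # the length lives in element 0
    buf[1:1 + len(text)] = text      # the characters start at index 1
    return buf


def from_short_string(buf: bytes) -> bytes:
    """Read the characters back, using element 0 as the length."""
    length = buf[0]
    return bytes(buf[1:1 + length])
```

Because the length is simply the first array element, the whole value occupies exactly capacity + 1 bytes with no alignment padding, which is the property a record with an Integer length field would not guarantee.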
AIUI, the Mac Pascal interfaces make heavy use of such short strings. One option would be conversion to/from them for every input/output parameter (wouldn't work for true reference parameters -- don't know if that's required). This would generally require copying the whole string. Ironically, converting to C-Strings (input parameters only, as long as the strings do not contain Chr (0)) is easier, since one can keep the string in place and only has to add a Chr (0) if space is reserved in advance (as GPC does). That's why we have fewer such problems with C-string based OS interfaces (POSIX, Dos, Windows, ...), as most string parameters are input, and extra work is only required for the few other cases.
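The difference in cost between the two conversions can be sketched as follows. This is a Python model under simplifying assumptions, not GPC's actual runtime: the string is taken to be a byte buffer with one spare byte reserved after the characters, and both function names are hypothetical.

```python
# Model: `buf` holds `length` characters followed by at least one spare byte.

def as_c_string_in_place(buf: bytearray, length: int) -> memoryview:
    """Terminate in place: one byte is written, no characters are moved."""
    if length >= len(buf):
        raise ValueError("no spare byte reserved for the terminator")
    if 0 in buf[:length]:
        raise ValueError("Chr (0) inside the string cannot be represented")
    buf[length] = 0
    return memoryview(buf)[:length + 1]


def as_short_string_copy(buf: bytearray, length: int) -> bytearray:
    """The length byte goes in front, so every character has to be copied."""
    if length > 255:
        raise ValueError("short strings hold at most 255 characters")
    out = bytearray(length + 1)
    out[0] = length
    out[1:] = buf[:length]
    return out
```

The first function only works for input parameters, as noted above: the callee sees a terminated string, but nothing it writes back would update the caller's length.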
Frank
At 1:09 +0200 4/7/06, Frank Heckenbach wrote:
> CBFalconer wrote:
> > Peter N Lewis wrote:
> > > So basically, I'm just saying that the lack of short strings is seriously hindering GPC adoption on the Mac, and that that should probably be taken into account when pondering the priority of their implementation.
> On a tangent, what do you think is the future of short strings anyway? For one, there's the 255 chars limitation. While still enough for many purposes, it might not be for some, e.g. file names with several long directories etc. could get longer. Does Mac OS support those at all, with a different interface, or what? Secondly, charsets. When 16-bit charsets (i.e., Unicode) become common (or are already), will/should there be a 16-bit "short" string (i.e., max. 65535 characters)? Or else, if the interfaces will/do use UTF-8, we'll need conversions (automatic and/or manual) between Unicode and UTF-8 anyway? (As discussed before, UTF-8 doesn't make for a string type in Pascal.) Just to figure out if supporting short strings will only be a short-lived solution and more work in this area will be required soon anyway ...
The current and future APIs use CFStrings (CF = Core Foundation), which are opaque objects (real objects under Objective-C) that handle strings of arbitrary length and encoding. Pretty much all future uses of strings in the API will be CFStrings, which I tell people when advocating for GPC. But that doesn't help them port to GPC in the first place, and it conflicts with my other advice, which is to make changes one at a time where possible (e.g., port to GPC, get it working, then port to the newer APIs). There remain a lot of old "short string" uses in the APIs, and quite often in file structures as well: storing a fixed Str31 in a fixed 32 bytes is a lot easier than dealing with a variable-size data structure, and even though it wastes space, ease of programming often outweighs minimal storage.
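To show why a fixed Str31 field is so convenient in a file structure, here is a small Python sketch. The record layout (a Str31 name followed by a 32-bit size) and the big-endian byte order are assumptions invented for this example, not a real Mac file format. Python's `struct` module happens to have a `p` format code for exactly this kind of length-prefixed string.

```python
import struct

# Hypothetical record: a Str31 name followed by a 32-bit unsigned size.
# "32p" means 1 length byte plus 31 character bytes; ">" selects
# big-endian order with no alignment padding (an assumption of this sketch).
RECORD = struct.Struct(">32pI")


def pack_record(name: bytes, size: int) -> bytes:
    """Serialize one record; it always occupies RECORD.size bytes."""
    if len(name) > 31:
        raise ValueError("a Str31 holds at most 31 characters")
    return RECORD.pack(name, size)


def unpack_record(data: bytes):
    """Return (name, size); the name is cut to its stored length."""
    name, size = RECORD.unpack(data)
    return name, size
```

Every record is the same size regardless of the name's length, so the n-th record can be located at offset n * RECORD.size without scanning, at the cost of the unused bytes in short names.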
> AIUI, the Mac Pascal interfaces make heavy use of such short strings. One option would be conversion to/from them for every input/output parameter (wouldn't work for true reference parameters -- don't know if that's required). This would generally require copying the whole string. Ironically, converting to C-Strings (input parameters only, as long as the strings do not contain Chr (0)) is easier, since one can keep the string in place and only has to add a Chr (0) if space is reserved in advance (as GPC does). That's why we have fewer such problems with C-string based OS interfaces (POSIX, Dos, Windows, ...), as most string parameters are input, and extra work is only required for the few other cases.
Yes, there are basically two ways of handling it in GPC currently - either use fake short strings throughout your program, converting to/from real GPC strings as needed for string operations and using the overloaded operators and such, or convert to/from fake short strings for API/file system access. Adriaan advocates the former, I advocate the latter, but either one makes for a fair amount of work initially. After you get the work done and get used to it, it is not too bad, but that doesn't sway newcomers making a choice between an easy FPC port or a harder GPC port. Explaining all the Extended Pascal goodness in GPC rarely sways people in such a position.
In any event, as I said up front, I'm not intending to advocate that short strings are wonderful or that there aren't any other alternatives, or any such thing - I love the long string support in GPC, it more than made up for the short-term pain of conversion. What I am saying is that it is turning away people with traditional Mac code to port before they get a chance to find out how cool GPC is.
Thanks, Peter.
Joseph Ayers wrote:
> May I put in a vote for short (UCSD Pascal) string support? I'm trying to revive several legacy programs based on the Think Pascal and CodeWarrior development environments. See:
> http://rsb.info.nih.gov/nih-image/ for an example.
> Based on my experience so far, I'd say that 80% of the conversion effort has been dealing with the lack of short string support. Concat is a prime culprit: its variable argument lists require major hand-coded effort. The legacy code base in these environments is legion in the academic arena.
I see. (BTW, a quick, though not really nice, workaround for Concat might be just to provide Concat2, Concat3, etc., as many as needed, that do the conversions. Then the main code would only need inserting a number on each call, which could be removed again quite automatically later ...)
Peter N Lewis wrote:
> In any event, as I said up front, I'm not intending to advocate that short strings are wonderful or that there aren't any other alternatives, or any such thing - I love the long string support in GPC, it more than made up for the short-term pain of conversion. What I am saying is that it is turning away people with traditional Mac code to port before they get a chance to find out how cool GPC is.
OK. BTW, I wasn't saying that short strings are evil, just really trying to find out how they're used and needed (as I wrote in the other mail, for former BP programmers such as myself, the issues were much smaller).
Frank