Frank Heckenbach wrote:
CBFalconer wrote:
Pascal is somewhat unique in the way it encourages use of input streams. All input parsing leaves the terminating char in f^, so that the reason for rejection, or the existence of further data in a line, can be detected from that and the eoln condition. [...]
Once you have this sort of input routine, there is very little need for a string parsing mechanism. The problem is largely that C programmers are used to such a technique, and many will be lost without it.
I don't follow your last paragraph. Do you mean that any input data must come from files, or that if you have data in a string, you'd rather want to put it to a (temporary?) file first to parse it?
No, I mean that C programmers have no built-in way of dealing with streams other than with scanf, which is normally a horror for error recovery. So they prefer to read complete strings (normally requiring a presized buffer [1]) and do conversions from there.
[1] Apart from using my ggets, available at: http://cbfalconer.home.att.net/download/ggets.zip
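For readers less familiar with the C habit described above, here is a minimal sketch of the "read a whole line, then convert" style, assuming the line is already in a buffer (filled by fgets, ggets, or similar). The name parse_long and its exact rejection rules are illustrative assumptions, not anything from the posts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* parse_long is a hypothetical helper, not taken from the thread.
   It converts the whole of `line` (optionally ending in '\n') to a long.
   Returns true on success; false on empty input, trailing junk, or overflow. */
bool parse_long(const char *line, long *out)
{
    char *end;
    long value;

    assert(line != NULL && out != NULL);

    errno = 0;
    value = strtol(line, &end, 10);   /* skips leading white space itself */

    if (end == line)                  /* no digits were consumed */
        return false;
    if (errno == ERANGE)              /* value does not fit in a long */
        return false;
    if (*end == '\n')                 /* tolerate the newline fgets keeps */
        end++;
    if (*end != '\0')                 /* anything else is trailing junk */
        return false;

    *out = value;
    return true;
}
```

The caller learns only that the conversion failed, not why or where, which is part of what the f^ convention in Pascal avoids.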
... snip ...
So my question is really not whether to convert between strings and numbers, but only how to (most elegantly). Numbers to strings are easy to do with standard functions (such as `Integer2String') because the integer is a value parameter and thus quite flexible, and there are no error conditions (since the result type can be chosen big enough to hold any possible value). Strings to numbers is a little more tricky, since there are error conditions, so the result and the error cannot both be the result.
The conversion routines for this normally pre-exist to implement write(textfile, ...). The only thing necessary is to extend that mechanism to strings. I would consider a cleaner method to be providing a way to attach a file to an internal string. However, as you say, the need to perform reset/rewrite on such a 'stringfile' might be exorbitant. Thus a family of sread and swrite(string, index, ....) might be easier. Again, they would draw on the same underlying routines. The usual colon syntax for field size etc. would be useful.
To me, once a programmer has learned to use the Pascal i/o facilities (read, write, readln, writeln, put, get, f^, eof, eoln) he should not need to learn anything more. Adding a letter to the routine names, and substituting "string, index" for "file" covers the added knowledge to deal with string conversions. That is why I simply added the letter x to provide error returns in read. String writing will need the error returns to handle overflowing the string capacity.
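As a rough illustration of why string writing needs an error return, here is a sketch in C of something shaped like swrite(string, index, value : width). The name swrite_long, the explicit capacity parameter, and the choice to return false and leave the string untouched on overflow are all assumptions made for the example; the posts do not specify them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical analogue of swrite(s, index, value : width).
   Writes `value` right-justified in at least `width` columns at buf[*index]
   and keeps buf NUL-terminated. `cap` is the total size of buf.
   Returns false, leaving buf and *index unchanged, if the text would not fit. */
bool swrite_long(char *buf, size_t cap, size_t *index, long value, int width)
{
    char tmp[64];
    int n;

    assert(buf != NULL && index != NULL && *index < cap);

    n = snprintf(tmp, sizeof tmp, "%*ld", width, value);
    if (n < 0 || (size_t)n >= sizeof tmp)   /* formatting failed or was cut */
        return false;
    if ((size_t)n >= cap - *index)          /* need n chars plus the NUL */
        return false;

    memcpy(buf + *index, tmp, (size_t)n + 1);
    *index += (size_t)n;                    /* index now sits on the NUL */
    return true;
}
```

The index plays the role of the file position, so successive calls append in the same way successive write calls would.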
AFAICS read and write are the only routines that are 'overloaded' in Pascal, and that overloading is only apparent, since the compiler expands them to non-overloaded calls.
Therefore I would try to make any such string parser logically equivalent to the use of readx*(f, var), maybe with sreadx*(s, var). There is no need for seof, and seoln(s) can detect the string end.
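To make the idea concrete, here is a sketch in C of a reader that takes a string and an index in place of a file. The names sread_long and seoln, the success/failure convention, and the use of strtol underneath are assumptions for illustration only. The point it tries to show is that the index is left on the first unconsumed character, much as f^ is after a read.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical analogue of sreadx(s, index, var) for integers.
   Precondition: *index is no greater than the length of s.
   Skips white space, then reads a decimal integer starting at s[*index].
   On return *index is the position of the first unconsumed character,
   so the caller can inspect s[*index] to see why reading stopped.
   Returns true on success, false if no number was found or it overflowed. */
bool sread_long(const char *s, size_t *index, long *out)
{
    const char *start;
    char *end;
    long value;

    assert(s != NULL && index != NULL && out != NULL);

    while (isspace((unsigned char)s[*index]))
        (*index)++;

    start = s + *index;
    errno = 0;
    value = strtol(start, &end, 10);
    if (end == start)                 /* nothing numeric here; index stays */
        return false;

    *index = (size_t)(end - s);       /* step past the digits consumed */
    if (errno == ERANGE)              /* digits consumed, but out of range */
        return false;

    *out = value;
    return true;
}

/* Hypothetical seoln: true at a newline or at the end of the string. */
bool seoln(const char *s, size_t index)
{
    return s[index] == '\n' || s[index] == '\0';
}
```

With s = "12 34x", two calls yield 12 and 34, and a third fails with the index resting on the 'x'.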
If anything, then otherwise. Though it may seem unfamiliar to you, a string can store several lines of text. Even standard Pascal doesn't prohibit an "end of line" character value. (It does require reading a char from a `Text' file to yield space at end-of-line, but not so for `file of Char'.)
True. Then we need a seof routine.
It may be useful to include a var parameter for the current string index, by which time the function looks much like your proposal above,
Not really. My intention (with the Boolean return value) is, e.g., to be able to use the function in conditionals, and not having to declare an extra variable, when (as most of the time) I just want to convert a string to a number and reject anything invalid. (Corrective action can be nice sometimes, but since it's necessarily heuristic, I often find it better to just reject invalid input and let the user correct things.)
I include 'let the user' in corrective actions.
In PascalP I used the rule that extensions should almost always be in the form of standard functions or procedures, so that the only porting effort to purely standard Pascal was the writing of such functions. Thus there were no such things as unsigned integers, instead there was the "uadd(a, b : integer) : integer;" function, which ignored overflow and implemented the usual modulus rules.
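For illustration, the behaviour described for uadd might look like this when rendered in C. Only the Pascal signature appears in the post; the body below is a sketch, and it assumes a two's complement machine where int and unsigned have the same width.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Sketch of uadd(a, b : integer) : integer as described in the post:
   add the operands as unsigned quantities, ignore overflow, and hand the
   bit pattern back as an ordinary integer.
   Assumes two's complement and sizeof(int) == sizeof(unsigned). */
int uadd(int a, int b)
{
    unsigned sum = (unsigned)a + (unsigned)b;  /* wraps modulo UINT_MAX + 1 */
    int result;

    memcpy(&result, &sum, sizeof result);      /* reinterpret, do not convert */
    return result;
}

/* A few checks showing the wrap-around behaviour under those assumptions. */
void uadd_demo(void)
{
    assert(uadd(2, 3) == 5);
    assert(uadd(-1, 1) == 0);              /* all-ones pattern plus 1 wraps to 0 */
    assert(uadd(INT_MAX, 1) == INT_MIN);   /* no trap, just the next bit pattern */
}
```

Because the extension is an ordinary function, a program using it ports to a compiler without the extension by supplying this one routine.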
I see your point about porting, but apart from that, this idea seems quite uncomfortable to me. When you want to have an "unsigned integer" this way, if I get you right, you'd declare it as a regular `Integer', but you'd have to remember to always use `uadd', `uread', `uwrite' (or whatever you call them, i.e. special I/O routines that handle unsigned integers) and similar for all the other operations.
IMHO, this doesn't look very nice (not being able to use operators for `+', `-' etc., or `ReadLn'/`WriteLn' with a sequence of values), and it's quite dangerous because the compiler can't tell you if you ever forget to use the special routines. In fact it's what assembler programmers have to do (there are no signed and unsigned types in assembler, just signed and unsigned operations).
Depends on the machine. Some have unsigned and signed operators, with overflow in signed operators triggering traps.
There's no question that any extension *can* be written in a standard declaration scheme, but is it always a good idea? Isn't this just the C way where any "compiler magic" is frowned upon, and even the dubious varargs declarations were introduced, just so `printf' etc. could be declared in a standard way? IMHO, Pascal (or perhaps any higher-level language) is also about writing things in a more comfortable syntax and to let the compiler/interpreter do many nasty little things for you (such as here remembering whether your variable is signed or unsigned, once declared).
I second your attitude to varargs. However I have usually found that unsigned operations are relatively rare, given proper subrange typing. Depending on how the typing actually works, the definition "TYPE unsigned = integer;" should avoid misuse. Making unsigned operations stand out avoids many silly errors. To me, the compiler magic to which you object is using the same operators as for integers.