CBFalconer wrote:
Mirsad Todorovac wrote:
... snip ...
However, SPARSE SETs seem to me like an excellent challenge for a programmer, but they *do* introduce new issues, for example memory allocation problems, since their size cannot be computed at compile time but depends on the actual number of set members; and so on and so on ...
Frank, what do you think of these thoughts of mine? Anybody?
I am a crusty old conservative when it comes to Pascal. I don't think it should be extended to do everything. I don't even believe in sets of over 256 items. The code generation should be kept simple and accurate. If you want a sparse set you can build it yourself, probably out of linked lists of subsets. If you really want large sets an array of subsets is quite adequate.
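The "linked lists of subsets" idea above can be sketched in C. This is a minimal illustration, not any particular library: each node covers one aligned block of 256 members, and the names (subset, sparse_add, sparse_has) are made up for the example.

```c
#include <stdlib.h>

#define BLOCKBITS 256
#define WORDS (BLOCKBITS / 32)

typedef struct subset {
   unsigned base;             /* first member covered: a multiple of 256 */
   unsigned bits[WORDS];      /* one bit per member in base..base+255 */
   struct subset *next;       /* list kept sorted by base */
} subset;

/* Locate the node covering member; optionally create it. */
static subset *find(subset **sp, unsigned member, int create)
{
   unsigned base = member & ~(BLOCKBITS - 1u);

   while (*sp && (*sp)->base < base)
      sp = &(*sp)->next;
   if (*sp && (*sp)->base == base)
      return *sp;
   if (!create)
      return NULL;
   subset *n = calloc(1, sizeof *n);
   n->base = base;
   n->next = *sp;
   *sp = n;
   return n;
}

void sparse_add(subset **set, unsigned member)
{
   subset *s = find(set, member, 1);
   unsigned off = member - s->base;

   s->bits[off / 32] |= 1u << (off % 32);
}

int sparse_has(subset *set, unsigned member)
{
   subset *s = find(&set, member, 0);
   unsigned off;

   if (!s) return 0;
   off = member - s->base;
   return (s->bits[off / 32] >> (off % 32)) & 1u;
}
```

Only blocks that actually contain members are allocated, so a set holding members 5 and 100000 costs two nodes rather than a 100000-bit array.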
These are NOT items you use every day. However, SET OF '0'..'9' is.
I consider the lack of comprehensive range checking to be a very serious failing. This makes a mockery of taut typing with subranges. With such there is no need to check indices, instead you can rely on the programmer to specify a suitable type for them.
With proper subranging you can keep the cost of run-time checks way down. Studies have shown it to be on the order of 5%.
Expanding on that: as long as chars and booleans are represented by integers, all that is required is a single routine (or in-line instruction sequence), called or emitted just before an assignment or indexing operation. The value to be checked is already on the stack, and it would look like:
   push min
   push max
   call chk
which would normally simply absorb min and max, and let the code continue to the assignment, indexing, function call, etc. For failure, it aborts with appropriate messages. In C terms it would look like:
   inline int chk(int value, int min, int max)
   {
      if ((value < min) || (value > max))
         abort(value, min, max);   /* runtime error routine, not C's abort() */
      return value;
   }
which is adequate because Pascal has no concept of unsigned, outside of subranges of integer. As far as possible everything is passed in registers, but that is only an implementation detail.
For booleans (common) I created an additional routine, chkb, which implied the 0 and 1 limits and avoided the overhead of pushing parameters. I also created chkp, which only checked for a non-NIL pointer and a value known to lie somewhere within the system heap. chkb and chkp need no parameters, but must normally leave the top-of-stack value unaltered. Of course proper use of chkp means the compiler has to distinguish between pointers and VAR references, which is the reason such things are not feasible in C.
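In the same spirit, chkb and chkp might be sketched like this. The heap_lo/heap_hi bounds stand in for whatever limits the runtime records about the heap; they, and the check_fault reporter, are assumptions for the sake of the sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Heap bounds, assumed to be set by the allocator at startup. */
char *heap_lo, *heap_hi;

static void check_fault(const char *what)
{
   fprintf(stderr, "%s check failed\n", what);
   exit(EXIT_FAILURE);
}

/* Boolean check: the 0 and 1 limits are implied, no parameters pushed. */
static inline int chkb(int value)
{
   if (value != 0 && value != 1)
      check_fault("boolean");
   return value;
}

/* Pointer check: non-NIL and somewhere within the system heap. */
static inline void *chkp(void *p)
{
   char *c = p;

   if (!c || c < heap_lo || c >= heap_hi)
      check_fault("pointer");
   return p;
}
```

Both leave the checked value unaltered, matching the requirement that the top-of-stack value pass through the check untouched.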
The hard part is deciding when *NOT* to emit it. One easy case is when run-time checks have been disabled. The basic idea is to keep track of the possible range of expressions (including unknown). For example:
   TYPE
      ix       = 1..maxix;
      whatever = ....

   VAR
      a : ARRAY[ix] OF whatever;
      x : ix;
      y : whatever;
      i : integer;
   ....
      y := a[x];
needs no checking, because x is of type ix, already checked, and a[x] is of type whatever, again already checked. y := a[3] can be checked (for the index) at compile time, but y := a[i] needs a check call. The last could be improved by keeping track of what i can hold, but that would rapidly degenerate into excessive complication. Nothing much more than the immediate sequence i := x; y := a[i]; could reasonably be handled (for check suppression).
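The three cases above can be made concrete in C terms, reusing the chk routine from earlier in the post (redeclared here so the fragment stands alone; MAXIX and the function names are illustrative):

```c
#include <stdlib.h>

#define MAXIX 10

static inline int chk(int value, int min, int max)
{
   if (value < min || value > max)
      exit(EXIT_FAILURE);      /* stand-in for the abort-with-message */
   return value;
}

int a[MAXIX + 1];              /* index type ix = 1..MAXIX */

/* x is already of the index type, so no check is emitted. */
int fetch_typed(int x)
{
   return a[x];
}

/* The constant 3 is verified at compile time; no run-time check. */
int fetch_constant(void)
{
   return a[3];
}

/* i is a general integer, so the compiler wraps the index in chk. */
int fetch_general(int i)
{
   return a[chk(i, 1, MAXIX)];
}
```

Only the last form pays the run-time cost, which is how proper subranging keeps the overall overhead down.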
At least as I see it.