On Thu, 15 Nov 2001, CBFalconer wrote:
Expanding: all that is required is a single routine (or inline instruction sequence), as long as chars and booleans are represented by integers, which is called or emitted just before an assignment or indexing operation. The value to be checked is already on the stack, and it would look like:
   push min
   push max
   call chk
or, inlined, definitely:
   cmp min
   branch_less :range_error
   cmp max
   branch_greater :range_error
But when it has to be emitted after every add or subtract operation, it can produce significant overhead, even if it is only 4 instructions.
You can go even further and reduce this to:
   ld r0, x
   ld r1, y
   add r1, r0
   jump_on_overflow :range_error
(of course, instructions do not match an actual assembler).
The hard part is deciding when *NOT* to emit it. One case is when run-time checks have been disabled, which is easy. The basic idea is to keep track of the possible range of expressions (including unknown). For example:
   TYPE
     ix = 1..maxix;
     whatever = ....
   VAR
     a : ARRAY[ix] OF whatever;
     x : ix;
     y : whatever;
     i : integer;
   ....
   y := a[x];
needs no checking, because x is of type ix (already checked) and a[x] is of type whatever (again already checked). y := a[3] can be index-checked at compile time, but y := a[i] needs a check call. The last could be refined by keeping track of what i can hold, but that would rapidly degenerate into excessive complication. Nothing much beyond the immediate sequence i := x; y := a[i]; could reasonably be handled (for check suppression).
One could imagine marking values as 'tainted' whenever they are used with less precision than the minimum needed to preserve the correct result of a combination of operations. Values are checked at the beginning, and a result is "poisoned" until it is checked. If two bytes are added and the result is put into a 16-bit word x, x is marked "clean" for adding into a 16-bit or larger register, since no range error could possibly occur. But if you want to put the result into an x..y subrange, it is considered 'tainted' and must be range-checked.
However, if a "MedInt" is assigned to a "Word", the value is 'tainted' until checked.
Then an optimizer could defer 'untainting' data until it's optimal to do it.
For example, every integer counter increment would have to be checked for overflow; this alone would have prevented the Ariane rocket blast :-) ...
At least as I see it.
There are many good points there. But with range checking there is much room for optimization. For example, in some FOR loops the compiler could determine the possible range of a variable.
For expensive loops whose upper and lower bounds are not constant but variable, this could be done (at the beginning of the expensive loop):
1. The compiler determines the minimum bounds of the arrays used inside the loop.
2. The compiler emits two versions of the loop: one with range checking, one without.
3. At run time, a check finds the maximum bounds of the loop variable:
   - if they are safe for the minimum bounds found in 1, the loop without bounds checking is called (fast);
   - if it can be determined algorithmically that a range error will inevitably occur in the loop, a runtime exception is raised;
   - if neither is true (for example, there is a conditional exit from the loop), the version of the loop compiled with range checking is performed (slow).
But this may be too much for the compiler to decide; perhaps we'll still have to put speed-sensitive parts of the code between {$R-} and {$R+} ...
mirsad
-- This message has been made up using recycled ideas and language constructs. No plant or animal has been injured in process of making this message.