I've followed this thread so far with interest, sometimes with amazement ... I will reply to some fragments of several mails below, but before, some general comments.
I think some of you (especially Chuck ;-) are really seeing the issue as more difficult than it really is. The difficulties you see might stem from two causes:
- You tend to see things at too low a level (we've had this discussion before). When you're already on the assembler level and dealing with offsets etc., it might be hard, indeed. But when you stay at the Pascal level, many things look (and are ;-) much easier. This also holds for the actual implementation, BTW. We don't have to hand-code assembler code for the checks. We already have a backend which can produce and optimize code for all kinds of expressions and statements (represented in an internal form which is close enough to Pascal code, so for the purpose of this discussion we can assume we actually insert Pascal code for the runtime checks).
- Mixing up of different kinds of runtime checks (that applies to many of you). It's clear that GPC currently does very little runtime checks, and it should be improved. But, it doesn't help to mix them all together -- one tends to lose overview, and they'll have to be implemented separately, anyway.
In particular I can imagine the following kinds of checks (and, BTW, it's clear that all runtime checks will be disableable (is this a word? ;-) by compiler options -- there's not really a need to repeat this in every such discussion):
- Range checks -- I use this term in the following strict sense because I'm used to it from other environments; if someone has a better suggestion, let me know: A value of an ordinal type is used in a situation where only a subrange of this type is acceptable (e.g. assignments, parameters, array indexing, ...)
- Overflow checks: The result of some operation does not fit in the target type (e.g., adding two large integers). That's different from range checking, since here the result is not even a valid value of the type. Whereas range checks can be done during the assignment etc., overflow checking must be done in connection with the operation itself, before it yields a wrong value. (So, e.g., when you `Write' the result, overflow checking might still need to be done, whereas no range checking is ever necessary there.)
- Variant checks: Also quite different from range checks. This is about whether some field can be accessed at all (at runtime), independent of what's done with the field (reading/writing/passing).
- nil/invalid pointer checks
- undefined value checks
- and probably much more ...
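To make the distinction between the first two kinds concrete, here's a small sketch in C (illustrative only -- GPC's backend of course emits such checks inline from its internal representation, not via helper functions, and the error routines here just abort):

```c
#include <stdlib.h>

/* Illustrative sketch, not GPC's actual emitted code.  A range check
   validates an already-valid value against a subrange at the point of
   use; an overflow check must guard the operation itself, before a
   wrong result ever exists. */

/* Range check: 'v' is a valid integer; is it within lo..hi? */
static int range_checked(int v, int lo, int hi)
{
    if (v < lo || v > hi)
        abort();            /* runtime range error */
    return v;
}

/* Overflow check: the sum of two valid ints may not be representable
   at all, so the check is tied to the '+' itself (using the GCC/Clang
   checked-arithmetic builtin). */
static int add_checked(int a, int b)
{
    int sum;
    if (__builtin_add_overflow(a, b, &sum))
        abort();            /* runtime overflow error */
    return sum;
}
```

Note that `add_checked' must run even if the sum is never assigned to a subrange variable, whereas `range_checked' only ever runs at an assignment, parameter passing, indexing, etc.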
J. David Bryan wrote:
GPC behaves correctly now, but it would perhaps be more useful if variant assignments were checked at run time as part of the general implementation of range checking (which is on the "to do" list).
As just explained, variant checking is not part of general range checking and it was not on the list AFAICS. I'm putting it there now.
It is even impossible if the variant selector is not stored....
Well, yes, but I don't see why that would prevent implementation in the case where it was stored. In the latter case, it should be possible at run time to verify before access that the selector is set to a proper value.
If the selector contains no tag-field and is not a discriminant-identifier, it is still possible to store the selector (the standard doesn't say what's stored or where). When allocating with `New' and a selector given, it can be stored. When a variant is assigned to, the corresponding selector value can be stored. When reading from a variant, it can be checked against the stored selector (if none was stored, it's an error, anyway). Of course, that's a little more difficult than in the case where there's an explicit tag-field or a discriminant, but it should be possible.
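A sketch of this hidden-selector idea in C (the field names and the layout with an extra selector word are my illustration, not GPC's actual representation):

```c
#include <stdlib.h>

/* Illustrative sketch of a compiler-maintained hidden selector for a
   variant record whose source declares no tag field.  The names and
   the 'VARIANT_UNDEFINED' convention are invented for this example. */

enum { VARIANT_UNDEFINED = 0, VARIANT_C = 1, VARIANT_D = 2 };

typedef struct {
    int hidden_selector;    /* stored even though the source has no tag */
    union { int c; int d; } u;
} VariantRec;

/* Writing to a variant field records which variant is now active. */
static void variant_write_c(VariantRec *r, int value)
{
    r->hidden_selector = VARIANT_C;
    r->u.c = value;
}

/* Reading checks the stored selector; if nothing was ever stored, the
   selector is still VARIANT_UNDEFINED, so that error is caught too. */
static int variant_read_c(const VariantRec *r)
{
    if (r->hidden_selector != VARIANT_C)
        abort();            /* runtime variant error */
    return r->u.c;
}
```

`New' with a selector given would simply initialize `hidden_selector' to the corresponding value instead of VARIANT_UNDEFINED.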
However, the issue also borders on (general) undefined value checks, needed to catch errors such as the following, which a simple variant check would miss:
program Foo;

var
  a: record
    case b: Integer of
      1: (c: Integer);
      2: (d: Integer)
  end;

begin
  a.b := 1;
  a.c := 2;
  a.b := 2;
  a.b := 1;
  WriteLn (a.c)  { WRONG }
end.
A way to go might be to define a magic value for "undefined" for each type and store this value into each variant field when a variant becomes active. Of course, this really is quite some overhead ...
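The magic-value idea, sketched in C (the choice of INT_MIN as the "undefined" value and the explicit `set_tag' helper are purely illustrative assumptions):

```c
#include <limits.h>
#include <stdlib.h>

/* Illustrative sketch of the "magic undefined value" idea: whenever a
   variant becomes active, its fields are set to a per-type magic
   value, and every read checks for it.  INT_MIN as the magic value is
   an invented convention for this example, nothing GPC defines. */

#define UNDEFINED_INT INT_MIN

typedef struct {
    int b;                  /* explicit tag field */
    union { int c; int d; } u;
} Rec;

/* Assigning the tag activates a variant and poisons its fields
   (simplified here: the union shares storage, so one store covers
   both int-sized fields). */
static void set_tag(Rec *r, int tag)
{
    r->b = tag;
    r->u.c = UNDEFINED_INT;
}

static int rec_read_c(const Rec *r)
{
    if (r->b != 1)
        abort();            /* variant error */
    if (r->u.c == UNDEFINED_INT)
        abort();            /* undefined value error: catches the Foo bug */
    return r->u.c;
}
```

In the `Foo' example above, the second `a.b := 1' would poison `a.c', so the final `WriteLn (a.c)' would be caught even though the variant check alone passes.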
Prof. A Olowofoyeku (The African Chief) wrote:
I think I understand the point being made. If you write something like "r.a := b;", how does the compiler perform the range check (either at compile time or at runtime)? The value of "b" could be anything and might have been set at any point in the code, in any number of ways (which might not be obvious at compile time, because it might depend on user input). And, presumably, "b" might be a 16-bit integer, and so the value of "b" could be "19000" or whatever. Or am I missing something?
Others have replied to that already, but I really don't see why you see a problem there. Of course, the actual value might come from anywhere -- that's why it's a runtime check. At the time of assignment, the actual value is known (obviously) and can therefore be checked. I also don't see what difference a 16-bit integer would make. It has a value just as well, and this value can be within or outside the range of the target, whether the target is 8, 16, 32 or however many bits wide. (In some cases it's known at compile time that the value can't be out of range, but this only simplifies things.)
J. David Bryan wrote:
To implement a check for this requirement, the compiler might modify the above range-checking pseudo-code to:
if rcd.a = 1 then
  if (something >= -5) and (something <= 512) then
    rcd.twobyte := something
  else
    range_error
else
  variant_error
Exactly. I'd prefer to rearrange this a little, taking into account that the error procedures don't return:
if rcd.a <> 1 then
  variant_error;
if (something < -5) or (something > 512) then
  range_error;
rcd.twobyte := something;
This makes it quite clear that both checks (range and variant) are completely independent, and can be individually turned on and off (even the order in which they're done doesn't matter).
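The same structure in C, with each check behind its own switch (the macro names are invented for this sketch; GPC's real compiler options work at a different level, of course):

```c
#include <stdlib.h>

/* Illustrative sketch of the rearranged pseudo-code: variant check and
   range check are independent statements, each behind its own
   compile-time switch, and their order doesn't matter. */

#ifndef VARIANT_CHECKS
#define VARIANT_CHECKS 1
#endif
#ifndef RANGE_CHECKS
#define RANGE_CHECKS 1
#endif

typedef struct {
    int a;                  /* variant selector */
    union { short twobyte; } u;
} Rcd;

static void assign_twobyte(Rcd *rcd, int something)
{
#if VARIANT_CHECKS
    if (rcd->a != 1)
        abort();            /* variant_error */
#endif
#if RANGE_CHECKS
    if (something < -5 || something > 512)
        abort();            /* range_error */
#endif
    rcd->u.twobyte = (short)something;
}
```

Compiling with `-DVARIANT_CHECKS=0' or `-DRANGE_CHECKS=0' drops either check without affecting the other, which is exactly the independence argued for above.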
Of course, there are cases where some checks can be omitted by compile-time knowledge. Some cases are easy to detect, some not so easy (you get the full scale from a constant comparison to the halting problem ...). Doing the optimal thing in this regard (if there is an optimal thing) is very hard. As far as I'm concerned, when I do it, I'll only do the simplest optimizations (basically, constant comparisons), and perhaps the backend will catch a few more. Anything else is IMHO not worth the effort since we're only talking about an optional debugging aid. For the occasional tight loop where speed really matters, the programmer might want to choose types etc. in order to help the compiler avoid unnecessary checks, rather than hoping for some complicated optimizations.
CBFalconer wrote:
This logic can probably be simplified with unrestricted set size. But consider:
r = RECORD
      CASE i : integer OF
        0..1000, 10000    : (c : char);
        -9999, 3000..4000 : (j : integer);
    END; (* r *)
Note the negative value!!
Here too, others have replied, but I wonder what's so special about the negative value. Comparing the selector with -9999 is just as easy as comparing it with 1.
I remain unconvinced that ranges in variant selectors are a GOOD THING.
One example that comes to mind might be a parse tree for arithmetic expressions. The variant selector could be an enum type that represents the operator. Since all binary operators have the same fields, it would be natural to put them in one variant, and a range (provided the binary operators are in sequence) is just a little easier to maintain than a list of values.
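Sketched in C (the operator names and node layout are invented for this example):

```c
/* Illustrative sketch of the parse-tree example: the binary operators
   are consecutive in the enum, so the variant check for the shared
   'binary' fields is a single range comparison instead of a list. */

typedef enum { opConst, opNeg, opAdd, opSub, opMul, opDiv } Op;

typedef struct Node {
    Op op;
    union {
        int value;                                  /* opConst      */
        struct Node *operand;                       /* opNeg        */
        struct { struct Node *left, *right; } bin;  /* opAdd..opDiv */
    } u;
} Node;

/* One comparison pair covers the whole range of binary operators;
   adding opMod between opMul and opDiv would need no change here. */
static int is_binary(Op op)
{
    return op >= opAdd && op <= opDiv;
}
```

The maintenance point is the comment in `is_binary': with an explicit value list, every new binary operator would have to be added to every such check.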
J. David Bryan wrote:
But consider a "regular" (executable) CASE statement appearing in the body of a program. To decide whether a specific contained statement is to be executed, the same sequence of possibly disjoint checks must be made. The only difference between this and the variant access situation is the action taken after the check: a transfer of execution or an allowed field access. So the checking code already exists in GPC; it "simply" :-) needs to be emitted in association with a variant field access.
Not exactly. In a `case' statement, the compiler produces code for all variants at once (possibly using jump tables etc.), whereas for a variant access we need to check only one case. In fact, the code to be used here is quite similar to that emitted for an `in' expression whose right operand is a set constructor.
Essentially, yes, although I would imagine that the loop would be unrolled to perform sequential checks to improve speed.
Indeed (or rather, a series of checks would be generated right away instead of a table and a loop; that's also easier to do).
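For Chuck's example above, such a series of checks would boil down to something like this (hand-written C for illustration; the real code comes out of the backend, of course):

```c
/* Illustrative sketch: the membership test for the selector list
   '0..1000, 10000, -9999, 3000..4000' lowered into a short chain of
   comparisons, rather than a table and a loop.  Ranges cost two
   comparisons, single values one. */
static int selector_matches(int i)
{
    return (i >= 0 && i <= 1000)
        || i == 10000
        || i == -9999
        || (i >= 3000 && i <= 4000);
}
```

The negative value -9999 really is just one more equality comparison, which is the point made above.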
The GNAT (Ada) compiler actually puts the selector-checking code for a specific variant into a compiler-generated function, so that accesses to any fields contained within that variant may be checked by calling the common routine (i.e., avoiding the replication of checking code at each field reference within the program).
I doubt whether that's too useful. The cases where the function call overhead might be worth it (very complex lists of selectors/ranges) are probably very rare in practice, so it doesn't seem worth optimizing for them.
[Assembler stuff skipped. This goes to you, too -- if you think something can be improved in the assembler code, then please contact the respective backend maintainers. Any improvement made there at the low level would benefit all languages and all situations (e.g., explicit comparisons as well as automatic checks). Here, let's please stay at the high level.]
Frank