The variable-records discussion on this list has revealed the
difficulties of checks on variable records. As a sidebar, I recall that
Niklaus Wirth has replaced variant records later (in Oberon) with the
much more powerful notion of type extensions. Type extensions can
easily be checked, both compile-time and run-time.
Adriaan van Os
-------------------
[taken from a document by Niklaus Wirth, entitled "From Modula to
Oberon"]
Type extension
The most important addition is the facility of extended record types.
It permits the construction of new types on the basis of existing
types, and establishes a certain degree of compatibility between the
new and old types. Assuming a given type
T = RECORD x, y: INTEGER END
extensions may be defined which contain certain fields in addition to
the existing ones. For example
TO = RECORD (T) z: REAL END
Ti = RECORD (T) w: LONGREAL END
define types with fields x, y, z and x, y, w respectively. We define a
type declared by
T' = RECORD (T) <field definitions> END
to be a (direct) extension of T, and conversely T to be the (direct)
base type of T'. Extended types may be extended again, giving rise to
the following definitions:
A type T' is an extension of T, if T'= T orT' is a direct extension of
an extension of T. Conversely, T is a base type of T', if T = T' or T
is the direct base type of a base type of T'. We denote this
relationship by T' -> T.
The rule of assignment compatibility states that values of an extended
type are assignable to variables of their base types. For example, a
record of type TO can be assigned to a variable of the base type T.
This assignment involves the fields x and y only, and in fact
constitutes a projection of the value onto the space spanned by the
base type.
It is important to allow modules which import a base type to be able to
declare extended types. In fact, this is probably the normal usage.
This concept of extensible data type gains importance when extended to
pointers. It is appropriate to say that a pointer type P' bound to T'
extends a pointer type P, if P is bound to a base type T of T', and to
extend the assignment rule to cover this case. It is now possible to
form data structures whose nodes are of different types, i.e.
inhomogeneous data structures. The inhomogeneity is automatically (and
most sensibly) bounded by the fact that the nodes are linked by
pointers of a common base type.
Typically, the pointer fields establishing the structure are contained
in the base type T, and the procedures manipulating the structure are
defined in the same (base) module as T. Individual extensions
(variants) are defined in client modules together with procedures
operating on nodes of the extended type. This scheme is in full
accordance with the notion of system extensibility: new modules
defining new extensions may be added to a system without requiring a
change of the base modules, not even their recompilation.
As access to an individual node via a pointer bound to a base type
provides a projected view of the node data only, a facility to widen
the view is necessary. It depends on the ability to determine the
actual type of the referenced node. This is achieved by a type test, a
Boolean expression of the form
t IS T' (or p IS P')
If the test is affirmative, an assignment t':=t(t'of typeT') or p':=
p(p'of type P') should be possible. The static view of types, however,
prohibits this. Note that both assignments violate the rule of
assignment compatibility. The desired assignment is made possible by
providing a type guard of the form
t' := t(T') (p' := p(P'))
and by the same token access to the field z of a TO (see previous
examples) is made possible by a type guard in the designator t (TO).z.
Here the guard asserts that t is (currently) of type TO. In analogy to
array bound checks and case selectors, a failing guard leads to program
abortion.
Whereas a guard of the form t(T) asserts that t is of type T for the
designator (starting with) t only, a regional type guard maintains the
assertion over an entire sequence of statements. It has the form
WITH t: T DO StatementSequence END
and specifies that t is to be regarded as of type T within the entire
statement sequence. Typically, T is an extension of the declared type
of t. Note that assignments to t within the region therefore require
the assigned value to be (an extension) of type T. The regional guard
serves to reduce the number of guard evaluations.
As an example of the use of type tests and guards, consider the
following types Node and Object defined in a module M:
TYPE Node = POINTER TO Object;
Object = RECORD key, x, y: INTEGER; left, right: Node END
Elements in a tree structure anchored in a variable called root (of
type Node) are searched by the procedure element defined in M.
PROCEDURE element(k: INTEGER): Node;
VAR p: Node;
BEGIN p:= root;
WHILE (p # NIL) & (p.key # k) DO
IF p.key < k THEN p:= p.left ELSE p:= p.right END
END;
RETURN p
END element
Let extensions of the type Object be defined (together with their
pointer types) in a module M 1 which is a client of M:
TYPE
Rectangle = POINTER TO RectObject;
RectObject = RECORD (Object) w, h: REAL END;
Circle = POINTER TO CircleObject;
CircleObject = RECORD (Object) rad: REAL; shaded: BOOLEAN END
After the search of an element, the type test is used to discriminate
between the different extensions, and the type guard to access
extension fields. For example:
p:= M.element(K);
IF p # NIL THEN
IF p IS Rectangle THEN ... p(Rectangle).w ...
ELSIF (p IS Circle) & ~p(Circle) shaded THEN ... p(Gircle).rad ...
ELSIF ...
The extensibility of a system rests upon the premise that new modules
defining new extensions may be added without requiring adaptations nor
even recompilation of the existing parts, although components of the
new types are included in already existing data structures.
The type extension facility not only replaces Modula's variant records,
but represents a typesafe alternative. Equally important is its effect
of relating types in a type hierarchy. We compare, for example, the
Modula types
TO' = RECORD t: T; z: REAL END;
TV = RECORD t: T; w: LONGREAL END
which refer to the definition of T given above, with the extended
Oberon types TO and T1 defined above. First, the Oberon types refrain
from introducing a new naming scope. Given a variable rO of type TO, we
write rO.x instead of rO.t.x as in Modula. Second, the types T, TO',
and T1' are distinct and unrelated. In contrast, TO and T1 are related
to T as extensions. This becomes manifest through the type test, which
asserts that variable rO is not only of type TO, but also of base type
T.
The declaration of extended record types, the type test, and the type
guard are the only additional features introduced in this context. A
more extensive discussion is provided in [2]. The concept is very
similar to the class notion of Simula 67 [3], Smalltalk [4], Object
Pascal [5], C++ [6], and others, where the properties of the base class
are said to be inherited by the derived classes. The class facility
stipulates that all procedures applicable to objects of the class be
defined together with the data definition. This dogma stems from the
notion of abstract data type, but it is a serious obstacle in the
development of large systems, where the possibility to add further
procedures defined in additional modules is highly desirable. It is
awkward to be obliged to redefine a class solely because a method
(procedure) has been added or changed, particularly when this change
requires a recompilation of the class definition and of all its client
modules.
We emphasise that the type extension facility - although gaining its
major role in connection with pointers to build heterogeneous, dynamic
data structures as shown in the example above - also applies to
statically declared objects used as variable parameters. Such objects
are allocated in a workspace organized as a stack of procedure
activation records, and therefore take advantage of an extremely
efficient allocation and deallocation scheme.
In Oberon, procedure types rather than procedures (methods) are
connected with objects in the program text. The binding of actual
methods (specific procedures) to objects (instances) is delayed until
the program is executed. The association of a procedure type with a
data type occurs through the declaration of a record field. This field
is given a procedure type. The association of a method - to use
Smalltalk terminology - with an object occurs through the assignment of
a specific procedure as value to the field, and not through a static
declaration in the extended type's definition which then "overrides"
the declaration given in the base type. Such a procedure is called a
handler. Using type tests, the handler is capable of discriminating
among different extensions of the record's (object's) base type. In
Smalltalk, the compatibility rules between a class and its subclasses
are confined to pointers, thereby intertwining the concepts of access
method and data type in an undesirable way. In Oberon, the relationship
between a type and its extensions is based on the established
mathematical concept of projection.
In Modula, it is possible to declare a pointer type within an
implementation module, and to export it as an opaque type by listing
the same identifier in the corresponding definition module. The net
effect is that the type is exported while all its properties remain
hidden (invisible to clients). In Oberon, this facility is generalized
in the sense that the selection of the record fields to be exported is
arbitrary and includes the cases all and none. The collection of
exported fields defines a partial view - a public projection - to
clients.
In client modules as well as in the module itself, it is possible to
define extensions of the base type (e.g. TextViewers or GraphViewers).
Of importance is also the fact that non-exported components (fields)
may have types that are not exported either. Hence, it is possible to
hide certain data types effectively, although components of (opaquely)
exported types refer to them.
-------------------