Type extension

25 Nov 2002


      The variable-records discussion on this list has revealed the 
difficulties of checks on variable records. As a sidebar, I recall that 
Niklaus Wirth has replaced variant records later (in Oberon) with the 
much more powerful notion of type extensions. Type extensions can 
easily be checked, both compile-time and run-time.
Adriaan van Os
-------------------
[taken from a document by Niklaus Wirth, entitled "From Modula to 
Oberon"]
Type extension
The most important addition is the facility of extended record types. 
It permits the construction of new types on the basis of existing 
types, and establishes a certain degree of compatibility between the 
new and old types. Assuming a given type
T = RECORD x, y: INTEGER END
extensions may be defined which contain certain fields in addition to 
the existing ones. For example
TO = RECORD (T) z: REAL END
Ti = RECORD (T) w: LONGREAL END
define types with fields x, y, z and x, y, w respectively. We define a 
type declared by
T' = RECORD (T) <field definitions> END
to be a (direct) extension of T, and conversely T to be the (direct) 
base type of T'. Extended types may be extended again, giving rise to 
the following definitions:
A type T' is an extension of T, if T'= T orT' is a direct extension of 
an extension of T. Conversely, T is a base type of T', if T = T' or T 
is the direct base type of a base type of T'. We denote this 
relationship by T' -> T.
The rule of assignment compatibility states that values of an extended 
type are assignable to variables of their base types. For example, a 
record of type TO can be assigned to a variable of the base type T. 
This assignment involves the fields x and y only, and in fact 
constitutes a projection of the value onto the space spanned by the 
base type.
It is important to allow modules which import a base type to be able to 
declare extended types. In fact, this is probably the normal usage.
This concept of extensible data type gains importance when extended to 
pointers. It is appropriate to say that a pointer type P' bound to T' 
extends a pointer type P, if P is bound to a base type T of T', and to 
extend the assignment rule to cover this case. It is now possible to 
form data structures whose nodes are of different types, i.e. 
inhomogeneous data structures. The inhomogeneity is automatically (and 
most sensibly) bounded by the fact that the nodes are linked by 
pointers of a common base type.
Typically, the pointer fields establishing the structure are contained 
in the base type T, and the procedures manipulating the structure are 
defined in the same (base) module as T. Individual extensions 
(variants) are defined in client modules together with procedures 
operating on nodes of the extended type. This scheme is in full 
accordance with the notion of system extensibility: new modules 
defining new extensions may be added to a system without requiring a 
change of the base modules, not even their recompilation.
As access to an individual node via a pointer bound to a base type 
provides a projected view of the node data only, a facility to widen 
the view is necessary. It depends on the ability to determine the 
actual type of the referenced node. This is achieved by a type test, a 
Boolean expression of the form
t IS T'	(or p IS P')
If the test is affirmative, an assignment t':=t(t'of typeT') or p':= 
p(p'of type P') should be possible. The static view of types, however, 
prohibits this. Note that both assignments violate the rule of 
assignment compatibility. The desired assignment is made possible by 
providing a type guard of the form
t' := t(T') (p' := p(P'))
and by the same token access to the field z of a TO (see previous 
examples) is made possible by a type guard in the designator t (TO).z. 
Here the guard asserts that t is (currently) of type TO. In analogy to 
array bound checks and case selectors, a failing guard leads to program 
abortion.
Whereas a guard of the form t(T) asserts that t is of type T for the 
designator (starting with) t only, a regional type guard maintains the 
assertion over an entire sequence of statements. It has the form
WITH t: T DO StatementSequence END
and specifies that t is to be regarded as of type T within the entire 
statement sequence. Typically, T is an extension of the declared type 
of t. Note that assignments to t within the region therefore require 
the assigned value to be (an extension) of type T. The regional guard 
serves to reduce the number of guard evaluations.
As an example of the use of type tests and guards, consider the 
following types Node and Object defined in a module M:
TYPE Node = POINTER TO Object;
Object = RECORD key, x, y: INTEGER; left, right: Node END
Elements in a tree structure anchored in a variable called root (of 
type Node) are searched by the procedure element defined in M.
PROCEDURE element(k: INTEGER): Node;
    VAR p: Node;
BEGIN p:= root;
    WHILE (p # NIL) & (p.key # k) DO
    	IF p.key < k THEN p:= p.left ELSE p:= p.right END
    END;
    RETURN p
END element
Let extensions of the type Object be defined (together with their 
pointer types) in a module M 1 which is a client of M:
TYPE
    Rectangle =	POINTER TO RectObject;
          RectObject = 	RECORD (Object) w, h: REAL END;
    Circle =		POINTER TO CircleObject;
    CircleObject =	RECORD (Object) rad: REAL; shaded: BOOLEAN END
After the search of an element, the type test is used to discriminate 
between the different extensions, and the type guard to access 
extension fields. For example:
p:= M.element(K);
IF p # NIL THEN
    IF p IS Rectangle THEN ... p(Rectangle).w ...
    ELSIF (p IS Circle) & ~p(Circle) shaded THEN ... p(Gircle).rad ...
    ELSIF ...
The extensibility of a system rests upon the premise that new modules 
defining new extensions may be added without requiring adaptations nor 
even recompilation of the existing parts, although components of the 
new types are included in already existing data structures.
The type extension facility not only replaces Modula's variant records, 
but represents a typesafe alternative. Equally important is its effect 
of relating types in a type hierarchy. We compare, for example, the 
Modula types
TO' = RECORD t: T; z: REAL END;
TV = RECORD t: T; w: LONGREAL END
which refer to the definition of T given above, with the extended 
Oberon types TO and T1 defined above. First, the Oberon types refrain 
from introducing a new naming scope. Given a variable rO of type TO, we 
write rO.x instead of rO.t.x as in Modula. Second, the types T, TO', 
and T1' are distinct and unrelated. In contrast, TO and T1 are related 
to T as extensions. This becomes manifest through the type test, which 
asserts that variable rO is not only of type TO, but also of base type 
T.
The declaration of extended record types, the type test, and the type 
guard are the only additional features introduced in this context. A 
more extensive discussion is provided in [2]. The concept is very 
similar to the class notion of Simula 67 [3], Smalltalk [4], Object 
Pascal [5], C++ [6], and others, where the properties of the base class 
are said to be inherited by the derived classes. The class facility 
stipulates that all procedures applicable to objects of the class be 
defined together with the data definition. This dogma stems from the 
notion of abstract data type, but it is a serious obstacle in the 
development of large systems, where the possibility to add further 
procedures defined in additional modules is highly desirable. It is 
awkward to be obliged to redefine a class solely because a method 
(procedure) has been added or changed, particularly when this change 
requires a recompilation of the class definition and of all its client 
modules.
We emphasise that the type extension facility - although gaining its 
major role in connection with pointers to build heterogeneous, dynamic 
data structures as shown in the example above - also applies to 
statically declared objects used as variable parameters. Such objects 
are allocated in a workspace organized as a stack of procedure 
activation records, and therefore take advantage of an extremely 
efficient allocation and deallocation scheme.
In Oberon, procedure types rather than procedures (methods) are 
connected with objects in the program text. The binding of actual 
methods (specific procedures) to objects (instances) is delayed until 
the program is executed. The association of a procedure type with a 
data type occurs through the declaration of a record field. This field 
is given a procedure type. The association of a method - to use 
Smalltalk terminology - with an object occurs through the assignment of 
a specific procedure as value to the field, and not through a static 
declaration in the extended type's definition which then "overrides" 
the declaration given in the base type. Such a procedure is called a 
handler. Using type tests, the handler is capable of discriminating 
among different extensions of the record's (object's) base type. In 
Smalltalk, the compatibility rules between a class and its subclasses 
are confined to pointers, thereby intertwining the concepts of access 
method and data type in an undesirable way. In Oberon, the relationship 
between a type and its extensions is based on the established 
mathematical concept of projection.
In Modula, it is possible to declare a pointer type within an 
implementation module, and to export it as an opaque type by listing 
the same identifier in the corresponding definition module. The net 
effect is that the type is exported while all its properties remain 
hidden (invisible to clients). In Oberon, this facility is generalized 
in the sense that the selection of the record fields to be exported is 
arbitrary and includes the cases all and none. The collection of 
exported fields defines a partial view - a public projection - to 
clients.
In client modules as well as in the module itself, it is possible to 
define extensions of the base type (e.g. TextViewers or GraphViewers). 
Of importance is also the fact that non-exported components (fields) 
may have types that are not exported either. Hence, it is possible to 
hide certain data types effectively, although components of (opaquely) 
exported types refer to them.
-------------------

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

1996

1995

Type extension