CBFalconer wrote:
Frank Heckenbach wrote:
I'm considering adding runtime checks for nil pointer access. The benefits are not as big as one might suppose, since most systems catch a nil pointer access anyway and raise a segmentation fault automatically, whereas explicit runtime checks might add considerable runtime overhead. Additionally, some valid (well, let's at least say, working) BP and similar programs might then fail, e.g. if they pass dereferenced pointers (which may be nil) as var parameters, and the called routine either doesn't access the parameter or explicitly checks whether its address is nil.
On the positive side, such a check might provide a somewhat nicer error message (proper runtime error instead of segfault), and it can be caught like other runtime errors (e.g. via `AtExit') rather than with a signal handler as for segfaults. Also, strict standard conformance makes it an error, even in a situation as described above where the var parameter is never used.
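To make the contrast with a segfault concrete, here is a minimal sketch in C (not GPC's actual runtime) of a checked dereference that ends in an ordinary exit, so that a handler registered with `atexit` sees it. The names `check_nil`, `DEREF`, `report_runtime_error` and the error number are assumptions made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error number, chosen for this sketch only. */
enum { NIL_POINTER_ERROR = 1 };

static int pending_runtime_error = 0;

/* Exit handler: runs when exit() is called, so a nil access is reported
   through the same path as any other runtime error. */
static void report_runtime_error(void)
{
    if (pending_runtime_error != 0)
        fprintf(stderr, "runtime error %d: nil pointer access\n",
                pending_runtime_error);
}

/* Returns p unchanged if it is not nil; otherwise records the error and
   terminates via exit(), which invokes the registered atexit handlers. */
static void *check_nil(void *p, const char *file, int line)
{
    assert(file != NULL);   /* the macro below always supplies a file name */
    if (p == NULL) {
        fprintf(stderr, "%s:%d: ", file, line);
        pending_runtime_error = NIL_POINTER_ERROR;
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Checked dereference: DEREF(int, p) can be used as an lvalue like *p. */
#define DEREF(type, p) (*(type *)check_nil((p), __FILE__, __LINE__))
```

A program would call `atexit(report_runtime_error)` once at startup and then write `DEREF(int, p)` wherever it would otherwise write `*p`. No signal handler is involved, at the cost of one comparison per access.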
Since Pascal does not bandy pointers about, pointer checks can be quite accurate. However, they have to be customized to the actual package. For example, with my nmalloc package for DJGPP, agreement between a couple of internal linkages, together with absence of the 'freed' marker, suffices. On an ancient CP/M system the check consisted of ensuring that the pointer lay within the current heap region, or (for thoroughness) following a linked list for membership. None of this applies to another installation. Once you have such a check, a NIL check is trivially added, the only need being to control when it is enabled. If the pointer check is done during pointer variable assignment, it can be omitted during dereference: the only check needed at dereference is for NIL, and the NIL check is not needed during the assignment check.
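The split between an assignment-time check and a dereference-time check can be sketched as follows. This is C rather than Pascal, and the fixed-size arena, `arena_alloc`, `valid_for_assignment` and `valid_for_dereference` are all assumptions for illustration; they do not reproduce nmalloc or the CP/M system mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 1024

/* A toy heap: one statically allocated, suitably aligned region. */
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Bump allocator: hands out successive aligned chunks, never frees. */
static void *arena_alloc(size_t n)
{
    const size_t a = _Alignof(max_align_t);
    size_t rounded = (n + a - 1) / a * a;

    assert(arena_used <= ARENA_SIZE);
    if (rounded < n || rounded > ARENA_SIZE - arena_used)
        return NULL;                      /* overflow or out of space */
    void *p = arena + arena_used;
    arena_used += rounded;
    return p;
}

/* Assignment-time check: nil may be stored; any other value must lie
   inside the part of the heap region handed out so far. */
static int valid_for_assignment(const void *p)
{
    if (p == NULL)
        return 1;
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)arena;
    return addr >= lo && addr - lo < arena_used;
}

/* Dereference-time check: if every assignment was checked, only nil
   remains to be rejected here. */
static int valid_for_dereference(const void *p)
{
    return p != NULL;
}
```

The range test only shows that the address is inside the heap, not that it points at the start of a block that is still allocated.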
It should be possible to turn this on and off by pseudo comments. Without this, any creation of C-style pointers will become impossible.
These might also be interesting points, but not actually what I plan ATM. Yes, we could relatively easily check if a pointer lies within the allocated area. But for some purposes that's too strict. E.g., one might want to operate on memory-mapped files or devices and use pointers in their area as well. And IMHO procedure pointers (which do not exist in standard Pascal) also have their use (callbacks, procedure tables, etc.), and would lie outside the heap area.
AFAICS, doing something as you describe while taking care of all these necessities might be quite tricky. Sure, there is some benefit.
IMHO dangling pointers are the more serious problem, since wild pointers (outside the heap area) can be avoided, e.g. by initializing all pointers to nil (though I don't recommend always doing this) and not doing other dirty stuff, while dangling pointers are more difficult to avoid "statically".
Your linked list suggestion would detect them (simple heap bounds checking would not). But it adds an O(n) factor to each pointer access, which is often not acceptable. I'm not aware of other methods which don't either also add some runtime overhead (though one might get it down to O(ln n) by using a good data structure, perhaps a balanced tree) or produce a memory leak (such as by keeping track of all disposed pointers).
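For illustration, the linked-list membership check could look roughly like this in C. The names `tracked_new`, `tracked_dispose` and `is_live` are hypothetical, and the sketch sits on top of `malloc`/`free` rather than inside any particular heap manager.

```c
#include <assert.h>
#include <stdlib.h>

/* One list node per live allocation. */
struct block {
    struct block *next;
    void *mem;
};

static struct block *live = NULL;

/* Allocate memory and record it in the list of live blocks. */
static void *tracked_new(size_t size)
{
    assert(size > 0);
    struct block *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->mem = malloc(size);
    if (b->mem == NULL) {
        free(b);
        return NULL;
    }
    b->next = live;
    live = b;
    return b->mem;
}

/* O(n) membership test: true only for pointers that tracked_new
   returned and that have not been disposed since. */
static int is_live(const void *p)
{
    for (const struct block *b = live; b != NULL; b = b->next)
        if (b->mem == p)
            return 1;
    return 0;
}

/* Release a block. Returns 1 on success, 0 if p is not a live block
   (which also catches a second dispose of the same pointer). */
static int tracked_dispose(void *p)
{
    for (struct block **link = &live; *link != NULL; link = &(*link)->next) {
        if ((*link)->mem == p) {
            struct block *b = *link;
            *link = b->next;
            free(b->mem);
            free(b);
            return 1;
        }
    }
    return 0;
}
```

Two caveats apply to this sketch. If the allocator later reuses a disposed address for a new block, a stale pointer to that address passes the check again. Also, ISO C formally treats the value of a freed pointer as indeterminate, although comparing it works on common platforms.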
Well, while I'm writing this, it occurs to me that if we ever wanted to do such complex checking, it would probably be done in a subroutine anyway, despite the call overhead, since inlining it for each pointer access could really explode program sizes. Given that, it might as well be a user routine which can be called via a procedure pointer. This way, you and others interested could experiment with various implementations and keep them independent of the compiler core, and environment/application-specific implementations are possible without creating a specially patched compiler.
Such a user-routine check should be easy to do in the compiler, and if there's interest, I can probably do this together with my plans. Of course, there should be a way for a simple inlined nil check without much runtime overhead, so we'd probably need a three-way option (or two options working together -- one to turn it on/off, and one to select the kind of checking; the latter would then usually be set globally, the former could be turned off locally for dirty code or for efficiency reasons).
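A rough sketch of how the two options might work together, written in C for illustration only: one switch turns checking on or off, and a second selects between a plain nil test and a user routine reached through a procedure pointer. All names here (`checking_enabled`, `checking_kind`, `use_user_check`, `pointer_ok`, `example_user_check`) are invented for this sketch and are not GPC options or runtime entry points.

```c
#include <assert.h>
#include <stddef.h>

/* Which kind of check to perform when checking is enabled. */
enum check_kind { CHECK_NIL_INLINE, CHECK_USER_ROUTINE };

/* A user-supplied check: returns nonzero if the pointer is acceptable. */
typedef int (*pointer_check_fn)(const void *p);

static int checking_enabled = 1;                          /* toggled locally   */
static enum check_kind checking_kind = CHECK_NIL_INLINE;  /* usually set once  */
static pointer_check_fn user_check = NULL;

/* Install a user routine and select it as the active kind of check. */
static void use_user_check(pointer_check_fn fn)
{
    assert(fn != NULL);
    user_check = fn;
    checking_kind = CHECK_USER_ROUTINE;
}

/* What compiler-generated code would evaluate before each dereference. */
static int pointer_ok(const void *p)
{
    if (!checking_enabled)
        return 1;                       /* checks switched off */
    if (checking_kind == CHECK_USER_ROUTINE && user_check != NULL)
        return user_check(p);           /* application-specific check */
    return p != NULL;                   /* cheap nil test */
}

/* Example user routine: counts its calls and rejects nil. A real one
   could consult heap bookkeeping or allow memory-mapped regions. */
static int user_calls = 0;

static int example_user_check(const void *p)
{
    user_calls++;
    return p != NULL;
}
```

In a real compiler the nil test would be emitted inline rather than called, so only the user-routine path pays the call overhead.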
Frank