range checking


Peter N Lewis

9 Jul 2005

3:53 a.m.

Ok, I've got my application compiling and linking (all 250+ units).

Now I'm trying to test it out. Currently it is crashing with range check errors. The first is one I'm used to with GPC: passing an SInt32 to a UInt32. These types are often used interchangeably by the system to refer to a generic block of four bytes (e.g. a random number seed), so ensuring they match is not normally that important, but GPC range checks them. I've written SafeCast functions for signed to unsigned and vice versa. I guess I'll eventually catch them all.
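A minimal sketch of what such SafeCast helpers might look like in GPC. The names SafeCastToUInt32 and SafeCastToSInt32 are hypothetical (the post does not show the actual functions), the type declarations are assumed stand-ins for those in the Mac OS interface units, and the sketch assumes that with range checking switched off locally the value cast simply keeps the 32 bits as they are.

```pascal
program SafeCastSketch;

type
  { Assumed declarations; the real ones come from the Mac OS interface units. }
  SInt32 = Integer attribute (Size = 32);
  UInt32 = Cardinal attribute (Size = 32);

{ Hypothetical helper: pass a signed 32-bit value on as unsigned
  without tripping the range check. }
function SafeCastToUInt32 (x: SInt32): UInt32;
begin
  {$local R-}
  SafeCastToUInt32 := UInt32 (x)
  {$endlocal}
end;

{ Hypothetical helper for the opposite direction. }
function SafeCastToSInt32 (x: UInt32): SInt32;
begin
  {$local R-}
  SafeCastToSInt32 := SInt32 (x)
  {$endlocal}
end;

var
  seed: SInt32;

begin
  seed := -1;
  WriteLn (SafeCastToUInt32 (seed));
  WriteLn (SafeCastToSInt32 (SafeCastToUInt32 (seed)))
end.
```

Keeping the unchecked cast inside two small functions leaves range checking enabled everywhere else in the program.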

The next one is more challenging, since I don't see a good workaround.

s: String(255)

@s[1] range checks if Length(s) is 0.

While this might technically be correct (the subscript is out of bounds), the actual address is valid.

This is quite challenging to work around, since I'm often forced to deal with strings as a pointer to chars and a length, and this means that code to say draw a string like:

DrawString( @s[1], Length(s) );

now needs to special case for s = ''.

I would contend that in the case of @s[x], x should be checked against a range of 1..Length(s)+1, because it should be legal to point to the next free character. Whether this is reasonably doable or not, I don't know.

If s were a short string, I could work around it with pointer arithmetic (@s+1), but given that it is a Schema, I don't know how I would safely get the address of the first character of the string in the case where s might be empty.

More precise control over range checking might be helpful. Options to disable range checking for strings, or perhaps to limit it to capacity range checking, as well as to disable range checking on casts, would be nice to get things going (I generally prefer to run with as much protection as possible, but sometimes range checking can be overly enthusiastic).

Peter.

-- http://www.stairways.com/ http://download.stairways.com/


Frank Heckenbach

9 Jul

4:08 a.m.

Peter N Lewis wrote:

> The next one is more challenging, since I don't see a good workaround.
>
> s: String(255)
>
> @s[1] range checks if Length(s) is 0.
>
> While this might technically be correct (the subscript is out of bounds), the actual address is valid.
>
> This is quite challenging to work around, since I'm often forced to deal with strings as a pointer to chars and a length, and this means that code to say draw a string like:
>
> DrawString( @s[1], Length(s) );
>
> now needs to special case for s = ''.

Yes, I've had such cases in my code a few times. I've added the check. Though a bit more to write, it might even be more efficient (providing that for an empty string, it's really a no-op). GPC optimizes the comparison `s = ''' to a simple integer comparison (Length = 0), in contrast to a dummy procedure call.
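The added check can be sketched as below. DrawString here is only a stand-in stub (the thread does not give its real signature), so the parameter types are assumptions made to keep the example self-contained.

```pascal
program EmptyStringGuard;

{ Stand-in for the thread's DrawString; its real signature is not given. }
procedure DrawString (p: Pointer; len: Integer);
begin
  if p <> nil then
    WriteLn ('drawing ', len, ' chars')
end;

var
  s: String (255);

begin
  s := '';
  { The guard skips the call, so @s[1] is never evaluated for an empty string. }
  if s <> '' then
    DrawString (@s[1], Length (s));

  s := 'hello';
  if s <> '' then
    DrawString (@s[1], Length (s))
end.
```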

> I would contend that in the case of @s[x], x should be checked against a range of 1..Length(s)+1, because it should be legal to point to the next free character.

Always depends on what you do with it. If you want to write there, you can usually increase the length before rather than afterwards (I've also had this a few times and basically only needed to exchange two statements). Reading from this location is usually undefined/invalid. The "length 0 / index 1" case is a bit special, it can become valid more easily, but also often avoided more easily, see above ...
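Exchanging the two statements might look like the following sketch for appending one character. The variable names are illustrative, and the capacity test is an extra precaution added here rather than something taken from the thread.

```pascal
program AppendChar;

var
  s: String (255);
  ch: Char;

begin
  s := 'abc';
  ch := 'd';

  { Writing first and growing afterwards indexes Length (s) + 1,
    which a length-based range check rejects:
      s[Length (s) + 1] := ch;
      SetLength (s, Length (s) + 1);  }

  { Growing first keeps the index within 1 .. Length (s). }
  if Length (s) < s.Capacity then
    begin
      SetLength (s, Length (s) + 1);
      s[Length (s)] := ch
    end;
  WriteLn (s)
end.
```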

> Whether this is reasonably doable or not, I don't know.

Not really too easy. (It's a bit hard to tell the compiler that `s[Length (s) + 1]' is invalid, but becomes valid again in a certain context. Basically we'd have to either undo (always ugly) this range-checking, or postpone all range-checking to a later stage, i.e. almost add another pass to the compiler.)

> If s were a short string, I could work around it with pointer arithmetic (@s+1), but given that it is a Schema, I don't know how I would safely get the address of the first character of the string in the case where s might be empty.

Instead of such ugly tricks, you can always locally turn off range-checking if nothing else helps:

{$local R-} ... {$endlocal}
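Applied to the DrawString call from the original post, this could look roughly as follows. DrawString is again a stand-in stub with an assumed signature, since the real routine is not shown in the thread.

```pascal
program LocalRangeOff;

{ Stand-in for the thread's DrawString; its real signature is not given. }
procedure DrawString (p: Pointer; len: Integer);
begin
  if p <> nil then
    WriteLn ('drawing ', len, ' chars')
end;

var
  s: String (255);

begin
  s := '';
  { Only the address of s[1] is taken; with a length of 0 nothing is read. }
  {$local R-}
  DrawString (@s[1], Length (s));
  {$endlocal}
end.
```

Range checking is back in force after {$endlocal}, so only the one call is exempt.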

> More precise control over range checking might be helpful, options to disable range checking for strings, or perhaps limit them to capacity range checking,

Well, that's what `--borland-pascal' does (for compatibility, obviously, types.c:2293), so it wouldn't be hard to make it a separate option (yeah -- it's about time for another one ... naming suggestions welcome ...). BTW, how does Mac Pascal behave?

Frank

-- THANK YOU, EUROPEAN PARLIAMENT, for rejecting one of the most dangerous stupidities: http://swpat.ffii.org/log/05/ep0706/ Frank Heckenbach, frank@g-n-u.de, http://fjf.gnu.de/, 7977168E

Gale Paeper

10 Jul

6:36 a.m.

Peter N Lewis wrote:

> [snip]
>
> Now I'm trying to test it out. Currently it is crashing with range check errors. The first is one I'm used to with GPC: passing an SInt32 to a UInt32. These types are often used interchangeably by the system to refer to a generic block of four bytes (e.g. a random number seed), so ensuring they match is not normally that important, but GPC range checks them. ...

A few observations on Mac Pascal dialect compilers' range checking in conjunction with the Mac OS [X] Pascal interfaces unit MacTypes.p[.pas] declaration of the type pairs UInt16/SInt16 and UInt32/SInt32.

For THINK Pascal and MPW Pascal, there is no built-in support for unsigned whole number types. All whole number types are signed types; the built-in 16 bit 'integer' and 32 bit 'longint' are signed and are the only built-in whole number types. (The built-in 'comp' is a strange beast which lives in both the whole number world as a signed 64-bit whole number and [for operations] in the floating point world as a special case 80/96 bit extended IEEE 754 floating point type.)

For those two compilers, the type pair declarations are:

UInt16 = INTEGER;
SInt16 = INTEGER;
UInt32 = LONGINT;
SInt32 = LONGINT;

Obviously, with those declarations, you're never going to get a range check error with any type pair (i.e., UInt16/SInt16 or UInt32/SInt32) mixed operation. You can freely interchange UInt16 and SInt16, or UInt32 and SInt32, without any range checking error worries.

CodeWarrior/Metrowerks Pascal, on the other hand, does have built-in support for unsigned whole number types as well as signed whole number types. The additional built-in unsigned whole number types are 8 bit 'unsignedbyte', 16 bit 'unsignedword', and 32 bit 'unsignedlong'. There is NO range checking for any operation involving those unsigned types.

For CodeWarrior/Metrowerks Pascal, the type pair declarations are:

UInt16 = UNSIGNEDWORD;
SInt16 = INTEGER;
UInt32 = UNSIGNEDLONG;
SInt32 = LONGINT;

Since there is no range checking on the built-in unsigned types, obviously with those declarations you're never going to get a range check error with any type pair (i.e., UInt16/SInt16 or UInt32/SInt32) mixed operation and you get the same no range checking worries as one gets with THINK Pascal and MPW Pascal.

Thus, when one combines those facts, one sees that for Mac Pascal compiler code as long as one stays within the same type pair bit bucket (i.e., UInt16/SInt16, 16 bit, bit bucket or UInt32/SInt32, 32 bit, bit bucket) one never has to worry about range check errors. (Note: This does not necessarily mean all type pair operations will yield valid results - just that you don't get range check error traps.)

An additional factor with range checking and Mac Pascal compiler code is Apple's Mac OS [X] interfaces. The interfaces started out over two decades ago as primarily Mac Pascal based interfaces but are now C based interfaces. Over the decades and the transition of the primary language base, some 16 bit buckets and 32 bit buckets being passed around through the interfaces have become "type blurred" with signed/unsigned type pair equivalencies (as well as a few other bit bucket pairings with Apple unique extensions, C synonym type equivalencies, and integral type promotions). Although the "type blurring" wasn't completely free of problems with Mac Pascal compilers, it wasn't too bad, because the compiler characteristics and the MacTypes.p[.pas] type pair declarations together effectively allowed for dealing with 16 and 32 bit, bit buckets without range check error worries. (One might need a judicious type coercion to suppress type mismatch errors, but even with that there were no range check worries. Note: I'm using the Mac Pascal term "type coercion" on purpose here, because there are some slight behavioral differences between Mac Pascal's type coercion behavior and GPC's type cast behavior, and those differences come into play with bit bucket type manipulation.)

Before range checking support was added to GPC, obviously there were no range checking problems when porting Mac Pascal code to GPC.

Now that GPC supports range checking, there is a boatload of new, false positive, range check problems to contend with in Mac Pascal code ported to GPC.

I'm posting this information on Mac Pascal compiler range checking behavior on 16 and 32 bit, bit buckets in combination with Mac OS [X] interfaces "oddities" in the hope of giving those who haven't worked with the Mac Pascal compilers and Mac OS [X] interfaces combination a little better insight into the problems Peter, Adriaan, I, or any other Mac Pascal code porter are wrestling with.

I think it is worth mentioning that these differences in range checking on 16 and 32 bit, bit buckets also have some bearing on the outstanding univ parameter issue, for which Peter posted a good summary of usage cases. When one is dealing with OS interfaces that have built-in assumptions that the type pairs UInt16/SInt16 and UInt32/SInt32 are generic, respectively interchangeable 16 and 32 bit, bit buckets, range checked type casts aren't really a viable workaround.

Frank Heckenbach wrote:

> Peter N Lewis wrote:
>
> > The next one is more challenging, since I don't see a good workaround.
> >
> > s: String(255)
> >
> > @s[1] range checks if Length(s) is 0.
> >
> > [snip]
> >
> > If s were a short string, I could work around it with pointer arithmetic (@s+1), but given that it is a Schema, I don't know how I would safely get the address of the first character of the string in the case where s might be empty.
>
> Instead of such ugly tricks, you can always locally turn off range-checking if nothing else helps:
>
> {$local R-} ... {$endlocal}
>
> > More precise control over range checking might be helpful, options to disable range checking for strings, or perhaps limit them to capacity range checking,
>
> Well, that's what `--borland-pascal' does (for compatibility, obviously, types.c:2293), so it wouldn't be hard to make it a separate option (yeah -- it's about time for another one ... naming suggestions welcome ...). BTW, how does Mac Pascal behave?

For string range checking, it depends upon which Mac Pascal dialect compiler you're using. If the compiler is THINK Pascal, string range checking uses the length of the string to determine whether the access is within a valid range. If the compiler is CodeWarrior/Metrowerks Pascal, range checking uses capacity of the string to determine whether the access is within a valid range.

If one is concerned the code is performing operations where the results are always valid within the total Pascal language context, then one wants THINK Pascal/GPC length based range checking.

If one is merely concerned the code stays within the bounds of allocated memory, then one probably wants something similar to CodeWarrior/Metrowerks Pascal capacity based checking.
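The difference between the two policies can be sketched with explicit tests in GPC. The function names are hypothetical and only model the two rules described above; they are not how any of the compilers implement their checks.

```pascal
program LengthVsCapacity;

var
  s: String (255);

{ Length-based rule, as described for THINK Pascal and GPC. }
function InLengthRange (i: Integer): Boolean;
begin
  InLengthRange := (i >= 1) and (i <= Length (s))
end;

{ Capacity-based rule, as described for CodeWarrior/Metrowerks Pascal. }
function InCapacityRange (i: Integer): Boolean;
begin
  InCapacityRange := (i >= 1) and (i <= s.Capacity)
end;

begin
  s := 'abc';
  WriteLn (InLengthRange (4));
  WriteLn (InCapacityRange (4));
  WriteLn (InCapacityRange (256))
end.
```

With s holding three characters, index 4 fails the length-based test but passes the capacity-based one, while index 256 lies outside the allocated storage and fails both.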

Of course, there are times when one wants both, but that seems to me to be getting close to the realm of wanting a mind reading compiler that knows what one intended to do instead of what one explicitly told it to do.

As a personal, general range checking opinion, I don't want to see an exact duplication of CodeWarrior/Metrowerks Pascal range checking behavior supported in GPC. CodeWarrior/Metrowerks Pascal range checking just has too many holes (and bugs) in it to make it a very useful debugging aid. Although THINK Pascal range checking can be a pain in the rear sometimes, it is at least useful as compiler and runtime assisted support for identifying questionably valid code constructs.

Gale Paeper gpaeper@empirenet.com
