Mirsad Todorovac wrote:
On Thu, 26 Jan 2006, Frank Heckenbach wrote:
Adriaan van Os wrote:
Are Pascal, and GNU Pascal in particular, safer with respect to buffer overruns? How much does runtime range checking help?
Thank you for this link, Adriaan,
(Of course, the answer of Mr. Waldek Hebisch depends on the entire
Did he reply by private mail? I don't see his mail in the archive.
runtime system being (re)compiled with range checking on, IMHO. Otherwise a runtime overrun could happen inside a system function when unchecked?)
Yes. But by default, the RTS is compiled with range-checking on (because this is GPC's default in general), unless you disable checking with a make option when building the RTS.
Additionally I am interested whether there is a plan to introduce NX bit support (or emulation), and about StackGuard or similar "canary" protection.
I suppose this means non-executable, for stack memory?
Does NX bit use depend on compiler support at all, or is it completely OS-supported/dependent? I reckon there ought to be something in the RTS mmap()-ing the stack area with PROT_EXEC disabled, or am I wrong?
The stack is allocated (initially, and when it grows) by the OS automatically, so it's an OS issue. However, GPC is in fact affected by it (more precisely, those GCC backends that implement trampolines on the stack, with those frontends (languages) that have local routines and pointers/references to routines, which includes, of course, Pascal). There's actually code in several backends to re-enable PROT_EXEC for stack areas where needed, as a work-around.
Yes. In particular, some holes are intentional, either for compatibility with other dialects (pointer-arithmetic etc.), or for performance reasons (possibility to turn off checks).
There are other holes, such as dangling pointers and the use of uninitialized variables, which GPC cannot detect at all yet. It might in the future, but implementing such checks will be very hard, so don't hold your breath.
Thank you for your reply, Frank,
In fact, this came to my mind in a few considerations too: I was thinking of a new breed of pointers that would allow for memory-map defragmentation. As said somewhere in theoretical books, after millions of seemingly random allocations/deallocations, memory looks completely fragmented, with evenly distributed holes and used spaces.
I'm no expert in this area, but I'd think this basically assumes a dumb memory manager. A MM that distributes pages into chunks of fixed size (say, powers of two, up to a certain limit) should do much better, and that's what all modern MMs do, AFAIK. Above the page size, fragmentation doesn't matter much anyway, as it only fragments the virtual mapping, i.e. holes can be unmapped.
Also, I'm not sure the situation is really typical. Often a big number of allocations is deallocated in one go (say, a list).
This is why I would like to propose "registered pointers" (maybe the idea came subconsciously from Java?) and a memory allocation library with better heap use and corruption detection.
Registered pointers are close in semantics to the pointers with ranges already introduced on the list, but they should also allow defragmentation of memory on-the-fly, compacting of free space, and deallocating and returning mapped pages back to the system (unlike current memory allocators, which behave like my country's budget: they only grow and become less efficiently used with time ...).
Is that too difficult to implement?
Probably yes, because it affects everything. The code generation is different, and all library routines must be recompiled this way. Since this is not realistically possible for libc and other default libraries used, this means you need a lot of wrappers. And when such libraries can do de-/allocation themselves, this will also pose some interesting problems.
And think about performance. Double indirection for every pointer access doesn't really speed up the program. ;-)
(I am sorry if this has been discussed already.)
To explain the idea more vividly, as we are all probably tired at 4 PM:
Every pointer would be registered in a pointer table or list (depending on implementation). Pointers would be indirect, used through a mechanism that would allow transparent moving of the memory fragment pointed to. A garbage-collection mechanism would browse through all pointers and defragment memory, much like a file system, moving used memory towards the beginning and free memory towards the end of the memory mapping, so it could later be munmap()-ed and returned to the OS when a certain LOW_WATERMARK or threshold is reached. Unused fragments of memory that are not explicitly free()-ed could be deallocated automatically, and an optional warning could be generated.
The latter is possible already, e.g. using the Boehm-Demers-Weiser conservative garbage collector, which can be plugged transparently into any libc-allocation based program (including GPC programs).
Better heap use would also allow introducing a "heap canary" technique that would eliminate another source of exploits.
According to http://en.wikipedia.org/wiki/Stack_frame, canaries are there to detect overflows in the first place.
It also describes a way to prevent such attacks on the return address on the stack, by detecting them before the return instruction (which may not be 100% foolproof, with longjmp etc., but is rather reliable), but I don't see how this directly translates to the heap. Of course, you could check a canary before *every* access to a heap-allocated variable, but that would be quite some overhead (perhaps only useful for debugging), and probably still not easy to implement -- if you have, say, a var parameter or a pointer parameter, how does the routine know whether it has a canary (allocated with one) or not (allocated from foreign code, or global or local variables not on the heap)?
Apart from that, allocating with a canary seems to be independent of indirect pointers or compiler internals, so you could probably do it by writing your own allocation routines. You'd only have to arrange for the checking to happen at strategic places or such ...
As for (mostly) debugging checks, there's also libefence which you probably know (which also works by overloading the de-/allocation routines without any compiler support).
All could be done transparently, without changing existing code.
You mean source code? (Otherwise I don't understand what you mean by indirect pointers.) But changing object code is not easy when it necessarily includes *all* libraries used. (Indirect pointers change the interface, so you can't just recompile those libraries you want to have extra checking in.)
That's the general idea, but I haven't tried implementation yet.
Compatible programs could simply be linked with the new library and gain better memory use, better heap integrity, and possibly longer uptime, without the need to reboot the system because of memory leaks due to fragmentation.
If that's the case (and I misunderstood your indirect pointers), go ahead and implement it. You can hook malloc/free (see libefence or the BDW collector as examples to start from). In fact, with dynamic libraries, you might not even have to relink programs, if you use LD_PRELOAD. And it would be independent of languages, as long as they use malloc/free (which GPC does internally by default).
I went far from the original question, but this idea has been tempting me for a long time, and the GPC development team seems open enough :-) that I thought of requesting implementation or putting it on the wish list.
IMHO, I see it rather as a separate project. GPC is not really short of features (existing and wishlist), and due to the considerations above, it seems easily separable (unless you really want to add compiler-supported checking, which you don't seem to want, according to the previous paragraph).
Frank