Mirsad Todorovac wrote:
On Thu, 26 Jan 2006, Frank Heckenbach wrote:
Adriaan van Os wrote:
Are Pascal, and GNU Pascal in particular, safer with respect to buffer overruns? How much does runtime range checking help?
Thank you for this link, Adriaan,
(Of course, the answer of Mr. Waldek Hebisch depends on the entire
Did he reply by private mail? I don't see his mail in the archive.
runtime system being (re)compiled with range checking on, IMHO. Otherwise a runtime overrun could happen inside a system function when unchecked?)
Yes. But by default, the RTS is compiled with range-checking on (because this is GPC's default in general), unless you disable checking with a make option when building the RTS.
Additionally I am interested whether there is a plan to introduce NX bit support (or emulation), and about StackGuard or similar "canary" protection.
I suppose this means non-executable, for stack memory?
Does NX bit use depend on compiler support at all, or is it completely OS-supported/dependent? I reckon there ought to be something in the RTS mmap()-ing the stack area with PROT_EXEC disabled, or am I wrong?
The stack is allocated (initially, and when it grows) by the OS automatically, so it's an OS issue. However, GPC is in fact affected by it (more precisely, those GCC backends that implement trampolines on the stack, with those frontends (languages) that have local routines and pointers/references to routines, which includes, of course, Pascal). There's actually code in several backends to re-enable PROT_EXEC for stack areas where needed, as a work-around.
Yes. In particular, some holes are intentional, either for compatibility with other dialects (pointer-arithmetic etc.), or for performance reasons (possibility to turn off checks).
There are other holes, such as dangling pointers and the use of uninitialized variables, which GPC cannot detect at all yet. It might in the future, but implementing such checks will be very hard, so don't hold your breath.
Thank you for your reply, Frank,
In fact, this came to my mind in a few considerations too: I was thinking of a new breed of pointers that would allow for memory-map defragmentation. As said somewhere in theoretical books, after millions of seemingly random allocations/deallocations, memory looks completely fragmented, with evenly distributed holes and used spaces.
I'm no expert in this area, but I'd think this basically assumes a dumb memory manager. A MM that distributes pages into chunks of fixed size (say, powers of two, up to a certain limit) should do much better, and that's what all modern MMs do, AFAIK. Above the page size, fragmentation doesn't matter much anyway, as it only fragments the virtual mapping, i.e. holes can be unmapped.
Also, I'm not sure the situation is really typical. Often a big number of allocations is deallocated in one go (say, a list).
This is why I would like to propose "registered pointers" (maybe the idea came subconsciously from Java?) and a memory allocation library with better heap use and corruption detection.
Registered pointers are close in semantics to the pointers with ranges already introduced on the list, but they should also allow defragmentation of memory on-the-fly, compacting of free space, and deallocating and returning mapped pages back to the system (unlike current memory allocators, which behave like my country's budget: they only grow and become less efficiently used with time ...).
Is that too difficult to implement?
Probably yes, because it affects everything. The code generation is different, and all library routines must be recompiled this way. Since this is not realistically possible for libc and other default libraries used, this means you need a lot of wrappers. And when such libraries can do de-/allocation themselves, this will also pose some interesting problems.
And think about performance. Double indirection for every pointer access doesn't really speed up the program. ;-)
(I am sorry if this has been discussed already.)
To explain the idea more vividly, as we are all probably tired at 4 PM:
Every pointer would be registered in a pointer table or list (depending on implementation). Pointers would be indirect, used through a mechanism that would allow transparent moving of the memory fragment pointed to. A garbage-collection mechanism would browse through all pointers and defragment memory, much like a file system, moving used memory towards the beginning and free memory towards the end of the memory mapping, so it could later be munmap()-ed and returned to the OS when a certain LOW_WATERMARK or threshold is reached. Unused fragments of memory that are not explicitly free()-ed could be deallocated automatically, and an optional warning could be generated.
The latter is possible already, e.g. using the Boehm-Demers-Weiser conservative garbage collector, which can be plugged transparently into any libc-allocation based program (including GPC programs).
Better heap use would also allow introducing a "heap canary" technique that would eliminate another source of exploits.
According to http://en.wikipedia.org/wiki/Stack_frame, canaries are there to detect overflows in the first place.
It also describes a way to prevent such attacks on the return address on the stack, by detecting them before the return instruction (which may not be 100% foolproof, with longjmp etc., but is rather reliable), but I don't see how this directly translates to the heap. Of course, you could check a canary before *every* access to a heap-allocated variable, but that would be quite some overhead (perhaps only useful for debugging), and probably still not easy to implement -- if you have, say, a var parameter or a pointer parameter, how does the routine know whether it has a canary (allocated with one) or not (allocated from foreign code, or global or local variables not on the heap)?
Apart from that, allocating with a canary seems to be independent of indirect pointers or compiler internals, so you could probably do it by writing your own allocation routines. You'd only have to arrange for the checking to happen at strategic places or such ...
As for (mostly) debugging checks, there's also libefence which you probably know (which also works by overloading the de-/allocation routines without any compiler support).
All could be done transparently, without changing existing code.
You mean source code? (Otherwise I don't understand what you mean by indirect pointers.) But changing object code is not easy when it necessarily includes *all* libraries used. (Indirect pointers change the interface, so you can't just recompile those libraries you want to have extra checking in.)
That's the general idea, but I haven't tried implementation yet.
Compatible programs could simply be linked with the new library and gain better memory use, better heap integrity, and possibly longer uptime, without the need to reboot the system because of memory leaks due to fragmentation.
If that's the case (and I misunderstood your indirect pointers), go ahead and implement it. You can hook malloc/free (see libefence or the BDW collector as examples to start from). In fact, with dynamic libraries, you might not even have to relink programs, if you use LD_PRELOAD. And it would be independent of languages, as long as they use malloc/free (which GPC does internally by default).
I went far from the original question, but this idea has been tempting me for a long time, and the GPC development team seems open enough :-) that I thought of requesting implementation or putting it on the wish list.
IMHO, I see it rather as a separate project. GPC is not really short of features (existing and wishlist), and due to the considerations above, it seems easily separable (unless you really want to add compiler-supported checking, which you don't seem to want, according to the previous paragraph).
Frank