Hi
I have downloaded Frank's cgiprogs package, for the purpose of building a cgi program for an adsl router (running MontaVista Linux on a mipsfple processor). First, hats off to Frank. The program compiled and runs fine on the router. The problem is the size of the compiled program (320kb when statically linked with libgpc.a, and 250kb when linked to libgpc.so, which itself is 320kb). This is a significant amount of space for a single program to take on a router.
My question (and probably only Frank can answer this) is, how can one reduce the size of this program? The program uses the gpc and cgi units only. From the GPC unit it calls "Execute" and "GetTempDirectory", and from the cgi unit it calls: GetVar IsCGI CGIInitVars
If it is not possible to reduce the program size, then I guess that would have to be it - I just thought I'd ask anyway :)
Perhaps there is a C cgi library?
Thanks.
Best regards, The Chief --------- Prof. Abimbola Olowofoyeku (The African Chief) Web: http://www.greatchief.plus.com/
Prof. A Olowofoyeku (The African Chief) wrote:
I have downloaded Frank's cgiprogs package, for the purpose of building a cgi program for an adsl router (running MontaVista Linux on a mipsfple processor). First, hats off to Frank. The program compiled and runs fine on the router. The problem is the size of the compiled program (320kb when statically linked with libgpc.a, and 250kb when linked to libgpc.so, which itself is 320kb). This is a significant amount of space for a single program to take on a router.
My question (and probably only Frank can answer this) is, how can one reduce the size of this program? The program uses the gpc and cgi units only. From the GPC unit it calls "Execute" and "GetTempDirectory", and from the cgi unit it calls: GetVar IsCGI CGIInitVars
The basic problem is that gpc doesn't support smart-linking, which implies that unused routines end-up in the executable, runtime library routines included. So, the easiest way to reduce the size of the executable is to create a special version of libgpc, manually removing anything unused.
What also might help (to a certain extent) is to let the linker strip
- all debug info
- all symbol info
The details may vary per platform (pass something like -Wl,-s).
Regards,
Adriaan van Os
Prof. A Olowofoyeku (The African Chief) wrote:
I have downloaded Frank's cgiprogs package, for the purpose of building a cgi program for an adsl router (running MontaVista Linux on a mipsfple processor). First, hats off to Frank. The program compiled and runs fine on the router.
Well, thanks. I've never tested the unit on such a machine, glad to know it works out of the box. :-)
The problem is the size of the compiled program (320kb when statically linked with libgpc.a, and 250kb when linked to libgpc.so, which itself is 320kb). This is a significant amount of space for a single program to take on a router.
Are you planning to have multiple programs? In this case, dynamic linking of libgpc.so (and possibly the CGI unit and the units it uses in turn) should be worthwhile. Otherwise, of course, for a single program, dynamic linking doesn't gain anything filesize-wise.
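To put rough numbers on the single-versus-multiple question, here is a small sketch that uses only the sizes reported in this thread (320 KB static, 250 KB dynamically linked, 320 KB for libgpc.so). It assumes, purely for illustration, that every additional program would be about the same size as this one; the function names are made up for the example.

```shell
#!/bin/sh
# Sizes in KB as reported earlier in the thread.
STATIC_PROG=320    # program linked statically against libgpc.a
DYNAMIC_PROG=250   # same program linked against libgpc.so
SHARED_LIB=320     # libgpc.so itself, stored once on the flash

# Total flash usage for N programs of this size.
total_static()  { echo $(( $1 * STATIC_PROG )); }
total_dynamic() { echo $(( SHARED_LIB + $1 * DYNAMIC_PROG )); }

# Smallest N for which dynamic linking uses less space in total.
break_even() {
    n=1
    while [ "$(total_dynamic "$n")" -ge "$(total_static "$n")" ]; do
        n=$(( n + 1 ))
    done
    echo "$n"
}

break_even    # prints 5 (1570 KB dynamic vs. 1600 KB static)
```

Under that assumption dynamic linking only starts to pay off at the fifth program, which is consistent with the remark that it gains nothing for a single one.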
Of course, "strip"ping the program (or using "-s") will save something, if you haven't already done so.
My question (and probably only Frank can answer this) is, how can one reduce the size of this program? The program uses the gpc and cgi units only. From the GPC unit it calls "Execute" and "GetTempDirectory",
Of course, it needs other parts of the RTS by implicit calls, directly or via the CGI unit (much of the string and file stuff, e.g.).
and from the cgi unit it calls: GetVar IsCGI CGIInitVars
You could rip out the parts of the source not needed, but I doubt this would really save much, since the biggest part of the size probably comes from the RTS anyway ...
As you might remember, we talked about "smart-linking" a long time ago. I haven't found the time to pursue it further or automate it, but you might try it manually.
You might still have the old mails around -- you mentioned a suggestion from someone from RedHat to use "-ffunction-sections" when compiling (this would also apply to building the RTS), and objcopy with the options "--only-section" and/or "--strip-symbol".
Your last paragraph in that mail looked quite optimistic. If it works here as well (parts may be platform-specific, though I hope not), and if it's only a problem due to the number of symbols, I suppose I can help turning it into a shell script (for version 0.1; later it may become integrated in gp or such).
: The stuff with gpc/gcc '-ffunction-sections' and objcopy : '--only-section' works! I did it all wrong and wrote to the RedHat guy, : who sent me an example (see below). I can get it working correctly now. : Does this mean that there is now hope for smart linking (at least with : libgpc.a)?
Also "--gc-sections" might be worth looking at.
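For anyone who wants to try the section-based route by hand, the following is a minimal sketch of how the options mentioned above fit together. The flags themselves (-ffunction-sections, -fdata-sections, -Wl,--gc-sections, -s) are standard GCC/binutils options; that gpc passes all of them through unchanged is an assumption, and the file names and the helper functions run and build_small are invented for the example. By default the script only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# PC is the compiler driver; DRY_RUN=1 (the default here) only prints commands.
PC=${PC:-gpc}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_small SRC OUT: compile SRC with one section per routine, then link
# OUT while letting the linker drop unreferenced sections and all symbols.
build_small() {
    run "$PC" -Os -ffunction-sections -fdata-sections -c "$1" -o "$2.o" &&
    run "$PC" "$2.o" -s -Wl,--gc-sections -o "$2"
}

# Example (hypothetical file names):
#   build_small mycgi.pas mycgi
```

Note that the linker can only discard what lives in its own section, so the gain is likely to be small unless libgpc.a itself was also built with -ffunction-sections.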
Perhaps there is a C cgi library?
I suppose there are quite a few actually, but I don't have any first-hand experience with any of them, as I've written all my CGI programs in Pascal and bash. ;-)
Frank
On 21 Aug 2006 at 16:51, Frank Heckenbach wrote:
Prof. A Olowofoyeku (The African Chief) wrote:
I have downloaded Frank's cgiprogs package, for the purpose of building a cgi program for an adsl router (running MontaVista Linux on a mipsfple processor). First, hats off to Frank. The program compiled and runs fine on the router.
Well, thanks. I've never tested the unit on such a machine, glad to know it works out of the box. :-)
Well, to be honest, I didn't really know what to expect - but it turned out to be so straightforward :)
The problem is the size of the compiled program (320kb when statically linked with libgpc.a, and 250kb when linked to libgpc.so, which itself is 320kb). This is a significant amount of space for a single program to take on a router.
Are you planning to have multiple programs? In this case, dynamic linking of libgpc.so (and possibly the CGI unit and the units it uses in turn) should be worthwhile. Otherwise, of course, for a single program, dynamic linking doesn't gain anything filesize-wise.
This was just an experiment to see if I could add some functionality to the router firmware that other routers didn't have, but which I have seen in Smoothwall. In the end all that was required was two small procedures. I would have probably wanted to add more programs - but the router has only limited free space in the flash memory, and so it is not feasible if all programs would be 100kb or more in size.
Of course, "strip"ping the program (or using "-s") will save something, if you haven't already done so.
Yes, I already did that - flags: "-Os -s".
My question (and probably only Frank can answer this) is, how can one reduce the size of this program? The program uses the gpc and cgi units only. From the GPC unit it calls "Execute" and "GetTempDirectory",
Of course, it needs other parts of the RTS by implicit calls, directly or via the CGI unit (much of the string and file stuff, e.g.).
Indeed.
and from the cgi unit it calls: GetVar IsCGI CGIInitVars
You could rip out the parts of the source not needed, but I doubt this would really save much, since the biggest part of the size probably comes from the RTS anyway ...
Precisely. But it is one thing worth trying.
As you might remember, we talked about "smart-linking" a long time ago. I haven't found the time to pursue it further or automate it, but you might try it manually.
You might still have the old mails around -- you mentioned a suggestion from someone from RedHat to use "-ffunction-sections" when compiling (this would also apply to building the RTS), and objcopy with the options "--only-section" and/or "--strip-symbol".
Your last paragraph in that mail looked quite optimistic. If it works here as well (parts may be platform-specific, though I hope not), and if it's only a problem due to the number of symbols, I suppose I can help turning it into a shell script (for version 0.1; later it may become integrated in gp or such).
[...]
It was okay with trivial stuff - but when I tried it with some more complex code, it all went awry (sometimes ok and sometimes not).
[...]
Perhaps there is a C cgi library?
I suppose there are quite a few actually, but I don't have any first-hand experience with any of them, as I've written all my CGI programs in Pascal and bash. ;-)
Yes, I started off trying to write a shell script (the shell is "ash"). What got me stumped was the equivalent of your "GetVar" function - I could not figure out how to get the variable being sent to the CGI program and its value - so in desperation, I decided to write a pascal program, even though I knew that size would be a problem. If you know how to do a GetVar in bash, that would be an immediate solution, since I could just go back to a shell script. But I would still want to use the GPC option for other things.
Thanks.
Best regards, The Chief
On 21 Aug 2006 at 16:27, Prof. A Olowofoyeku (The African Chief) wrote: [...]
If you know how to do a GetVar in bash, that would be an immediate solution, since I could just go back to a shell script.
I have just had a look through cgi.pas and found "query_string". A search through the web indicates that this might be all I need for my shell script :)
Best regards, The Chief
Prof. A Olowofoyeku (The African Chief) wrote:
The problem is the size of the compiled program (320kb when statically linked with libgpc.a, and 250kb when linked to libgpc.so, which itself is 320kb). This is a significant amount of space for a single program to take on a router.
Are you planning to have multiple programs? In this case, dynamic linking of libgpc.so (and possibly the CGI unit and the units it uses in turn) should be worthwhile. Otherwise, of course, for a single program, dynamic linking doesn't gain anything filesize-wise.
This was just an experiment to see if I could add some functionality to the router firmware that other routers didn't have, but which I have seen in Smoothwall. In the end all that was required was two small procedures. I would have probably wanted to add more programs - but the router has only limited free space in the flash memory, and so it is not feasible if all programs would be 100kb or more in size.
With full dynamic linking (i.e., if you move the .o files of all non-RTS units into a .so library as well), the individual programs should be smaller, but of course, the one-time space requirement for this .so and libgpc.so would still be a few 100 KB. How much free space does it have, BTW?
I have just had a look through cgi.pas and found "query_string". A search through the web indicates that this might be all I need for my shell script :)
Basically yes, at least for GET requests, though you might have to do some parsing which isn't always nice to do in shell scripts -- especially if you need multiple variables or quoted characters (though most printable ASCII characters are usually not quoted).
Though a bit OT, I have some ugly (almost write-only ;-) bash code to do some parsing for simple cases. It allows only GET requests (POST without multipart could be added using dd ...), and allows only a limited set of characters (others can be added, but one has to be careful of special characters, as always in shell scripts; in particular quote characters would be problematic here), and only a few quoted characters (2F (/) and E4 (ä) can be taken as examples to add what one needs). It does everything with bash-internals (no sed or such, for performance), but therefore it really requires bash. It puts the content of the CGI variable foo in the shell variable CGIVAR_foo. I tried to make it secure against malicious input, but of course, I can't guarantee anything.

#!/bin/bash

error ()
{
  echo "Content-type: text/plain"
  echo ""
  echo "Error: $1"
  exit
}

tmp="&${QUERY_STRING##*\?}"
tmp="${tmp//[%]2[Ff]//}"
tmp="${tmp//[%][Ee]4/ä}"
tmp="${tmp// /#}"
tmp="${tmp//[+]/ }"
if [ x"$REQUEST_METHOD" != x"GET" ] || [ x"${tmp//[=A-Za-z0-9_.&/ -]/}" != x ]; then
  error "invalid input"
fi
tmp="${tmp//[&=]/\" \"}"
tmp="${tmp#\"}\""
eval tmp="($tmp)"
a=""
for b in "${tmp[@]}"; do
  if [ x"$a" = x ]; then
    a="${b:-dummy}"
  else
    [ x"${a//[A-Za-z0-9_]/}" = x ] && eval "CGIVAR_$a=\"$b\""
    a=""
  fi
done
Frank
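Since the router's shell is ash rather than bash, a plain POSIX sh variant of the same idea may be useful. The following is only a sketch under stated assumptions, not a replacement for the script above: get_var is a made-up name, it handles GET-style query strings only, it converts "+" to a space but does not decode %XX escapes (any query containing a character outside the whitelist is rejected outright), and it has not been audited for security.

```shell
#!/bin/sh
# get_var NAME [QUERY]
# Prints the value of NAME from a URL query string (default: $QUERY_STRING).
# Returns 0 if found, 1 if NAME is absent, 2 if the query contains a
# character outside the whitelist. %XX escapes are rejected, not decoded.
get_var() {
    gv_name=$1
    gv_query=${2-$QUERY_STRING}

    # Whitelist check: any character not listed here makes the query invalid.
    gv_bad='*[!A-Za-z0-9_.=&+/-]*'
    case $gv_query in
        $gv_bad) return 2 ;;
    esac

    gv_old_ifs=$IFS
    IFS='&'
    set -f                      # no filename expansion while splitting
    for gv_pair in $gv_query; do
        case $gv_pair in
            "$gv_name"=*)
                IFS=$gv_old_ifs
                set +f
                # Strip "NAME=" and turn '+' back into spaces.
                printf '%s\n' "${gv_pair#*=}" | tr '+' ' '
                return 0
                ;;
        esac
    done
    IFS=$gv_old_ifs
    set +f
    return 1
}
```

In a CGI script it could be used as foo=$(get_var foo) || foo=default. Rejecting unknown characters instead of trying to quote them keeps the function short, at the price of refusing some legitimate input.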
On 21 Aug 2006 at 18:13, Frank Heckenbach wrote:
Prof. A Olowofoyeku (The African Chief) wrote:
[...]
This was just an experiment to see if I could add some functionality to the router firmware that other routers didn't have, but which I have seen in Smoothwall. In the end all that was required was two small procedures. I would have probably wanted to add more programs - but the router has only limited free space in the flash memory, and so it is not feasible if all programs would be 100kb or more in size.
With full dynamic linking (i.e., if you move the .o files of all non-RTS units into a .so library as well), the individual programs should be smaller, but of course, the one-time space requirement for this .so and libgpc.so would still be a few 100 KB. How much free space does it have, BTW?
There are two versions of the router - one has 1mb free on the flash, and another has about 200kb free. The files stored on flash are in a squashfs 2.1 filesystem, so it just about fitted on the second version of the router.
However, see below.
I have just had a look through cgi.pas and found "query_string". A search through the web indicates that this might be all I need for my shell script :)
Basically yes, at least for GET requests, though you might have to do some parsing which isn't always nice to do in shell scripts -- especially if you need multiple variables or quoted characters (though most printable ASCII characters are usually not quoted).
[...]
Yes, I came across that problem. I eventually solved it by doing the parsing of the cgi GET request in C (called from the shell script with "eval").
The script you have included is very helpful, thanks. I am sure I can do other things with it.
Finally, I wrote the whole program in C (compiled to 9kb). For me, this change of language was a retrograde (but necessary) step, and what took me 15 minutes to accomplish in Pascal took hours to do in C (obviously my C is rubbish). This leads to my next few questions:
1. Is it possible to have reduced functionality versions of libgpc (perhaps produced with a switch when building the compiler?)? If so, is it possible to choose which features shall be built into it? (e.g., via a simple text configuration file, of the kind you have when building the linux kernel, or busybox, etc.). What I mean is something like (please note, this is off the top of my head, and is not properly thought through - it may even be impossible or the necessary features may not be in libgpc at all, but rather in the compiler):
# enable support for pascal strings
STRINGS=y

# Pascal file I/O
FILES=y

# Pascal objects
OOP=y
etc., etc.
2. What things can be done in GPC without libgpc - for example, if one produced an include file of libc exports and doesn't use units or Pascal strings or objects or file I/O at all?
3. This follows from #2 - how can one write a different libgpc? Is there a special thing that has to be done to make it work (i.e., how is it different from any bog-standard .a or .so file?).
4. Would there be any mileage in producing a libc standard unit?
Thanks,
Best regards, The Chief
Prof. A Olowofoyeku (The African Chief) wrote:
1. Is it possible to have reduced functionality versions of libgpc (perhaps produced with a switch when building the compiler?)? If so, is it possible to choose which features shall be built into it? (e.g., via a simple text configuration file, of the kind you have when building the linux kernel, or busybox, etc.).
If so, then configure options might be the obvious way to go.
What I mean is something like (please note, this is off the top of my head, and is not properly thought through - it may even be impossible or the necessary features may not be in libgpc at all, but rather in the compiler):
They're in both of them which doesn't help. I.e., if you built an RTS without string support, some operations that don't look like RTS calls (e.g. "+" for strings) would lead to undefined linker references.
One could add explicit checks in the compiler, but (slightly tangential to the topic) I'm thinking of a different route here: So far, the compiler creates RTS calls based on "magic" linker names ("_p_Set_Union" etc.) and makes implicit declarations for them which have to match the RTS declarations. Now we could instead properly import the RTS declarations from GPI files. (Previously, due to lack of qualified identifiers and selective import, this would have created namespace conflicts, but now that these features are available, it can be done.) The compiler would then call the RTS based on (still "magic") Pascal-level declarations, so when a version of the RTS omits them, the compiler will simply notice the absence of those declarations in the RTS GPI files and could emit somewhat clearer errors.
There would be some strange effects, though. E.g., comparing two strings requires an RTS call, but comparing one string against '' does not because it can be optimized to a comparison of its length against 0. Removing the respective RTS routine would mean that the latter would still work, but the former wouldn't. Of course, one could explicitly forbid the former as well when the RTS routine isn't found (not sure if rather useful or annoying).
# enable support for pascal strings
STRINGS=y

# Pascal file I/O
FILES=y

The problem is that parts of them depend on each other. E.g., most file, some string, and many more routines can generate runtime errors. Runtime error handling uses strings ... and files ... etc. ... So omitting either strings or files would be difficult, and removing both of them would mean replacing the runtime error management with a version that doesn't use (Pascal) strings and files. So you quickly arrive at a very bare-bones RTS (which some people use for special cases, indeed, but which e.g. the CGI unit would not find easy to use -- e.g., it obviously uses strings quite a lot, as well as files, for POST uploads, output buffering, runtime error mailing, etc.).
One idea I have in mind WRT the RTS is to reorganize its units, so we'd have a (clearly visible) core of routines that are interrelated and provide the basic support, and put this in the lowest-level RTS unit. This would include runtime errors, and the necessary amount of string and file capabilities to support them, whereas e.g. additional string and file features not strictly needed here would be one level higher.
In particular, this should get rid of cyclic dependencies in the RTS. Currently there are a few explicit ones, but many more implicit ones, via magic compiler calls. E.g., an RTS routine does a file operation, and the compiler translates it to a call of an RTS file routine that it just assumes exists, although it's in a unit that will be compiled later and probably uses the current unit. By implementing my above plan (Pascal-level imports), such dependencies would become visible, in this example forcing the RTS unit to import the respective RTS file routines, and thus (at first) create a lot of cyclic dependencies in the RTS. By resolving them manually (by reorganizing the RTS units), we'd get close to the unit structure I described.
In such a setting, one could ideally omit whole RTS units that are not needed. (But, of course, declaration-level smart-linking would still give somewhat better results, so I still have it on my list ...)
2. What things can be done in GPC without libgpc - for example, if one produced an include file of libc exports and doesn't use units or Pascal strings or objects or file I/O at all?
Basically yes, though there isn't an "official" list of which internals require RTS calls (and this might change slightly over time), so one can only try, looking at the linker errors (undefined references).
There are a few routines that always must be provided as they're called by automatic initialization etc. This is the list I used in a small standalone project last year. (You can omit the range check error stuff if you disable range checking, of course, OTOH you might need other runtime errors when linker errors tell you so.) The RTS version (here, 20050331) has to be matched to the version of the RTS replaced, and the list of declarations and their parameters may change slightly with new GPC versions, so the code isn't exactly maintenance-free (the main reason for the requirement of the RTS version check).
var
  VersionCheck: Integer; attribute (name = '_p_GPC_RTS_VERSION_20050331');

procedure Initialize (ArgumentCount: CInteger; Arguments, StartEnvironment: PCStrings; Options: CInteger); attribute (name = '_p_initialize');
begin
end;

procedure DoInitProc; attribute (name = '_p_DoInitProc');
begin
end;

procedure Finalize; attribute (name = '_p_finalize');
begin
end;

procedure CExit (Status: CInteger); external name 'exit';

procedure RangeCheckError; attribute (name = '_p_RangeCheckError');
begin
  CExit (42)
end;
3. This follows from #2 - how can one write a different libgpc? Is there a special thing that has to be done to make it work (i.e., how is it different from any bog-standard .a or .so file?).
I hope the above answers this. For the most part, it isn't very special, except for the explicit linker names. (But when we change it as described, any RTS replacement also needs changing, of course, e.g. using magic Pascal names then. Also, the parameters of some RTS routines change occasionally, in accordance with compiler calling changes.)
As you can see, the C part of the RTS is quite small now (one file, rts.c, plus interfaces in rtsc.pas), and not fundamentally different from C code and interfaces called by other Pascal units (except that it uses more C headers and does many more portability conditionals, mostly using autoconf settings, than is typical otherwise, due to its purpose of interfacing to different libc's).
The library building part (.a or .so) is nothing special in the RTS. You could link the list of .o files instead (manually) if you wanted.
gpc.pas is a bit "magical" in that it's generated by a script from the interfaces of the other units, excluding parts enclosed in "{@internal}" .. "{@endinternal}" comments. (These are just the parts with the magic linker names, more precisely those which are meant to be called only by the compiler, not from user code directly via gpc.pas. With my suggestion above, this would change, and the "{@internal}" comments probably disappear. gpc.pas could then probably switch to proper re-exporting instead of being script-generated.)
Another special thing is the units' name-attributes which are just there to avoid namespace conflicts with user-units of the same name (which are perfectly valid, of course, so they must not break).
You have to be careful of unintended recursion due to internal compiler calls. E.g. doing file I/O from a routine to implement file I/O is a recursion though it doesn't look like one ordinarily -- it might be OK if it's to a different file, and your routines (with according data structure) are reentrant, but in general you have to be careful there.
Also, during the initialization and finalization of the RTS, RTS services may not be available as expected, so you have to be very careful of the order of doing things. E.g., obviously before the memory manager is initialized, dynamic memory allocation won't work; this includes all uses of New and GetMem, of course, and also RTS routines that do them (which are under your control then, of course; e.g. in the current RTS some file routines).
Initialization is started via _p_initialize (see above) which has to call the RTS units' initializers (as needed). In the default RTS that's the strange "InitInit" call in init.pas which calls the implicit initializer of init.pas, which in turn calls all the other initializers automatically (in the regular way) as init.pas uses all the other units.
4. Would there be any mileage in producing a libc standard unit?
libc is a rather vague term here. Such a unit could be anything from a non-portable interface of the 6 most important libc calls (open, close, read, write, fork, exec, according to Linus ;-) to a fully portable interface to all known libc's on this planet, with interface to all functions supported by any of them plus emulations/errors where not supported ...
The RTS's rts.c file (plus rtsc.pas interfaces) is somewhat closer to the latter extreme (though there are still many areas of libc not covered yet). It actually makes available many functions in "C style" (e.g. OpenHandle etc., visible in gpc.pas), besides being used in the RTS to implement the higher-level routines. So to some extent this is such a unit already.
I suppose you're more thinking of a rather minimal unit. A problem is that different programmers (and even different projects by the same programmer) will often disagree just how minimal it should be. In the end, a bigger unit may fare better when automatically removing the unused parts. Yes, I know, we need smart linking ...
Frank
On 23 Aug 2006 at 17:43, Frank Heckenbach wrote:
Prof. A Olowofoyeku (The African Chief) wrote:
1. Is it possible to have reduced functionality versions of libgpc (perhaps produced with a switch when building the compiler?)? If so, is it possible to choose which features shall be built into it? (e.g., via a simple text configuration file, of the kind you have when building the linux kernel, or busybox, etc.).
If so, then configure options might be the obvious way to go.
Indeed.
What I mean is something like (please note, this is off the top of my head, and is not properly thought through - it may even be impossible or the necessary features may not be in libgpc at all, but rather in the compiler):
They're in both of them which doesn't help. I.e., if you built an RTS without string support, some operations that don't look like RTS calls (e.g. "+" for strings) would lead to undefined linker references.
Yes, I thought so.
One could add explicit checks in the compiler, but (slightly tangential to the topic) I'm thinking of a different route here: So far, the compiler creates RTS calls based on "magic" linker names ("_p_Set_Union" etc.) and makes implicit declarations for them which have to match the RTS declarations. Now we could instead properly import the RTS declarations from GPI files. (Previously, due to lack of qualified identifiers and selective import, this would have created namespace conflicts, but now that these features are available, it can be done.) The compiler would then call the RTS based on (still "magic") Pascal-level declarations, so when a version of the RTS omits them, the compiler will simply notice the absence of those declarations in the RTS GPI files and could emit somewhat clearer errors.
Sounds good to me.
There would be some strange effects, though. E.g., comparing two strings requires an RTS call, but comparing one string against '' does not because it can be optimized to a comparison of its length against 0. Removing the respective RTS routine would mean that the latter would still work, but the former wouldn't. Of course, one could explicitly forbid the former as well when the RTS routine isn't found (not sure if rather useful or annoying).
Either is fine.
[...]
One idea I have in mind WRT the RTS is to reorganize its units, so we'd have a (clearly visible) core of routines that are interrelated and provide the basic support, and put this in the lowest-level RTS unit. This would include runtime errors, and the necessary amount of string and file capabilities to support them, whereas e.g. additional string and file features not strictly needed here would be one level higher.
This would be excellent.
[...]
2. What things can be done in GPC without libgpc - for example, if one produced an include file of libc exports and doesn't use units or Pascal strings or objects or file I/O at all?
Basically yes, though there isn't an "official" list of which internals require RTS calls (and this might change slightly over time), so one can only try, looking at the linker errors (undefined references).
Producing an empty libgpc.a to compile a simple "begin end." program produced 4 linker errors - rts version, initialize, finalize, and something else which I forget. I wrote simple stubs - functions returning 0 with those names and put them in libgpc.a. That sorted out those problems. I wrote a small include file with a few libc routines (malloc, free, puts, gets, system, printf, functions from string.h, etc) and wrote a little test program to test various things. The program compiled to between 5kb and 9kb, and I was able to read and write strings with it, get and set environment variables, and stuff. My little CGI program doesn't really need more than these routines. But the ensuing program would look more like C than Pascal.
I will have a look again at rts.c.
What would be helpful is a way to specify a particular "libgpc" to the compiler at compile time - e.g., "gpc foo.pas --libgpc=small_libgpc"
So, instead of "-lgpc", the compiler will get "-lsmall_libgpc". Can this be done at the moment?
4. Would there be any mileage in producing a libc standard unit?
libc is a rather vague term here. Such a unit could be anything from a non-portable interface of the 6 most important libc calls (open, close, read, write, fork, exec, according to Linus ;-) to a fully portable interface to all known libc's on this planet, with interface to all functions supported by any of them plus emulations/errors where not supported ...
"Standard" GNU libc, if such a thing exists.
The RTS's rts.c file (plus rtsc.pas interfaces) is somewhat closer to the latter extreme (though there are still many areas of libc not covered yet). It actually makes available many functions in "C style" (e.g. OpenHandle etc., visible in gpc.pas), besides being used in the RTS to implement the higher-level routines. So to some extent this is such a unit already.
Great. I will have a look at it.
I suppose you're more thinking of a rather minimal unit. A problem is that different programmers (and even different projects by the same programmer) will often disagree just how minimal it should be.
Indeed. Which is why I wondered whether the contents of it could be set by configuration entries.
In the end, a bigger unit may fare better when automatically removing the unused parts. Yes, I know, we need smart linking ...
Indeed. Until now, I hadn't seen the urgency. Now that I know for sure that GPC can be used in embedded system programming, the problem of the size of the generated binaries has become more visible.
Best regards, The Chief
Prof. A Olowofoyeku (The African Chief) wrote:
[...]
2. What things can be done in GPC without libgpc - for example, if one produced an include file of libc exports and doesn't use units or Pascal strings or objects or file I/O at all?
Basically yes, though there isn't an "official" list of which internals require RTS calls (and this might change slightly over time), so one can only try, looking at the linker errors (undefined references).
Producing an empty libgpc.a to compile a simple "begin end." program produced 4 linker errors - rts version, initialize, finalize, and something else which I forget. I wrote simple stubs - functions returning 0 with those names and put them in libgpc.a.
That's just what I put in my last mail, with the (currently) correct declarations. Using nop-functions instead of procedures, and omitting parameters will work, of course, due to the default "C" calling convention. And a function instead of the variable works just because the compiler doesn't really do anything with the variable, just check for its presence (via the linker).
That sorted out those problems. I wrote a small include file with a few libc routines (malloc, free, puts, gets, system, printf, functions from string.h, etc) and wrote a little test program to test various things. The program compiled to between 5kb and 9kb, and I was able to read and write strings with it, get and set environment variables, and stuff. My little CGI program doesn't really need more than these routines. But the ensuing program would look more like C than Pascal.
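To give an idea of how little a CGI program of this kind needs, here is a minimal sketch written directly in C rather than through GPC external declarations. It uses only getenv, snprintf and fputs from libc; the function names and the response body are illustrative assumptions, not taken from the cgi unit.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format a minimal CGI response (header, blank line, body) into buf.
 * Returns what snprintf returns: the length the full response needs,
 * or a negative value on error. A NULL query is treated as empty.
 * Hypothetical helper, for illustration only. */
int format_cgi_response(char *buf, size_t size, const char *query)
{
    if (query == NULL)
        query = "";
    return snprintf(buf, size,
                    "Content-Type: text/plain\r\n\r\nquery=%s\n", query);
}

/* Write the response for the current request to stdout, taking the
 * query from the QUERY_STRING environment variable set by the web
 * server. Returns 0 on success, -1 if the response did not fit. */
int print_cgi_response(void)
{
    char buf[512];
    int n = format_cgi_response(buf, sizeof buf, getenv("QUERY_STRING"));

    if (n < 0 || (size_t)n >= sizeof buf)
        return -1; /* formatting failed or output was truncated */
    fputs(buf, stdout);
    return 0;
}
```

A main program would simply call print_cgi_response() and return its result as the exit status.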
Quite typical in such a setting. At least there's the possibility to expand its features when needed ...
What would be helpful is a way to specify a particular "libgpc" to the compiler at compile time - e.g., "gpc foo.pas --libgpc=small_libgpc"
So, instead of "-lgpc", the compiler would get "-lsmall_libgpc". Can this be done at the moment?
Not at the moment, the name is hard-coded in gpc.c:
outfiles [i] = "-lgpc";
In principle I agree that making it configurable might be desirable. One thing to consider, though, is that after dropping automake, this adding of -lgpc and -lm will be one of the very few things left that gpc.c has to do differently from gcc.c. So before we change anything there, we should probably look at GCC, e.g. whether gcc-4 or future versions may have a mechanism to automatically add such libraries (as other languages have to do the same with their runtime libs), and if so perhaps already a defined way of overriding them, so we might not need a separate gpc.c then at all. Just to avoid implementing something separate and perhaps incompatible. ATM I don't know if GCC has any such plans (or did anything already), perhaps Waldek knows.
Currently, a work-around is to produce .o files (manually or using gp), and link them with gcc (matching version, of course), adding the library explicitly.
It might also work to just add the new library, and just let gpc.c pass -lgpc. If your library comes first, its declarations should take precedence, and if no declaration from libgpc.a is used it's simply ignored by the linker.
Actually, one doesn't need a library for that. A plain unit using the appropriate linker names (or just embedding the code in the main program, as I just tried) will do. See the attached typescript.
The first example shows what happens when some of the linker names don't match -- though the messages are a bit surprising at first glance. The linker will pull the missing declarations from libgpc.a, thereby linking the library, and thus get duplicate symbols for the other (matching) linker names. So, this is a kind of warning against inadvertently using libgpc.a (though the executable size will tell you anyway).
The second example, matching the RTS version check (to my current GPC development number, yours will be different) works and produces a small executable. For comparison, the third example, compiling an empty file. (So removing code makes the executable bigger, isn't it funny? ;-)
- Would there be any mileage in producing a libc standard unit?
libc is a rather vague term here. Such a unit could be anything from a non-portable interface of the 6 most important libc calls (open, close, read, write, fork, exec, according to Linus ;-) to a fully portable interface to all known libc's on this planet, with interface to all functions supported by any of them plus emulations/errors where not supported ...
"Standard" GNU libc, if such a thing exists.
Actually I'm not sure if GNU libc on different platforms provides exactly the same features, or some non-portable ones as well, though one could omit the latter here if they exist.
That's quite a large beast already, and as I said rts.c only covers a part of it yet. If you like to extend it, go ahead, but of course, for inclusion into the RTS (whether the same or a new unit) we can't support GNU libc only, which means adding a lot of autoconf checks (which is actually sometimes the most work when adding a new function there).
Frank
On 24 Aug 2006 at 21:10, Frank Heckenbach wrote:
[...]
It might also work to just add the new library, and just let gpc.c pass -lgpc. If your library comes first, its declarations should take precedence, and if no declaration from libgpc.a is used it's simply ignored by the linker.
Actually, one doesn't need a library for that. A plain unit using the appropriate linker names (or just embedding the code in the main program, as I just tried) will do. See the attached typescript.
The first example shows what happens when some of the linker names don't match -- though the messages are a bit surprising at first glance. The linker will pull the missing declarations from libgpc.a, thereby linking the library, and thus get duplicate symbols for the other (matching) linker names. So, this is a kind of warning against inadvertently using libgpc.a (though the executable size will tell you anyway).
The second example, matching the RTS version check (to my current GPC development number, yours will be different) works and produces a small executable. For comparison, the third example, compiling an empty file. (So removing code makes the executable bigger, isn't it funny? ;-)
Indeed. But this is a brilliant idea. I will give it a go.
- Would there be any mileage in producing a libc standard unit?
libc is a rather vague term here. Such a unit could be anything from a non-portable interface of the 6 most important libc calls (open, close, read, write, fork, exec, according to Linus ;-) to a fully portable interface to all known libc's on this planet, with interface to all functions supported by any of them plus emulations/errors where not supported ...
"Standard" GNU libc, if such a thing exists.
Actually I'm not sure if GNU libc on different platforms provides exactly the same features, or some non-portable ones as well, though one could omit the latter here if they exist.
That's quite a large beast already, and as I said rts.c only covers a part of it yet. If you like to extend it, go ahead, but of course, for inclusion into the RTS (whether the same or a new unit) we can't support GNU libc only, which means adding a lot of autoconf checks (which is actually sometimes the most work when adding a new function there).
I think rts.c would be excellent, if it could be made entirely self-contained. I ran a "configure" on the rts directory and then built rts.c into rts.o. However, attempting to link it produced errors (because there were declarations from error.pas and some other .pas files). Is it possible to have a fully self-contained rts.c?
Best regards, The Chief --------- Prof. Abimbola Olowofoyeku (The African Chief) Web: http://www.greatchief.plus.com/
Prof. A Olowofoyeku (The African Chief) wrote:
I think rts.c would be excellent, if it could be made entirely self-contained. I ran a "configure" on the rts directory and then built rts.c into rts.o. However, attempting to link it produced errors (because there were declarations from error.pas and some other .pas files). Is it possible to have a fully self-contained rts.c?
Perhaps not fully, but somewhat more. Some things could be mapped to libc equivalents (e.g., some CString routines -- I currently use libc only where strictly needed, i.e. basically OS interfaces, which CString routines don't need, but one could use libc routines as well, perhaps via an optional define), or the heap management (_p_New etc. -- clearly we want to use the Pascal heap manager from within the RTS, but for a standalone version, they can simply be mapped to malloc() etc.). I plan to move most of realpath() to the Pascal side anyway and leave just the basic OS interface (readlink() etc.) in rts.c, which will also get rid of some dependencies. But a few will remain -- some can be nop'ped (such as _p_SetReturnAddress) when not needed, but some have to be provided and should not really be nops (such as _p_RuntimeError, for obvious reasons, I suppose ;-). So you need to provide simple substitutes at least (whether written in C or Pascal -- cf. RangeCheckError in my previous example).
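A minimal sketch of what such substitutes could look like in C follows. All names and signatures here are hypothetical: the real linker names (_p_New, _p_RuntimeError, _p_SetReturnAddress and so on) and their parameter lists have to be taken from the RTS sources of the GPC version in use, and the error code is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical substitute for the runtime-error routine. It must not
 * be a no-op: it reports the error and terminates the program. */
void standalone_runtime_error(int code)
{
    fprintf(stderr, "runtime error %d\n", code);
    exit(EXIT_FAILURE);
}

/* Hypothetical substitute for heap allocation, mapped straight to
 * malloc(). A zero-size request is rounded up to one byte so that a
 * NULL result always means failure. */
void *standalone_new(size_t size)
{
    void *p = malloc(size ? size : 1);

    if (p == NULL)
        standalone_runtime_error(1); /* arbitrary code for "out of memory" */
    return p;
}

/* Hypothetical substitute for heap release, mapped to free(). */
void standalone_dispose(void *p)
{
    free(p);
}

/* Hypothetical no-op for a hook that is not needed in a standalone
 * build (in the spirit of nop'ping _p_SetReturnAddress). */
void standalone_set_return_address(void *address)
{
    (void)address; /* intentionally ignored */
}
```

The split mirrors the distinction made above: bookkeeping hooks can be empty, while allocation and error reporting need real, if simple, behaviour.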
Frank
On 29 Aug 2006 at 10:39, Frank Heckenbach wrote:
Prof. A Olowofoyeku (The African Chief) wrote:
I think rts.c would be excellent, if it could be made entirely self-contained. I ran a "configure" on the rts directory and then built rts.c into rts.o. However, attempting to link it produced errors (because there were declarations from error.pas and some other .pas files). Is it possible to have a fully self-contained rts.c?
Perhaps not fully, but somewhat more.
[....]
Yes, that would be very helpful indeed. Thanks.
Best regards, The Chief --------- Prof. Abimbola Olowofoyeku (The African Chief) Web: http://www.greatchief.plus.com/