Thanks Waldek,
That's a really big help. Having migrated from DEC Pascal which had very different MODULE integration syntax I had never picked up on the importance of the IMPORT clause, especially as the code has always compiled without a hitch. I wonder if there is any way gpc can pick up on the use of declarations which haven't been properly initialised, either at compile time or run-time?
Regards, David Wood.
-----Original Message-----
From: Waldek Hebisch [mailto:hebisch@math.uni.wroc.pl]
Sent: Tuesday, September 16, 2003 12:44 PM
To: Wood David
Cc: gpc@gnu.de
Subject: Re: Module global string bug
David Wood wrote:
Frank,
I don't know if there is a bug number for this but I am attaching two really small bits of code which show it up. It's the only 'genuine' bug I have ever really come across in gpc. Basically, the globally defined string doesn't seem to get any length so whenever something is assigned to it, the contents vanish.
Your program is buggy: you forgot to 'import' your module. Even if you want to access objects in the module via external name, you still need to 'import' it so that its initializer is run automatically. If you want the main program to be in another language, then you need to call 'init_With_string' explicitly.
-- Waldek Hebisch hebisch@math.uni.wroc.pl or hebisch@hera.math.uni.wroc.pl
David Wood wrote:
Thanks Waldek,
That's a really big help. Having migrated from DEC Pascal which had very different MODULE integration syntax I had never picked up on the importance of the IMPORT clause, especially as the code has always compiled without a hitch. I wonder if there is any way gpc can pick up on the use of declarations which haven't been properly initialised, either at compile time or run-time?
Detecting uninitialised variables is a hard problem. For externals there is nothing gpc can do: gpc sees only one module at any given moment and the initialisation can be in a different module. For local variables there is the '-Wuninitialized' option, but it is not fool-proof. AFAIK it depends on the optimiser and will miss variables which are not interesting for optimisation. It will also flag some variables which are initialised, but in a non-obvious way (say, inside a conditional).
Currently uninitialised variables get whatever happens to lie in memory -- on Unix the kernel fills memory with zeros at program start, so global variables get a binary zero value. That catches nil pointers. One can imagine more "nasty" values (NaNs for floating point, pseudo-random values for integers) but nothing of that sort is implemented.
gpc will do some initialisations automatically. If that does not work, it is a compiler bug. And it is easier to fix such a bug than to introduce extra checks.
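The "initialised, but in a non-obvious way" case can be sketched as follows. The sketch is in C rather than Pascal, on the assumption that '-Wuninitialized' behaves similarly across front-ends because it belongs to the shared GCC back-end; `pick_value` is a hypothetical name, not something from the thread. Whether a given compiler version actually warns here may depend on the optimisation level.

```c
#include <assert.h>

/* k is assigned only when flag is non-zero, and it is read only when flag
   is non-zero, so it is never read uninitialised. Flow analysis has to
   correlate the two tests to see that, and some compilers report k as
   possibly uninitialised anyway. */
int pick_value(int flag)
{
    int k;                 /* deliberately no initialiser */

    if (flag)
        k = 42;            /* the only assignment */

    if (flag)
        return k;          /* the only read, guarded by the same test */

    return 0;
}
```

Giving `k` an initial value at its declaration silences such a report, at the price of hiding a genuine missing assignment if the code later changes.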
Wood David wrote:
Thanks Waldek,
That's a really big help. Having migrated from DEC Pascal which had very different MODULE integration syntax I had never picked up on the importance of the IMPORT clause, especially as the code has always compiled without a hitch.
By accident, so to speak (just throwing things together at link time). Missing initializers are just one symptom of the problems. (In fact, GPC did use a different mechanism for calling initializers until some time ago. This would work independently of imports, but rely on some more or less obscure linker internals which would work on some systems and not on others, which meant that GPC on those latter systems was more or less unusable. So I think it's better to have one way (and in particular, one EP/BP (import/uses) compatible way) working on all systems than an undocumented way (bypassing imports) working only on some.)
I wonder if there is any way gpc can pick up on the use of declarations which haven't been properly initialised, either at compile time or run-time?
As Waldek said, at compile time it's nearly impossible for global (especially cross-module) stuff. At runtime it's possible, but with huge effort AFAICT. (One might have to attach a flag to each variable or record/array field to be sure, which by the way would break binary compatibility with C and other languages etc. Not something I plan to do in the near future ...)
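A minimal sketch of what "a flag on each variable" could look like, and why it changes the binary layout. This is illustrative C, not anything GPC implements; `tracked_int`, `tracked_store` and `tracked_load` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* An int that carries its own "has been assigned" flag. */
typedef struct {
    int  value;
    bool is_set;   /* extra field that a plain int does not have */
} tracked_int;

/* Every assignment has to go through this helper to set the flag. */
void tracked_store(tracked_int *t, int v)
{
    t->value  = v;
    t->is_set = true;
}

/* Returns true and writes *out if the variable was assigned before;
   returns false (leaving *out untouched) if it was never assigned. */
bool tracked_load(const tracked_int *t, int *out)
{
    if (!t->is_set)
        return false;
    *out = t->value;
    return true;
}
```

Because `sizeof(tracked_int)` is larger than `sizeof(int)`, a record built from such fields no longer matches the layout that C code (or a file written earlier) expects, which is the compatibility cost mentioned above. The scheme also only works if the flag itself starts out false, for example in zero-filled memory.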
David Wood wrote:
Frank,
I don't know if there is a bug number for this but I am attaching two really small bits of code which show it up. It's the only 'genuine' bug I have ever really come across in gpc. Basically, the globally defined string doesn't seem to get any length so whenever something is assigned to it, the contents vanish.
Your program is buggy: you forgot to 'import' your module. Even if you want to access objects in the module via external name, you still need to 'import' it so that its initializer is run automatically. If you want the main program to be in another language, then you need to call 'init_With_string' explicitly.
GPC now also allows `--init-modules=With_String' on the command-line (when compiling the main program). If the main program is not in Pascal, you have to make some more calls yourself, see demos/gpc_c_*
Frank
Morning.
Several months ago someone on this list mentioned that GPC was slower than other Pascal compilers.
Would like to comment on one possible cause for this. Please keep in mind that I am not familiar with GPC internals. (i.e. Ignorance has never stopped opinions in the past:)
One possible cause for this lack of speed may be due to string handling. The schema used for strings has an allocated length (max allowed) but does not seem to have an actual length. In C derived languages the famous ASCII 0 is used as terminator.
If an actual length is not kept, then adding one may improve performance. This would imply that the string manipulation routines in the libraries would also have to be changed.
Past experience in C has shown that finding ASCII 0 in strings can take 15% to 25% of CPU time, depending on type of manipulations used.
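The difference between scanning for a terminator and keeping an actual length can be sketched in C as below. The layout is purely illustrative and is an assumption, not GPC's real string schema; `counted_string` and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { CAPACITY = 64 };

/* A string that stores both its capacity and its current length. */
typedef struct {
    size_t capacity;           /* maximum number of characters */
    size_t length;             /* characters currently in use */
    char   chars[CAPACITY];    /* no terminating 0 required */
} counted_string;

/* Import a C string: the only place where a scan for 0 is needed. */
void counted_assign(counted_string *s, const char *text)
{
    size_t n = strlen(text);
    if (n > CAPACITY)
        n = CAPACITY;          /* truncate to the capacity */
    s->capacity = CAPACITY;
    s->length   = n;
    memcpy(s->chars, text, n);
}

/* Constant time: reads a field instead of walking the characters. */
size_t counted_length(const counted_string *s)
{
    return s->length;
}

/* Append without searching for the end of dst; truncates if dst is full. */
void counted_append(counted_string *dst, const counted_string *src)
{
    size_t room = dst->capacity - dst->length;
    size_t n = src->length < room ? src->length : room;
    memcpy(dst->chars + dst->length, src->chars, n);
    dst->length += n;
}
```

With a terminator-only representation, each append would first have to walk `dst` to find its end, which is where the scanning cost described above comes from.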
Another trick that could improve string performance is reference counting.
This technique keeps a count of the number of string variables that reference a unique string. For example:
A := 'Kangaroos have nice tails';
The string schema would contain the allotted length, the actual string length and a reference of 1. The one indicates that one variable, A, is using the string.
B := A;
The reference count would be incremented by 1. The B structure would simply point to the same physical memory location as it would in variable A. The physical memory would contain the schema.
B := B + ' in spring';
This would cause a new string to be created with the added text. The reference counts of both strings would now be 1, such that:
A := 'Kangaroos have nice tails'; B := 'Kangaroos have nice tails in spring';
When a variable goes out of scope, the reference count for the actual string decrements. When a reference hits 0 the physical string can then be garbage collected.
This technique is used by Borland.
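The kangaroo example above can be traced with a small C sketch of reference-counted strings. It is a simplified illustration under stated assumptions (single-threaded, no capacity field, abort on allocation failure) and not a description of how Borland or GPC implement it; all `rc_` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared string body: every variable holds a pointer to one of these. */
typedef struct {
    size_t refs;      /* number of variables referencing this body */
    size_t length;    /* actual string length */
    char   chars[];   /* the characters, no terminator */
} rc_string;

/* A := 'text': create a fresh body with a reference count of 1. */
rc_string *rc_new(const char *text)
{
    size_t n = strlen(text);
    rc_string *s = malloc(sizeof *s + n);
    if (s == NULL)
        abort();
    s->refs   = 1;
    s->length = n;
    memcpy(s->chars, text, n);
    return s;
}

/* B := A: no characters are copied, only the count goes up. */
rc_string *rc_share(rc_string *s)
{
    s->refs++;
    return s;
}

/* A variable goes out of scope: free the body when the count hits 0. */
void rc_release(rc_string *s)
{
    if (--s->refs == 0)
        free(s);
}

/* B := B + suffix: build a new body and drop B's reference to the old one. */
rc_string *rc_append(rc_string *s, const char *suffix)
{
    size_t extra = strlen(suffix);
    rc_string *r = malloc(sizeof *r + s->length + extra);
    if (r == NULL)
        abort();
    r->refs   = 1;
    r->length = s->length + extra;
    memcpy(r->chars, s->chars, s->length);
    memcpy(r->chars + s->length, suffix, extra);
    rc_release(s);
    return r;
}
```

Used as `a = rc_new(...); b = rc_share(a); b = rc_append(b, " in spring");`, this leaves both bodies with a count of 1, matching the walk-through above.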
Just my 3.42 cents (Canadian)
thanx p davidson
Paul Davidson wrote:
Morning.
Several months ago someone on this list mentioned that GPC was slower than other Pascal compilers.
<rest snipped>
I think you missed the main point: GPC generates very fast code. So GPC compiled programs run fast. The complaint is that GPC uses a lot of time to compile a program.
For people who want to know what GPC is doing: instead of guessing just give the '--time-report' option to gpc. It will tell you which "passes" take most time. FYI, on i386 Linux typically with no optimisation about 30% is spent in the front-end (reported as 'parser') and the rest is spent in the back-end. With optimisation and also when compiling for other processors time spent in the back-end grows.
-- Waldek Hebisch hebisch@math.uni.wroc.pl or hebisch@hera.math.uni.wroc.pl
Paul Davidson wrote:
Several months ago someone on this list mentioned that GPC was slower than other Pascal compilers.
Would like to comment on one possible cause for this. Please keep in mind that I am not familiar with GPC internals. (i.e. Ignorance has never stopped opinions in the past:)
One possible cause for this lack of speed may be due to string handling. The schema used for strings has an allocated length (max allowed) but does not seem to have an actual length.
Yes, it does. (It does not use or need a #0 terminator, except when dealing with "CStrings".)
Another trick that could improve string performance is reference counting.
This is a planned feature for a new kind of strings. It could reduce overhead in cases where strings are mostly copied around. But as you say, this will require quite some work in the runtime and probably even more in the compiler. (For now, there are more urgent things to do.)
Waldek Hebisch wrote:
I think you missed the main point: GPC generates very fast code. So GPC compiled program run fast.
To be fair, it may be somewhat slower when dealing a lot with strings (especially compared to hand-optimized C/assembler code ...). This also depends on the Pascal code -- e.g., using `const' string parameters is a lot faster than using value parameters when the actual parameter is a string variable.
I'm not completely sure if there are cases left where GPC copies the capacity (allocated length) instead of the length. This could be a major slowdown if the length is much shorter than the capacity. I tried to eliminate such cases, but if you find them, let me know. (This can be observed from Pascal code by testing the runtime difference when using strings of the same capacity and vastly different length.)
Frank
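The capacity-versus-length distinction can be made concrete with a C sketch. The record layout and the names `big_string`, `copy_by_capacity` and `copy_by_length` are assumptions made for illustration, not GPC internals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BIG_CAPACITY = 4096 };

typedef struct {
    size_t capacity;             /* allocated length */
    size_t length;               /* actual length */
    char   chars[BIG_CAPACITY];
} big_string;

/* Copies the whole record: cost grows with the capacity,
   even if only a few characters are in use. */
void copy_by_capacity(big_string *dst, const big_string *src)
{
    *dst = *src;
}

/* Copies the header and only the characters in use:
   cost grows with the actual length. */
void copy_by_length(big_string *dst, const big_string *src)
{
    dst->capacity = src->capacity;
    dst->length   = src->length;
    memcpy(dst->chars, src->chars, src->length);
}
```

For a 3-character string in a 4096-character variable, the first routine moves over 4000 bytes and the second moves a handful. Timing many copies of short and long strings of the same capacity, as suggested above, would show which of the two behaviours a given piece of generated code has.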
I'm not completely sure if there are cases left where GPC copies the capacity (allocated length) instead of the length. This could be a major slowdown if the length is much shorter than the capacity. I tried to eliminate such cases, but if you find them, let me know. (This can be observed from Pascal code by testing the runtime difference when using strings of the same capacity and vastly different length.)
One thing for users to watch out for given this (copying by length, not capacity, which is good) is to be very careful if you store such strings in a file on disk in an unpacked manner (ie, you store the entire string). This sort of thing can leak sensitive information like passwords or financial data or whatever happens to be sitting around in memory and care should be taken to avoid such leakage (this was a common problem with Word documents of old).
I'm not really familiar with Pascal I/O (since I always use native Mac file I/O or POSIX file I/O), but it'd be interesting to know what a "file of String(255)" writes to the file if you write a string with a short length and garbage in the spare bytes...
Anyway, just a warning/reminder, Peter.
Peter N Lewis wrote:
One thing for users to watch out for given this (copying by length, not capacity, which is good) is to be very careful if you store such strings in a file on disk in an unpacked manner (ie, you store the entire string). This sort of thing can leak sensitive information like passwords or financial data or whatever happens to be sitting around in memory and care should be taken to avoid such leakage
Indeed.
(this was a common problem with Word documents of old).
Not only old from what I read. (Except that maybe today they can leak only data from other documents within the application, not from other processes as they did in old Windows versions.)
I'm not really familiar with Pascal I/O (since I always use native Mac file I/O or POSIX file I/O), but it'd be interesting to know what a "file of String(255)" writes to the file if you write a string with a short length and garbage in the spare bytes...
Such files will write just what's there, i.e. the garbage. Furthermore, such files are (very much) compiler-dependent, and (quite a bit) platform dependent (since the size of the capacity and length fields and possibly the alignment varies, and because of endianness).
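How stale bytes end up in such a file can be sketched in C. The fixed-size record below only imitates the idea of a capacity, a length and a character array; the field sizes and the names `fixed_string`, `fixed_assign`, `fixed_scrub` and `fixed_write_raw` are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { FIXED_CAPACITY = 32 };

typedef struct {
    uint32_t capacity;
    uint32_t length;
    char     chars[FIXED_CAPACITY];
} fixed_string;

/* Assignment copies by length, so bytes beyond the new length keep
   whatever an earlier, longer value left behind. */
void fixed_assign(fixed_string *s, const char *text)
{
    size_t n = strlen(text);
    if (n > FIXED_CAPACITY)
        n = FIXED_CAPACITY;
    s->capacity = FIXED_CAPACITY;
    s->length   = (uint32_t)n;
    memcpy(s->chars, text, n);
}

/* Clear the unused tail before the record leaves the program. */
void fixed_scrub(fixed_string *s)
{
    memset(s->chars + s->length, 0, s->capacity - s->length);
}

/* Writes the record exactly as it sits in memory, stale tail included.
   The result also depends on the field sizes and byte order of the
   machine that wrote it. Returns 1 on success, 0 on failure. */
size_t fixed_write_raw(const fixed_string *s, FILE *f)
{
    return fwrite(s, sizeof *s, 1, f);
}
```

After assigning a 16-character secret and then the 2-character value "hi", bytes 2 to 15 of `chars` still hold the end of the secret, and `fixed_write_raw` would put them on disk unless `fixed_scrub` is called first.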
If you cannot use a text file, you might want to look at routines such as WriteStringLittleEndian which take care of all these issues.
Frank
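A portable binary format along these lines can be sketched in C: a length written in a fixed byte order, followed by exactly that many characters, so that neither capacity, alignment, endianness nor stale bytes reach the file. This is a hypothetical illustration of the idea only; the actual format and signature of GPC's WriteStringLittleEndian are not given in the thread, and `put_string_le` and `get_string_le` are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode: 4-byte little-endian length, then `length` characters.
   Returns the number of bytes written, or 0 if `out` is too small. */
size_t put_string_le(unsigned char *out, size_t out_size,
                     const char *chars, uint32_t length)
{
    if (out_size < 4 || out_size - 4 < length)
        return 0;
    out[0] = (unsigned char)(length & 0xFF);
    out[1] = (unsigned char)((length >> 8) & 0xFF);
    out[2] = (unsigned char)((length >> 16) & 0xFF);
    out[3] = (unsigned char)((length >> 24) & 0xFF);
    memcpy(out + 4, chars, length);
    return 4 + (size_t)length;
}

/* Decode into `chars`; returns 1 on success and stores the length,
   0 if the input is truncated or `chars` is too small. */
int get_string_le(const unsigned char *in, size_t in_size,
                  char *chars, size_t chars_size, uint32_t *length)
{
    uint32_t n;

    if (in_size < 4)
        return 0;
    n = (uint32_t)in[0]
      | ((uint32_t)in[1] << 8)
      | ((uint32_t)in[2] << 16)
      | ((uint32_t)in[3] << 24);
    if (in_size - 4 < n || chars_size < n)
        return 0;
    memcpy(chars, in + 4, n);
    *length = n;
    return 1;
}
```

The byte shifts make the encoding independent of the host's byte order, and since only `length` characters are emitted, nothing from the unused part of the string variable can leak.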