Kevan Hashemi wrote:
Here is my GPC test program:
program test;
var a:real; m,n,p:integer;
begin
  a:=1234.567;
  writeln('starting loop...');
  for m:=0 to 1000 do
    for n:=0 to 1000 do begin
      for p:=1 to 100 do begin
        { loop statement here }
      end;
    end;
  writeln('done.');
end.
In CW the code looks the same, but the variables are longreals and longints so that they match the GPC sizes. As you can see, the loop statement gets executed about 100,000,000 times (1001 x 1001 x 100).
For-loops themselves are suspect in GPC; see the thread "A little benchmark" in the GPC mailing list archives:
<http://www.gnu-pascal.de/crystal/gpc/en/mail7480.html?pos=23012761#23012761>
<http://www.gnu-pascal.de/crystal/gpc/en/mail7471.html?pos=22983274#22983274>
Frank Heckenbach writes there:
Somehow, the backend's loop optimizations don't recognize GPC's `for' loops optimally. Maybe GPC's handling of `for' loops could be changed (but again, it's hairy, so watch out ...), or it should be improved in the backend, I'm not sure now ...
If Frank doesn't know, I certainly don't know either. You may want to try a repeat-until loop instead.
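A minimal sketch of what the repeat-until variant of the test program could look like. The counter variable and the program name are my additions for illustration; the counter only stands in for the loop statement so that the iteration count can be checked. Whether this form is actually optimized better by the backend is untested here.

```pascal
program looptest;
{ Sketch only: the three nested for-loops of the test program,
  rewritten with repeat-until. 'count' is a hypothetical stand-in
  for the loop statement. }
var
  m, n, p: integer;
  count: longint;
begin
  count := 0;
  writeln('starting loop...');
  m := 0;
  repeat
    n := 0;
    repeat
      p := 1;
      repeat
        { loop statement here }
        count := count + 1;
        p := p + 1
      until p > 100;
      n := n + 1
    until n > 1000;
    m := m + 1
  until m > 1000;
  writeln('done, iterations: ', count:1);
end.
```

Note that a repeat-until body always runs at least once, which is harmless here because all bounds are constants. The count printed at the end should be 1001 x 1001 x 100 = 100200100, the same number of iterations as the for-loop version.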
I measure the execution time by counting seconds in my head between the prints to the console.
By the way, you can also use 'TickCount' (which returns a count in units of 1/60 of a second) or 'Microseconds'. Both are in the ported GPCPInterfaces (available from my website). They require linking with the Carbon framework.
I'm using an 800 MHz iBook. Here are my results:
loop statement    CW time (s)    GPC time (s)
none                   1              3
a:=a*a/a               6             13
a:=p                   3              6
a:=round(a)            8             40
a:=sin(p)             15             40
Most things take two or three times as long with GPC. I expect code running on MacOS X (UNIX) to be slower than code on MacOS 9 because MacOS X is re-entrant, is subject to pre-emptive multitasking, and provides protected memory.
Not much slower on Mac OS X, I guess, except in using QuickDraw and the like.
I am more interested in the fact that the GPC round() function takes four times as long as the GPC implementation of a:=a*a/a, while the CW round() function takes about the same time as the CW implementation of a:=a*a/a.
As you know, rounding a number with platform-independent mathematical functions is slow. The CW round() probably uses the PowerPC real number format to abbreviate the rounding process. Perhaps GPC uses a platform-independent implementation.
I have always used round() with sinusoidal look-up tables in my fourier transforms. The above results suggest that I gained very little by doing so. Nevertheless, I also use round() to obtain display coordinates from real-valued graphs, and these routines are running five times slower than before.
If you look in the fp.pas unit provided with the ported GPCPInterfaces, you will find there a wealth of mathematical routines, e.g.
rint        Rounds its argument to an integral value in floating point
            format, honoring the current rounding direction.

nearbyint   Differs from rint only in that it does not raise the inexact
            exception. It is the nearbyint function recommended by the
            IEEE floating-point standard 854.

rinttol     Rounds its argument to the nearest long int using the current
            rounding direction. NOTE: if the rounded value is outside
            the range of long int, then the result is undefined.

round       Rounds the argument to the nearest integral value in floating
            point format similar to the Fortran "anint" function. That is:
            add half to the magnitude and chop.

roundtol    Similar to the Fortran function nint or to the Pascal round.
            NOTE: if the rounded value is outside the range of long int,
            then the result is undefined.

trunc       Computes the integral value, in floating format, nearest to
            but no larger in magnitude than its argument. NOTE: on 68K
            compilers when using -elems881, trunc must return an int
You can use these routines with GPC instead of the built-in GPC runtime routines, to see if that makes any difference. Apple has always meticulously followed the IEEE standards (see the foreword of Professor W. Kahan in the Apple Numerics manual, second edition). To use them, you need to link with the Carbon framework.
I haven't checked if this automatically links in the requested routine for functions with the same name, e.g. 'sin'. If not, you may experiment with:
* linking order of included libraries on the command line
* renaming declarations in GPCPInterfaces or inserting the declarations directly in your source code
Please let us know if this works!
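For comparison with the fp.pas routines, the "add half to the magnitude and chop" rule described above for round can also be written in plain Pascal on top of the built-in trunc. The sketch below is only an illustration: RoundHalfAway is a hypothetical name, not part of GPC or fp.pas, and I have not measured whether it is any faster than the GPC round().

```pascal
program roundsketch;

{ Hypothetical helper, not part of GPC or fp.pas: round to nearest
  by adding half to the magnitude and chopping toward zero, so that
  halves are rounded away from zero. The result is undefined if the
  rounded value does not fit in a longint. }
function RoundHalfAway(x: real): longint;
begin
  if x >= 0.0 then
    RoundHalfAway := trunc(x + 0.5)
  else
    RoundHalfAway := trunc(x - 0.5)
end;

begin
  writeln(RoundHalfAway(2.4):1);    { 2 }
  writeln(RoundHalfAway(2.5):1);    { 3 }
  writeln(RoundHalfAway(-2.4):1);   { -2 }
  writeln(RoundHalfAway(-2.5):1);   { -3 }
end.
```

Dropping a function like this into the loop statement of the test program (for example a:=RoundHalfAway(a)) would show whether the cost lies in the built-in round() itself or in the real-to-integer conversion in general.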
As a sidebar, I want to mention Motorola's Libmoto for the PowerPC processor:
<http://e-www.motorola.com/webapp/sps/site/prod_summary.jsp?code=LIBMOTO>
For benchmarks, go to http://developer.apple.com/ and search for "libmoto". However, the library is inaccurate in edge-case conditions. I used it for a while in my CAD software, but later dropped it. Besides, I don't know of a Mac OS X port, and Motorola no longer supports the "product", which is a standard problem for much software that is a "product" ...
Regards,
Adriaan van Os
P.S. A separate post on optimization issues on Mac OS X will follow.