Re: GPC speed

27 Oct 2004


Dear Waldek,

thanks very much for your suggestions. As for the optimisation switches,
-march=athlon-xp doesn't seem to work on my system
(Mandrake 9.1, gpc version 2.1 (20020510), based on 2.95.3 20010315
(release)). Do I have to compile or install GPC with particular options?
If I use doubles instead of integers, do you think the performance will
benefit from the same improvement?

Thanks, best regards,
Silvio a Beccara

| Silvio a Beccara wrote:
| > The random generator is in both cases the Mersenne Twister (routine
| > mt19937 from the authors' site). And I had forgotten the Fortran program.
| > Anyway you are right: the pseudorandom generator takes the most time in
| > the program. But here is another example, this time both in Pascal and
| > in Fortran:
|
| In the Pascal program you have:
| > type tMatrix = array[0..size, 0..size] of longint;
|
| In Fortran:
| > 	integer m1 ( 0:size, 0:size), m2 ( 0:size, 0:size),
|
| AFAIK Fortran integers on x86 are 32 bit, but GPC longint is 64 bit
| and higher precision has its cost. Also, Fortran passes arrays by
| reference, but you passed the arguments to 'mmult' by value:
| > procedure mmult(rows, cols : integer; m1, m2 : tMatrix; var mm : tMatrix
| > );
|
| To pass by reference you can use the 'const' attribute:
| procedure mmult(rows, cols : integer; const m1, m2 : tMatrix; var mm :
| tMatrix );
|
| I have modified your program to use integer instead of longint and
| to use the 'const' attribute (as above). I also tried different
| versions of GPC with different options:
| GPC + GCC version         options                original    modified
| gpc-20041017+gcc-3.3.5    -O2 -march=athlon-xp   0.035884    0.008142
| gpc-20041017+gcc-3.3.5    -O2 -march=i686        0.033786    0.008811
| gpc-20030830+gcc-3.3.2    -O2 -march=athlon-xp   0.037098    0.013156
| gpc-20020510+gcc-2.95.3   -O2 -march=i386        0.039530    0.025574
| gpc-20020510+gcc-2.95.3   -O2 -march=i686        0.037567    0.026358
|
|
| As you can see, with the new gpc the modified version is 4 times
| faster than the original. The main gain comes from reduced precision,
| but optimizing for the correct processor gives 10% and the 'const'
| attribute another 10%. The fastest version takes 6.28 clocks per inner
| loop iteration, which still looks too high to me. But it seems
| hard to get better speed without significantly changing the
| program.
|
| By the way, if what you want is matrix multiplication, then using
| the Atlas library may be a solution: Atlas is hand optimized and
| IMHO hard to beat. IIRC Atlas is floating point only, but
| converting the integers to doubles, doing a floating point matrix
| multiply and converting back to integers is likely to be
| faster than a direct integer matrix multiply (integer and floating
| point arithmetic are of similar speed, but floating point
| registers are separate from integer registers, so a floating point
| program effectively has more registers to use).
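
For reference, the two source changes described in the quoted message (a narrower element type and 'const' array parameters) might look roughly like the sketch below. The original benchmark is not reproduced in this thread, so the value of size, the loop bounds, the body of mmult and the test data are all assumptions made for illustration only.

```pascal
program MMultSketch;

{ Illustrative sketch only, not the original benchmark program. }

const
  size = 3;  { hypothetical value; the real one is not given in the thread }

type
  { integer instead of longint: per the quoted message, GPC's longint
    is 64 bits, and the narrower type is cheaper on x86 }
  tMatrix = array[0..size, 0..size] of integer;

{ 'const' allows m1 and m2 to be passed by reference rather than copied
  on every call; mm is written to, so it remains a 'var' parameter. }
procedure mmult(rows, cols : integer; const m1, m2 : tMatrix; var mm : tMatrix);
var
  i, j, k, sum : integer;
begin
  { assumes square matrices indexed 0..rows and 0..cols }
  for i := 0 to rows do
    for j := 0 to cols do
    begin
      sum := 0;
      for k := 0 to cols do
        sum := sum + m1[i, k] * m2[k, j];
      mm[i, j] := sum
    end
end;

var
  a, b, c : tMatrix;
  i, j : integer;

begin
  { fill a with simple values and b with the identity matrix }
  for i := 0 to size do
    for j := 0 to size do
    begin
      a[i, j] := i + j;
      if i = j then b[i, j] := 1 else b[i, j] := 0
    end;
  mmult(size, size, a, b, c);
  { a times the identity is a, so this prints the value of a[1, 2], i.e. 3 }
  writeln(c[1, 2])
end.
```

Whether the 'const' change or the type change matters more on a given machine is best checked by timing both variants, as the table in the quoted message does.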
