[Fwd: Re: Speed] - Gpc

17 Jun 2001


      Miklos Cserzo wrote:
...
Hi Folks,
I am optimizing a code for speed in gpc. Let me share some strange
observations of mine.
I have two vectors of size N and M. I need to perform a not too complex
calculation between all the elements (NxM) two times. I thought I dump the
result into an array of NxM and second time just read the array back and
this should save me time on the expense of memory. In fact this is about
50% slower than generating the result dynamically twice!!! Is it normal in
gpc? Why the handling of large arrays are so slow?
I have only the following experience with large arrays on gpc/djgpp (DOS):
There is a 50% - 100% variation in execution times due to memory cache
considerations (this cache is presumably the "level 2" 256 or 512 kB cache of
the Pentiums). Matrices are stored in pascal row-wise (same for C but the
opposite is true for fortran). If you access elements row after row you minimize
cache breaks. In a procedure to diagonalize symmetrical matrices, just swapping
the indexes to access memory elements in right order gave that gain. The exact
value of the gain depends of course on the size of the matrices (there is no
gain for small matrices diagonalized many times), but also apparently on the
version of the pentium. It depends also on the optimization switches of the
compiler: there is no gain with BP which is not execution-optimized and thus
very slow. Similarly in every textbook on these procedures is explained a
"clever" trick: you can use only the lower (or higher) half of the matrix (due
to symmetry) to store elements and use the remaining to store intermediate
results needed later. But this also disrupts the row-wise use of elements.
Reorganizing the same algorithm to use the full matrix in order to access as far
as possible row-wise and recomputing twice the intermediate results needed gives
a gain of the same order of magnitude. The loss in recomputing is only of order
N^2, while the gain in better elements access is of order N^3 !
So how have you stored and accessed your NxM elements ?
Maurice
PS this is different of the "page breaks" you can monitor in linux: these page
breaks correspond to breaks due to virtual memory use. Here it is a standalone
programm all in real memory (bare DOS with CWSDPMI).
-- 
         Maurice Lombardi
Laboratoire de  Spectrometrie Physique,
Universite Joseph Fourier de Grenoble, BP87
38402 Saint Martin d'Heres Cedex     FRANCE
Tel: 33 (0)4 76 51 47 51
Fax: 33 (0)4 76 51 45 44
mailto:Maurice.Lombardi@ujf-grenoble.fr