Hi, all
I've been working with the Mersenne Twister random number generator for a while. I translated it from the original C to Pascal several years ago, and the GPC team incorporated it into the GPC unit. But since then the original authors have found and fixed a bug in their initilization procedures, and other users have suggested various speedups.
I've now updated my original translation, validated it against the authors' test output, performed some other minor tests, and done what small optimizations I know how to do. Source for the unit and a test program are appended below.
I don't know much about optimization, and there may be some compiler directives that would greatly speed the unit. I'd appreciate any comments on that front: currently I only use: {$no-pack-struct, maximum-field-alignment 0} and more could probably be done.
regards, Toby
------------------------------------------ Unit MTrand; { Mersenne Twister random number generator -- a Pascal unit. Based on code by Makoto Matsumoto, Takuji Nishimura, Shawn Cokus, Richard J. Wagner, and others. This code is based on the Jan 26 2002 version, found on Matsumoto's MT website: http://konatu.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html See also the original website: http://www.math.keio.ac.jp/~matumoto/emt.html
Translation to a GPC unit by Toby Ewing (ewing@iastate.edu), March 2004, with assistance from James Houston (jhouston@tsargent.com).
The Mersenne Twister is an algorithm for generating random numbers. It was designed with consideration of the flaws in various other generators. The period, 2^19937-1, and the order of equidistribution, 623 dimensions, are far greater. The generator is also fast; it avoids multiplication and division, and it benefits from caches and pipelines.
Reference: M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator", ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, January 1998, pp 3-30. Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. The names of its contributors may not be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The original code included the following notice:
When you use this, send an email to: matumoto@math.keio.ac.jp with an appropriate reference to your work. }
{$no-pack-struct, maximum-field-alignment 0}
Interface
const mtN = 624; {length of state vector}
type MTcard = cardinal attribute (Size = 32); PMTary = ^Buffary; Buffary(m : medCard) = array[0..m] of medCard;
{Generate uniform random 32-bit cardinal} Function randInt : MTcard; {cardinal in [0,2^32-1]} Function randrInt(range : MTcard) : MTcard; {cardinal in [0,n-1]}
{Generate uniform random double} Function rand : double; {double in [0,1]} Function randr(range : double) : double; {double in [0,r]} Function randEx1 : double; {double in [0,1)} Function randrEx1(range : double) : double; {double in [0,r)} Function randEx0 : double; {double in (0,1]} Function randrEx0(range : double) : double; {double in (0,r]} Function randDblExc : double; {double in (0,1)} Function randrDblExc(range : double) : double; {double in (0,r)}
{Generate 53-bit uniform random double (capacity of IEEE double precision)} Function rand53 : double; {double in [0,1)}
{Generate random double from normal distribution, given mean and stdev} Function randNorm(Mean, stdev : double) : double;{double from (mean, stdev)}
{Initialization} Procedure SeedRand(RandSeed : MTcard); {initialize with an MTcard} Procedure SeedRandArray( keylen : medCard; {initialize with an array} var initkey : PMTary);
Implementation
const mtM = 397; {period parameter} vectA = 16#9908b0df; {constant vector a} UMask = 16#80000000; {most significant bits} LMask = 16#7fffffff; {least significant bits} TempB = 16#9d2c5680; {Tempering parameters B and C} TempC = 16#efc60000;
type MTary = array[0..mtN] of MTcard; magary = array[0..1] of MTcard;
var mt : MTary; {array for MT state vector} mti : MTcard; {array index}
Procedure Reload; {This is the basic vector generator, making mtN random MTcards at a time} var y, kk : MTcard; mag01 : magary = (0, vectA); attribute (static); begin for kk := 0 to mtN-mtM-1 do begin y := (mt[kk] AND UMask) OR (mt[kk+1] AND LMask); mt[kk] := mt[kk+mtM] XOR (y shr 1) XOR mag01[y AND 16#00000001]; end; for kk := mtN-mtM to mtN-1 do begin y := (mt[kk] AND UMask) OR (mt[kk+1] AND LMask); mt[kk] := mt[kk+(mtM-mtN)] XOR (y shr 1) XOR mag01[y AND 16#00000001]; end; y := (mt[mtN-1] AND UMask) OR (mt[0] AND LMask); mt[mtN-1] := mt[mtM-1] XOR (y shr 1) XOR mag01[y AND 16#00000001]; mti := 0; end;
Function RandInt : MTcard; {card in [0,2^32-1]} {This is the root function for returning a single random number: all of the other functions draw on this one.} var twist : MTcard; begin if (mti >= mtN) then reload; twist := mt[mti]; inc(mti); twist := twist XOR (twist shr 11); twist := twist XOR ((twist shl 7) AND TempB); twist := twist XOR ((twist shl 15) AND TempC); randInt := twist XOR (twist shr 18); end;
{Real versions (in C) by Isaku Wada} Function rand : double; {double in [0,1]} begin rand := randInt * (1.0/4294967295.0); end;
Function randr(range : double) : double; {double in [0,r]} begin randr := range * randInt * (1.0/4294967295.0); end;
Function randEx1 : double; {double in [0,1)} begin randEx1 := randInt * (1.0/4294967296.0); end;
Function randrEx1(range : double) : double; {double in [0,r)} begin randrEx1 := range * randInt * (1.0/4294967296.0); end;
Function randEx0 : double; {double in (0,1]} begin randEx0 := (randInt + 1.0) * (1.0/4294967295.0); end;
Function randrEx0(range : double) : double; {double in (0,r]} begin randrEx0 := range * (randInt + 1.0) * (1.0/4294967295.0); end;
Function randDblExc : double; {double in (0,1)} begin randDblExc := (randInt + 0.5) * (1.0/4294967296.0); end;
Function randrDblExc(range : double) : double; {double in (0,r)} begin randrDblExc := range * (randInt + 0.5) * (1.0/4294967296.0); end;
Function randrInt(range : MTcard) : MTcard; {card in [0,n-1]} { Return a MTcard within [0..range-1]. The code first finds which bits are used within that range. Optimized by Magnus Jonsson (magnus@smartelectronix.com). } var used, i : MTcard; begin used := range; used := used OR (used shr 1); used := used OR (used shr 2); used := used OR (used shr 4); used := used OR (used shr 8); used := used OR (used shr 16); repeat i := randInt AND used {Draw numbers until one is found in [0,n]} until (i <= range); {toss unused bits to shorten search} randrInt := i; end;
Function rand53 : double; {double in [0,1)} var a, b : MTcard; begin {code by Isaku Wada} a := randInt shr 5; b := randInt shr 6; rand53 := ((a * 67108864.0) + b) * (1.0/9007199254740992.0); end;
Function randNorm(mean, stdev : double) : double;{double from (mean, stdev)} { Draw a double from a normal (Gaussian) distribution with specified mean and standard deviation, using the Box-Muller method. This method is adapted from Press et al.'s Numerical Recipes. Wagner used the following instead: r := sqrt(-2.0 * ln(1.0-randDblExc)) * variance; phi := 2.0 * PI * randEx0; randNorm := mean + (variance * r * cos(phi)); It is simpler, but it calls for (almost) twice as many randInts, and uses a trigonometric function. According to my time trials, the Wagner method takes approximately 3 times longer than the Box-Muller I use here. } var RF, r1, r2 : double; rsecond : double; attribute (static); bsecond : Boolean = False; attribute (static); begin if bsecond then begin bsecond := False; randNorm := rsecond; end else begin bsecond := True; repeat r1 := randr(2.0) - 1.0; r2 := randr(2.0) - 1.0; RF := sqr(r1) + sqr(r2); until ((RF < 1.0) and (RF <> 0.0)); RF := stdev * sqrt(-2.0 * ln(RF)/RF); rsecond := mean + (r2 * RF); randNorm := mean + (r1 * RF); end; end;
Procedure SeedRand(RandSeed : MTcard); { Initialize generator state with seed. See Knuth TAOCP Vol 2, 3rd Ed, p.106 for multiplier. In previous versions, the most significant bits (MSBs) of the seed affected only the MSBs of the state array. Modified 9 Jan 2002 by Makoto Matsumoto. } begin mt[0] := RandSeed AND 16#ffffffff; for mti := 1 to mtN do mt[mti] := (1812433253 * (mt[mti-1] XOR (mt[mti-1] shr 30))+mti) AND 16#ffffffff; mti := mtN + 1; end;
Procedure SeedRandArray( keylen : medCard; var initkey : PMTary); var i, j, k : MTcard; begin SeedRand(19650218); i := 1; j := 0; k := Max(mtN, keylen); repeat dec(k); mt[i] := (mt[i] XOR ((mt[i-1] XOR (mt[i-1] shr 30)) * 1664525)) + initkey^[j+1] + j; {non-linear} mt[i] := mt[i] AND 16#ffffffff; {if machine wordsize > 32} inc(i); inc(j); if (i >= mtN) then begin mt[0] := mt[mtN-1]; i := 1; end; if (j >= keylen) then j := 0; until (k = 0); for k := mtN-1 downto 0 do begin mt[i] := (mt[i] XOR ((mt[i-1] XOR (mt[i-1] shr 30)) * 1566083941)) - i; mt[i] := mt[i] AND 16#ffffffff; {if machine wordsize > 32} inc(i); if (i >= mtN) then begin mt[0] := mt[mtN-1]; i := 1; end; end; mt[0] := 16#80000000; {force MSB = 1} end;
to begin do {Unit Initialization} begin {initialize state vector} SeedRand(5489); {this is Matsumoto's default} end;
end. {unit MTrand}
------------------------------------------ program TestMTrand;
uses MTrand, GPC;
const big = 100000000; notsobig = 10000000; bin = 40; mean = 100.0; std = 20.0;
type binary = array[1..bin] of longCard;
var time1, time2, mult, i : longCard; m, s, rj : double; keylen, jj : medCard; bins : binary; init : PMTary; {This type is native to MTrand: PMTary = ^Buffary; Buffary(m : medCard) = array[0..m] of medCard;} begin writeln; writeln('Testing the Mersenne Twister Unit ''MTrand.pas'''); writeln;
(* {This part passed March 23, 2004 in test against Matsumoto's original test output. Not implemented in this test program because of subsequent changes in the MT initialization procedures.} writeln('Precision: compare with Matsumoto''s 1000 original integer output'); SeedRand(4357); for i := 1 to 1000 do begin write(randInt:10, ' '); if ((i mod 5) = 0) then writeln; end; writeln; *)
{This part passed March 24, 2004 in test against Matsumoto's output} writeln('Precision: compare with Matsumoto''s 1000 integer output, posted at'); writeln(' http://www.math.sci.hiroshima-u.ac.jp/~'); writeln(' m-mat/MT/MT2002/CODES/MTARCOK/mt19937ar-cok.out'); New(init, 4); init^[1] := 16#00000123; init^[2] := 16#00000234; init^[3] := 16#00000345; init^[4] := 16#00000456; keylen := 4; SeedRandArray(keylen, init); Dispose(init); for i := 1 to 1000 do begin write(randInt:10, ' '); if ((i mod 5) = 0) then writeln; end; writeln;
{This part passed March 23, 2004} writeln('Uniformity of random 32-bit cardinals:'); for i := 1 to bin do bins[i] := 0; mult := bin; SeedRand(1234567890); for i := 1 to big do inc(bins[1+(randInt*mult) div 16#ffffffff]); for i := 1 to bin do begin write(i:2, ': ', bins[i]:7, ' '); if ((i mod 5) = 0) then writeln; end; writeln;
writeln('Uniformity of range-specified cardinals:'); for i := 1 to bin do bins[i] := 0; mult := bin; SeedRand(1234567890); for i := 1 to big do inc(bins[1+(randrInt(bin))]); for i := 1 to bin do begin write(i:2, ': ', bins[i]:7, ' '); if ((i mod 5) = 0) then writeln; end; writeln;
writeln; writeln('Distribution of doubles in [0..1]:'); for i := 1 to bin do bins[i] := 0; m := 0.0; s := 0.0; SeedRand(987654321); for i := 1 to notsobig do begin rj := rand; m := m + rj; s := s + sqr(rj); end; m := m/notsobig; s := sqrt((s-(notsobig*sqr(m))) / (notsobig-1)); writeln('Theoretical mean: 0.5000, standard deviation: 0.2886'); writeln('Returned mean : ', m:7:4, ', standard deviation: ', s:7:4);
writeln; writeln('Distribution of Random normal deviates with known mean and standard deviation:'); for i := 1 to bin do bins[i] := 0; m := 0.0; s := 0.0; SeedRand(2468097531); for i := 1 to notsobig do begin rj := randNorm(mean, std); m := m + rj; s := s + sqr(rj); end; m := m/notsobig; s := sqrt((s-(notsobig*sqr(m))) / (notsobig-1)); writeln('Requested mean: ', mean:7:4, ', standard deviation: ', std:7:4); writeln('Returned mean: ', m:7:4, ', standard deviation: ', s:7:4); writeln;
{This part tested and used to optimize the unit, March 23-24, 2004} writeln('Timing:'); SeedRand(1357908642); time1 := GetMicroSecondTime; for i := 1 to big do jj := randInt; time2 := GetMicroSecondTime; writeln('Time for ', big:1, ' randInts: ', ((time2-time1)*0.000001):7:2, ' seconds.');
SeedRand(564739201); time1 := GetMicroSecondTime; for i := 1 to big do rj := Rand; time2 := GetMicroSecondTime; writeln('Time for ', big:1, ' rands: ', ((time2-time1)*0.000001):7:2, ' seconds.');
SeedRand(1029384756); time1 := GetMicroSecondTime; for i := 1 to big do rj := randNorm(mean, std); time2 := GetMicroSecondTime; writeln('Time for ', big:1, ' randNorms: ', ((time2-time1)*0.000001):7:2, ' seconds.'); writeln; writeln('done.');
end.
Toby Ewing wrote:
I've been working with the Mersenne Twister random number generator for a while. I translated it from the original C to Pascal several years ago, and the GPC team incorporated it into the GPC unit. But since then the original authors have found and fixed a bug in their initilization procedures, and other users have suggested various speedups.
I've now updated my original translation, validated it against the authors' test output, performed some other minor tests, and done what small optimizations I know how to do. Source for the unit and a test program are appended below.
I've briefly looked at your code, and it seems there are some off-by-one errors, e.g.:
for kk := mtN-mtM to mtN-1 do begin
(where the C version says `<N-1'). I think there are more.
Also, if you'd like to have this code included with GPC, it would be good to use the same coding style and interfaces (including function names such as `Default_RandInt' which are called through pointers in the RTS). I don't have the time to do the reformatting myself now. It would also make it easier for me to see what's actually changed. I remember having updated the initalization, apparently corresponding to the "2002/01/09" changes. I don't know if anything significant was changed after that.
Frank
Hi, Frank
Frank Heckenbach wrote:
Toby Ewing wrote:
I've been working with the Mersenne Twister random number generator for a while. I translated it from the original C to Pascal several years ago, and the GPC team incorporated it into the GPC unit. But since then the original authors have found and fixed a bug in their initilization procedures, and other users have suggested various speedups.
I've now updated my original translation, validated it against the authors' test output, performed some other minor tests, and done what small optimizations I know how to do. Source for the unit and a test program are appended below.
I've briefly looked at your code, and it seems there are some off-by-one errors, e.g.:
for kk := mtN-mtM to mtN-1 do begin
(where the C version says `<N-1'). I think there are more.
There are indeed several such differences, but they're not errors as such. I used the Pascal [1..n] convention, rather than staying with the C [0..n-1] convention. The bottom line is that the output is identical, as tested against several test output files on the Mersenne Twister homepage.
Also, if you'd like to have this code included with GPC, it would be good to use the same coding style and interfaces (including function names such as `Default_RandInt' which are called through pointers in the RTS). I don't have the time to do the reformatting myself now.
The basics - initialization from a given random number seed, and the "rand" that returns a real in [0..1) - are currently part of the gpc.pas unit. I'd be happy to have the current code included - that was my purpose in submitting it - but I don't understand the coding and interface conventions you're using. If you can point me to some clear guidelines, I could give it a shot.
The code that I started with, the C++ abyss from which I rescued some civilized Pascal, claims to be faster than what I wrote. It uses the C and C++ directives "inline" and "register", and some arcane C constructions that are better avoided in clear code. In addition, there is assembly code optimized for pentia with MMX. I decided to make my Pascal code platform- and dialect-independent, but there is (theoretically) yet faster code out there, and if you'd rather link that into the GPC unit, that's fine too.
It would also make it easier for me to see what's actually changed. I remember having updated the initalization, apparently corresponding to the "2002/01/09" changes. I don't know if anything significant was changed after that.
I didn't realize that you'd changed the initialization already - that's great!
The other changes are 1) I split the basic number generator into two parts - one that is executed for each number, the other of which is executed only once every 624 calls. In my tests, this resulted in a 20+% speedup, presumably because the memory overhead for the more frequent function call is greatly reduced. 2) I added code for generating several commonly desired variants: reals in [0..1), reals in [0..1], reals in (0..1], doubles with random values throughout their full 53-bit (IEEE) length, numbers drawn from a normal distribution...
regards, Toby
Toby Ewing wrote:
Frank Heckenbach wrote:
Toby Ewing wrote:
I've been working with the Mersenne Twister random number generator for a while. I translated it from the original C to Pascal several years ago, and the GPC team incorporated it into the GPC unit. But since then the original authors have found and fixed a bug in their initilization procedures, and other users have suggested various speedups.
I've now updated my original translation, validated it against the authors' test output, performed some other minor tests, and done what small optimizations I know how to do. Source for the unit and a test program are appended below.
I've briefly looked at your code, and it seems there are some off-by-one errors, e.g.:
for kk := mtN-mtM to mtN-1 do begin
(where the C version says `<N-1'). I think there are more.
There are indeed several such differences, but they're not errors as such. I used the Pascal [1..n] convention, rather than staying with the C [0..n-1] convention.
Then these lines seem to be wrong:
type MTary = array[0..mtN] of MTcard; magary = array[0..1] of MTcard;
for kk := 0 to mtN-mtM-1 do begin
(perhaps more)
Well, `[0 .. mtN]' is one too large for C's `[mtN]' in any case. Maybe the extra element is just declared and computed but never used, so the results are the same, but it's still a bit strange.
Also, if you'd like to have this code included with GPC, it would be good to use the same coding style and interfaces (including function names such as `Default_RandInt' which are called through pointers in the RTS). I don't have the time to do the reformatting myself now.
The basics - initialization from a given random number seed, and the "rand" that returns a real in [0..1) - are currently part of the gpc.pas unit.
The actual code is in rts/random.pas in the GPC sources. gpc.pas just contains the interface.
I'd be happy to have the current code included - that was my purpose in submitting it - but I don't understand the coding and interface conventions you're using. If you can point me to some clear guidelines, I could give it a shot.
See the info file GPCS that comes with GPC (long) and/or just follow the style and interfaces in random.pas. I'd also appreciate if you keep the identifier names as far as possible so it's easy to see what's actually changed.
If you want to add more interfaces, I'd agree if they're nontrivial enough -- but in contrast e.g. `randrEx1 (range)' is equivalent to (and in fact does the same as) `range * Random' when `Random' calls `Default_RandReal' internally, so I don't think it justifies another RTS function.
I also doubt whether the bit-fiddling in `randrInt' is really better than our `Default_RandInt' code (which looks more complicated, but only because it also provides a 64 bit range -- though it may be worthwhile to add a "fast path" for MaxValue <= RandomRange there). Assuming that the random operation is the most costly part, using `mod' seems better than `and'ing out bits -- how much, of course, depends on the number, e.g. for 0 .. 32, the `and' version will require almost two calls on average (64/33) whereas the `mod' version takes just slightly more than one (1 + 4/(2^32-4)).
At a quick glance, also `randNorm' seems problematic since if `bsecond' is set, it ignores its arguments.
The code that I started with, the C++ abyss from which I rescued some civilized Pascal, claims to be faster than what I wrote. It uses the C and C++ directives "inline" and "register", and some arcane C constructions that are better avoided in clear code.
Yes, GPC provides them as attributes, but with `-O3' for `inline' and `-O' for `register', it also does it automatically, so they're recommended only in very special cases.
The other changes are
- I split the basic number generator into two parts - one that is executed for
each number, the other of which is executed only once every 624 calls. In my tests, this resulted in a 20+% speedup, presumably because the memory overhead for the more frequent function call is greatly reduced.
Yes, this may be worthwhile (with inlining).
Frank
Hi, all
Frank Heckenbach wrote:
Toby Ewing wrote:
Frank Heckenbach wrote:
I've briefly looked at your code, and it seems there are some off-by-one errors, e.g.:
for kk := mtN-mtM to mtN-1 do begin
(where the C version says `<N-1'). I think there are more.
There are indeed several such differences, but they're not errors as such. I used the Pascal [1..n] convention, rather than staying with the C [0..n-1] convention.
Then these lines seem to be wrong:
type MTary = array[0..mtN] of MTcard; magary = array[0..1] of MTcard;
for kk := 0 to mtN-mtM-1 do begin
Well, `[0 .. mtN]' is one too large for C's `[mtN]' in any case. Maybe the extra element is just declared and computed but never used, so the results are the same, but it's still a bit strange.
I mis-spoke above. The code uses mt[mtN] during initialization, but the generated random numbers are stored in mt[0..mtN-1].
See the info file GPCS that comes with GPC (long) and/or just follow the style and interfaces in random.pas. I'd also appreciate if you keep the identifier names as far as possible so it's easy to see what's actually changed.
thanks for the pointer.
If you want to add more interfaces, I'd agree if they're nontrivial enough -- but in contrast e.g. `randrEx1 (range)' is equivalent to (and in fact does the same as) `range * Random' when `Random' calls `Default_RandReal' internally, so I don't think it justifies another RTS function.
I agree in the specific case of randrEx1 - it is trivial. But it was included to fill out the complete suite of random doubles in: [0..1) [0..r) [0..1] [0..r] (0..1] (0..r] (0..1) (0..r) which may look like overkill, but there are times for each.
I also doubt whether the bit-fiddling in `randrInt' is really better than our `Default_RandInt' code (which looks more complicated, but only because it also provides a 64 bit range -- though it may be worthwhile to add a "fast path" for MaxValue <= RandomRange there). Assuming that the random operation is the most costly part, using `mod' seems better than `and'ing out bits -- how much, of course, depends on the number, e.g. for 0 .. 32, the `and' version will require almost two calls on average (64/33) whereas the `mod' version takes just slightly more than one (1 + 4/(2^32-4)).
Can't say offhand. I'll look into this one, and get back to you.
At a quick glance, also `randNorm' seems problematic since if `bsecond' is set, it ignores its arguments.
This can look confusing at first glance, but it's actually a rather nice algorithm. The Box-Muller algorithm (old and venerable) takes as arguments 2 uniform random deviates (random numbers in [0..1]) and returns 2 uncorrelated numbers drawn from a normal distribution. If you generate 2 uniform random deviates each time you ask for one normally distributed number, then you're throwing away a lot of effort. The function as I wrote it has two static variables (1 boolean, 1 real). The boolean is initialized as false. When the boolean is false, the function generates 2 random numbers, stores 1 (statically), sets the boolean to true, and returns the other random number. If the boolean is true (i.e. next time), the function sets the boolean to false and returns the stored number. It's definitely faster than the C++ method I started with, which only generates 1 random normal for everly 2 uniform random deviates.
The code that I started with, the C++ abyss from which I rescued some civilized Pascal, claims to be faster than what I wrote. It uses the C and C++ directives "inline" and "register", and some arcane C constructions that are better avoided in clear code.
Yes, GPC provides them as attributes, but with `-O3' for `inline' and `-O' for `register', it also does it automatically, so they're recommended only in very special cases.
I found that the speedup resulting from going from -O2 to -O3 was quite significant. Presumably that is due to the inlining. I'm not familiar with the -O attribute: can you please say more about how and where to use it?
thanks, Toby
Toby Ewing wrote:
If you want to add more interfaces, I'd agree if they're nontrivial enough -- but in contrast e.g. `randrEx1 (range)' is equivalent to (and in fact does the same as) `range * Random' when `Random' calls `Default_RandReal' internally, so I don't think it justifies another RTS function.
I agree in the specific case of randrEx1 - it is trivial. But it was included to fill out the complete suite of random doubles in: [0..1) [0..r) [0..1] [0..r] (0..1] (0..r] (0..1) (0..r) which may look like overkill, but there are times for each.
<0..r> = r * <0..1> (for <> in { [], () } ;-)
(0..1] = 1 - [0..1)
[0..1] and (0..1) may be useful (though, still, they're one-liners, just as the existing [0..1) ).
At a quick glance, also `randNorm' seems problematic since if `bsecond' is set, it ignores its arguments.
This can look confusing at first glance, but it's actually a rather nice algorithm.
I'm not doubting the algorithm, just an implementation detail (see my comment above). It seems fixable with moderate effort.
The code that I started with, the C++ abyss from which I rescued some civilized Pascal, claims to be faster than what I wrote. It uses the C and C++ directives "inline" and "register", and some arcane C constructions that are better avoided in clear code.
Yes, GPC provides them as attributes, but with `-O3' for `inline' and `-O' for `register', it also does it automatically, so they're recommended only in very special cases.
I found that the speedup resulting from going from -O2 to -O3 was quite significant. Presumably that is due to the inlining. I'm not familiar with the -O attribute: can you please say more about how and where to use it?
Basically, without `-O', the generated code is very bad. You can do without it if compile times are important and/or runtime is neglectible. Otherwise at least `-O' is recommended. (In particular, timings without `-O' are pointless.)
`-O2' generally improves the generated code significantly, but also compile-time quite significantly. Many packages use this as the default.
The only difference in `-O3' (unless that's changed in newer backends) is automatic inling. `-O2' should work just as well if you declare the critical routines with `attribute (inline)' (as I said, I don't recommend this in general, but for certain cases, such as here, I'd accept it).
Frank