On Thu, 15 May 2003, Emil Jerabek wrote:
On Tue, May 13, 2003 at 12:47:23PM +0200, Mirsad Todorovac wrote:
Hi, Frank, others!
Here is the first test I wrote using GPC manual's instructions for writting randomized tests.
It concerns sets, since they have been a place where bugs appeared in the past - even though Frank's implementation seems invariant to number of bits ``MedCard'' type could have sometimes in the future.
BTW, there seems to be a need for something called a "random set" of data, pehaps even with given ``Monte Carlo'' function, to get a different distibution than uniform one.
The most plain random set could be the one with its MedCards initialized with random numbers with probably 0.5 probability for each bit being set.
Now I looked into the implementation, and it does something quite different:
for n := 1 to MaxN do begin A := []; for j := 0 to n do Include (A, Random(MaxN));
This makes the probability of a bit being set 1/e=0.3678..., not 0.5 (moreover the bits are far from being independent).
Thank you, Emil, for pointing out this -- the probability is not 0.5, except perhaps in case where n = MaxN/2. It's approximatelly p = n/MaxN.
Bits are as independent as results from Random(MaxN) are, IMHO: if Random(MaxN) gives always a result in range 0..MaxN-1, the probability of each bit being set should be slightly lower than n/MaxN.
Lower because a bit can be set twice, or more times, due to the randomized nature of the test, which in result gives less than n bits set out of total MaxN.
Other than this case (probability for overlap rises with set density), the distribution of set elements if expected to be rather uniform one.
Also, the diagnostics on failure won't work:
if Failed then begin WriteLn ('failed (', RandomSeed, ') ', 'After Include (setA, ', j1, ') ', j1, ' NOT IN!!!'); WriteLn ('dumping set [0..', MaxN,']:'); for j := 0 to MaxN div SetWordSize do WriteLn (j); Halt end
This just prints the same sequence of numbers every time, something like `Bar[j]' is missing here, I guess.
You're absolutely right, Emil -- Bar[j] is exactly what was supposed to be here.
Considering use of Bar[j], here is a method of getting a fair 0.5 probability of each bit being set:
m := MaxN div SetWordSize - 1; for j := 0 to m do Bar[j] := Random (High (MedCard));
However, this also relies on Random () providing pseudo-random numbers of good quality, with 0.5 probability of each bit being set. True, it's also much faster than initializing bit by bit via Include (SetA, Random(MaxN)).
We could read from some real random source if we need truly random numbers, such as required in security, but I think Random() function suffices here, wouldn't you agree?
-----
1. The problem is, however, how to efficiently generate a uniform RandomSet() with p <> 0.5 with Bar[j in 0..m] := Random(High(MedCard))!
-- we need here a completely new type of pseudo-random function here!
2. Another problem is with Include (SetA, Random(MaxN)), i in 1..n method that the more dense SetA is, the more set bits Include()-ed will overlap.
The probability that we bump into already set bit is i/MaxN, which can grow near to 1.0, which means, as you see, that total probability may still be uniform, but it will drop bellow desired p = n / maxN
Please help if you have a suggestion how to generate these (uniform random sets with p (i in SetA) = const), please come forth.
Mirsad
P.S. thank you for the set dump fix.