[pycrypto] Issue with the new random.choice() unit test

Tue Jan 4 17:24:14 CST 2011

Well, this problem exists in all of your random tests, although it is indeed
unlikely that one will hit the others.
I personally wouldn't worry too much about the exact distribution in most of
the random tests. The functions they exercise all use random.randrange, which
in turn uses getrandbits.
So if you want to go for *real* tests, those functions should be the focus.
Although proper tests for the other functions would of course also be good,
I personally would settle for simple sanity checks (i.e., just increase the
seq size, or wrap that part in a try-except block in which you repeat the test;
if it fails again, raise an exception).
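
A minimal sketch of the "repeat the test once" idea, written as a plain loop
rather than a try-except block. The helper name choice_sanity_check and its
parameters are made up for illustration, and the stdlib random module stands
in for Crypto.Random.random:

```python
import random  # assumption: stand-in for Crypto.Random.random


def choice_sanity_check(rng=random, seq_size=500, retries=1):
    """Return True if two choices differ in at least one attempt.

    With an ideal generator a single attempt collides with probability
    1/seq_size, so all (1 + retries) attempts collide with probability
    (1/seq_size) ** (1 + retries).
    """
    seq = list(range(seq_size))  # a fixed sequence is enough here
    for _ in range(1 + retries):
        x = rng.choice(seq)
        y = rng.choice(seq)
        if x != y:
            return True
    return False
```

With seq_size=500 and one retry, a false failure would be expected about once
in 250,000 runs, instead of once in 500.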

By the way:
  * You don't need to create the seq randomly, or do you? Wouldn't range(500) suffice?
  * I find the way you fixed the shuffle function rather odd (good catch, though).
    My suggestion would be
	for i in xrange(len(x)):
		p = self.randrange(len(items))
		x[i] = items[p]
		del items[p]
    or even shorter
	for i in xrange(len(x)):
		x[i] = items.pop(self.randrange(len(items)))
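
For reference, the shorter variant above could look like this as a standalone
function. This is only a sketch: the name shuffle_by_pop is hypothetical, the
working copy "items" is assumed to be a list copy of x, and the stdlib random
module stands in for the generator object (self in the snippets above):

```python
import random  # assumption: stand-in for the generator that provides randrange


def shuffle_by_pop(x, rng=random):
    """Shuffle the list x in place by drawing elements without replacement."""
    items = list(x)  # working copy that shrinks as elements are drawn
    for i in range(len(x)):
        # Pick a random remaining element, remove it, and store it at slot i.
        x[i] = items.pop(rng.randrange(len(items)))
```

Note that pop() at an arbitrary index is linear in the list length, so this
takes quadratic time overall. That should not matter for test-sized inputs.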

sincerely yours,
//Lorenz

On 01/04/2011 09:40 PM, Thorsten Behrens wrote:
> I have introduced a unit test in test_random.py that has too high a rate
> of failure. Specifically, this:
>
>           # Test choice
>           seq = []
>           for i in range(500): # seed the sequence
>               seq[i:] = [random.getrandbits(32)]
>           x = random.choice(seq)
>           y = random.choice(seq)
>           self.assertNotEqual(x, y)
>
> just produced a FAIL:
>
> FAIL: runTest (Crypto.SelfTest.Random.test_random.SimpleTest)
> Crypto.Random.new()
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>     File "build\lib.win-amd64-2.7\Crypto\SelfTest\Random\test_random.py",
> line 103, in runTest
>       self.assertNotEqual(x, y)
> AssertionError: 1793595220L == 1793595220L
>
>
> Well darn. I guess saying "hey it's a 1 in 500 chance, it'll never
> fail!" is indeed naive. What would be a less naive test, then? I am
> thinking seeding a much smaller seq, and then running choice many times,
> counting collisions each time, and getting some form of expected value
> with an expected precision from that. It's been very long since I've
> done stochastic stuff, however. Before I screw this up further: Concrete
> suggestions on how to fix this unit test?
>
> Thanks!
> Thorsten
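
The collision-counting idea from the quoted message can be sketched as
follows. Everything here is an assumption made for illustration: the name
collision_rate_ok, the sequence size k, the number of trials, and the
tolerance in standard deviations. The stdlib random module stands in for
Crypto.Random.random. Each pair of choices collides with probability 1/k, so
the collision count follows a binomial distribution with mean trials/k:

```python
import math
import random  # assumption: stand-in for Crypto.Random.random


def collision_rate_ok(rng=random, k=10, trials=10000, sigmas=5.0):
    """Check that the number of collisions is close to its expected value."""
    seq = list(range(k))
    collisions = 0
    for _ in range(trials):
        if rng.choice(seq) == rng.choice(seq):
            collisions += 1
    p = 1.0 / k                                   # collision chance per pair
    expected = trials * p                         # binomial mean
    stddev = math.sqrt(trials * p * (1.0 - p))    # binomial standard deviation
    return abs(collisions - expected) <= sigmas * stddev
```

With a tolerance of five standard deviations, a correct generator should fail
this check less than once in a million runs (normal approximation), while a
badly skewed choice() would be far outside the band.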