randtests¶

Statistical tests for randomness

This subpackage comprises of implementations of statistical tests laid out in a comprehensive paper from NIST [1] in regards to assessing (pseudo-)random number generators.

Notes

A copy of the NIST paper can be found at the root of the coinflip repository as SP800-22.pdf.

The test themselves are defined in section 2., “Random Number Generation Tests”, p. 23-62. Further detail of the tests are provided in section 3. “Technical Descriptions of Tests”, p. 63-87.

These tests were implemented in a complimentary program sts, which can be downloaded from the NIST website.

Note that the paper assumes a great amount of familiarity with certain concepts in statistics. It also uses some constants and algorithms without any explaination. Part of the purpose for coinflip is to “describe” the NIST tests more wholly than in the paper itself, whilst also reducing the noise of some non-idiomatic programming conventions used in sts.

[1]

National Institute of Standards and Technology <Andrew Rukhin, Juan Soto, James Nechvatal, Miles Smid, Elaine Barker, Stefan Leigh, Mark Levenson, Mark Vangel, David Banks, Alan Heckert, James Dray, San Vo, Lawrence E Bassham II>, “A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications”, Special Publication 800-22 Revision 1a, April 2010.

coinflip.randtests.monobit(sequence)[source]¶

Proportion of values is compared to expected 1:1 ratio

The difference between the frequency of the two values is found, and referenced to a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested
Returns:	MonobitTestResult – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.

coinflip.randtests.frequency_within_block(sequence, candidate=None, blocksize=8)[source]¶

Proportion of values per block is compared to expected 1:1 ratio

The difference between the frequency of the two values in each block is found, and referenced to a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested candidate (Value present in given sequence) – The value which is counted in each block blocksize (int) – Size of the blocks that partition the given series
Returns:	FrequencyWithinBlockTestResult – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.

coinflip.randtests.runs(sequence, candidate=None)[source]¶

Actual number of runs is compared to expected result

The number of runs (uninterrupted sequence of the same value) is found, and referenced to a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested candidate (Value present in given sequence) – The value which is counted in each block
Returns:	TestResult – Dataclass that contains the test’s statistic and p-value

coinflip.randtests.longest_runs(sequence, candidate=None)[source]¶

Longest runs per block is compared to expected result

The longest number of runs (uninterrupted sequence of the same value) per block is found, and referenced to a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested candidate (Value present in given sequence) – The value which is counted in each block
Returns:	TestResult – Dataclass that contains the test’s statistic and p-value

coinflip.randtests.binary_matrix_rank(sequence, candidate=None, matrix_dimen: Tuple[int, int] = None)[source]¶

Independence of neighbouring sequences is compared to expected result

Independence is determined by the matrix rank of a subsequence, where it is split into multiple rows to form a matrix. The counts of different rank bins is referenced to a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested candidate (Value present in given sequence) – The value which is counted in each block matrix_dimen (Tuple[int, int]) – Number of rows and columns in each matrix
Returns:	BinaryMatrixRankTestResult – Dataclass that contains the test’s statistic and p-value

coinflip.randtests.spectral(sequence, candidate=None)[source]¶

Potency of periodic features in sequence is compared to expected result

The binary values are treated as the peaks and troughs respectively of a signal, which is applied a Fourier transform so as to find constituent periodic features. The strength of these features is referenced to the expected potent periodic features present in a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested candidate (Value present in given sequence) – The value which is considered the peak in oscillations
Returns:	TestResult – Dataclass that contains the test’s statistic and p-value
Raises:	`NonBinaryTruncatedSequenceError` – When odd-lengthed sequence is truncated there is only one distinct value present

coinflip.randtests.non_overlapping_template_matching(sequence, template: List = None, nblocks=None)[source]¶

Matches of template per block is compared to expected result

The sequence is split into blocks, where the number of non-overlapping matches to the template in each block is found. This is referenced to the expected mean and variance in matches of a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested template (List, optional) – Template to match with the sequence, randomly generated if not provided. nblocks (int) – Number of blocks to split sequence into
Returns:	TestResult – Dataclass that contains the test’s statistic and p-value.
Raises:	`TemplateContainsElementsNotInSequenceError` – If template contains values not present in sequence

coinflip.randtests.overlapping_template_matching(sequence, template: List = None, nblocks=None, df=5)[source]¶

Overlapping matches of template per block is compared to expected result

The sequence is split into nblocks blocks, where the number of overlapping matches to the template in each block is found. This is referenced to the expected mean and variance in matches of a hypothetically truly random RNG.

Deprecated since version 0: df will be removed once I figure out the correct value, as I don’t quite understand what NIST wants (or if they’re even correct!)

Parameters:	sequence (array-like) – Output of the RNG being tested template (List, optional) – Template to match with the sequence, randomly generated if not provided. nblocks (int) – Number of blocks to split sequence into df (int, default 5) – Degrees of freedom to use in p-value calculation
Returns:	TestResult – Dataclass that contains the test’s statistic and p-value.
Raises:	`TemplateContainsElementsNotInSequenceError` – If template contains values not present in sequence

coinflip.randtests.maurers_universal(sequence, blocksize=None, init_nblocks=None)[source]¶

Distance between patterns is compared to expected result

Unique permutations in an initial sequence are identified, and the distances of aforementioned permutations in a remaining sequence are accumulated. The normalised value for the accumulated distances is then compared to a hypothetically truly random RNG.

Parameters:	sequence (array-like) – Output of the RNG being tested blocksize (int) – Size of the blocks that form a permutation init_nblocks (int) – Number of initial blocks to identify permutations
Returns:	TestResult – Dataclass that contains the test’s statistic and p-value

Warning

This section is for the private API of coinflip.randtests, intended for developers of randomness tests. If you wish to simply run randomness tests on your data (e.g. in a Juypter Notebook session), please use the public API as specified above.