randtests¶
Statistical tests for randomness
This sub-package comprises of implementations of statistical tests laid out in a comprehensive paper from NIST 1 in regards to assessing (pseudo-)random number generators.
Notes
A copy of the NIST paper can be found at the root of the coinflip
repository as
SP800-22.pdf.
The test themselves are defined in section 2., “Random Number Generation Tests”, p. 23-62. Further detail of the tests are provided in section 3. “Technical Descriptions of Tests”, p. 63-87.
These tests were implemented in a complimentary program sts
, which can be
downloaded from the NIST website.
Note that the paper assumes a great amount of familiarity with certain
concepts in statistics. It also uses some constants and algorithms without any
explanations. Part of the purpose for coinflip
is to “describe” the NIST tests
more wholly than in the paper itself, whilst also reducing the noise of some
non-idiomatic programming conventions used in sts
.
- 1
National Institute of Standards and Technology <Andrew Rukhin, Juan Soto, James Nechvatal, Miles Smid, Elaine Barker, Stefan Leigh, Mark Levenson, Mark Vangel, David Banks, Alan Heckert, James Dray, San Vo, Lawrence E Bassham II>, “A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications”, Special Publication 800-22 Revision 1a, April 2010.
-
coinflip.randtests.
monobit
(sequence)[source]¶ Proportion of values is compared to expected 1:1 ratio
The difference between the occurrences of the two values in the sequence is found, and referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
- Returns
result (
MonobitTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
frequency_within_block
(sequence, blocksize: Optional[int] = None)[source]¶ Proportion of values per block is compared to expected 1:1 ratio
The sequence is split into blocks, and the difference between the occurrences of the two values in the sequence is found. This is referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
blocksize (
int
) – Size of the blocks that partition the given sequence
- Returns
result (
FrequencyWithinBlockTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
runs
(sequence)[source]¶ Number of runs is compared to expected result
The number of runs (uninterrupted sequence of the same value) is found, and referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
- Returns
result (
RunsTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
longest_runs
(sequence)[source]¶ Longest runs per block is compared to expected result
The sequence is split into blocks, where the longest number of runs (uninterrupted sequence of the same value) in each block is found. This is referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
- Returns
result (
LongestRunsTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
binary_matrix_rank
(sequence, matrix_dimen: Optional[Tuple[int, int]] = None)[source]¶ Independence of neighbouring sequences is compared to expected result
The sequence is split into matrices, where the rank of each matrix is found to determine independence. The counts of different rank bins is referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
matrix_dimen (
Tuple[int, int]
) – Number of rows and columns in each matrix
- Returns
result (
BinaryMatrixRankTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
spectral
(sequence)[source]¶ Potency of periodic features in sequence is compared to expected result
The sequence is treated as a signal, which is applied a Fourier transform so as to find constituent periodic features. The strength of these features is referenced to the expected periodic features present in a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
- Returns
result (
SpectralTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.- Raises
NonBinaryTruncatedSequenceError – When odd-lengthed sequence is truncated there is only one distinct value present
-
coinflip.randtests.
non_overlapping_template_matching
(sequence, template_size: Optional[int] = None, blocksize: Optional[int] = None)[source]¶ Matches to template per block is compared to expected result
The sequence is split into blocks, where the number of non-overlapping pattern matches to the template in each block is found. This is referenced to the expected mean and variance in matches of a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
template_size (
int
) – Size of all the templates to be generatedblocksize (
int
) – Size of the blocks that partition the given sequence
- Returns
results (
NonOverlappingTemplateMatchingMultiTestResult
) – Dataclass that contains the statistics and p-values of all sub-tests, as well as other relevant information gathered.
-
coinflip.randtests.
overlapping_template_matching
(sequence, template_size: Optional[int] = None, blocksize: Optional[int] = None, matches_ceil: Optional[int] = None)[source]¶ Overlapping matches to template per block is compared to expected result
The sequence is split into blocks, where the number of overlapping patterns matches to the template in each block is found. This is referenced to the expected mean and variance in matches of a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
template_size (
int
) – Size of the template to be generatedblocksize (
int
) – Size of the blocks that partition the given sequencematches_ceil (
int
) – Group matches of this value and higher as one single tally
- Returns
result (
OverlappingTemplateMatchingTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
maurers_universal
(sequence, blocksize: Optional[int] = None, init_nblocks: Optional[int] = None)[source]¶ Distance between patterns is compared to expected result
The distinct patterns in an initial sequence are identified, and the distances between subsequent occurences of these patterns in a remaining sequence are accumulated. The normalised value for the accumulated distances is referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
blocksize (
int
) – Size of the patternsinit_nblocks (
int
) – Number of initial blocks to identify patterns
- Returns
result (
UniversalTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
linear_complexity
(sequence, blocksize: Optional[int] = None)[source]¶ LSFRs of blocks is compared to expected length
The sequence is split into blocks, where the shortest linear feedback shift register is found for each block. The difference of each LSFR’s length to the expected mean length is binned, and is referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
blocksize (
int
) – Size of the blocks
- Returns
results (
LinearComplexityTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
serial
(sequence, blocksize: Optional[int] = None)[source]¶ Proportion of all overlapping patterns is compared to expected uniformity
The number of overlapping pattern matches for each distinct pattern is found. This is referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
blocksize (
int
) – Size of the patterns
- Returns
results (
SerialMultiTestResult
) – Dataclass that contains the statistics and p-values of all sub-tests, as well as other relevant information gathered.
-
coinflip.randtests.
approximate_entropy
(sequence, blocksize: Optional[int] = None)[source]¶ Approximate entropy of sequence is compared to expected result
The approximate entropy of the sequence is found and referenced to a truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
blocksize (
int
) – Size of the patterns
- Returns
results (
ApproximateEntropyTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
cusum
(sequence, reverse: bool = False)[source]¶ Furthest detour in a randomn walk is compared to expected result
The sequence is treated as a random walk, where the furthest detour from the axis is identified and referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
reverse (
bool
) – Cumulate sums from the end of the sequence first.
- Returns
results (
CusumTestResult
) – Dataclass that contains the test’s statistic and p-value as well as other relevant information gathered.
-
coinflip.randtests.
random_excursions
(sequence)[source]¶ Frequency of states per cycle in a random walk is compared to expected results
The sequence is treated as a random walk, where the frequency of states -4 to 4 in each cycle is found. This is referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
- Returns
results (
RandomExcursionsMultiTestResult
) – Dataclass that contains the statistics and p-values of all sub-tests, as well as other relevant information gathered.
-
coinflip.randtests.
random_excursions_variant
(sequence)[source]¶ Proportion of states in a random walk is compared to expected results
The sequence is treated as a random walk, where the occurrences of states -9 to 9 is founded and referenced to a hypothetically truly random sequence.
- Parameters
sequence (array-like with two distinct values) – Sequence containing 2 distinct elements
- Returns
results (
RandomExcursionsVariantMultiTestResult
) – Dataclass that contains the statistics and p-values of all sub-tests, as well as other relevant information gathered.
-
exception
coinflip.randtests.exceptions.
TestError
¶ Base class for test-related errors
-
exception
coinflip.randtests.exceptions.
TestNotImplementedError
¶ Error if test is not implemented to handle valid parameters
-
exception
coinflip.randtests.exceptions.
TestInputError
¶ Error if test cannot handle (invalid) parameters
-
exception
coinflip.randtests.exceptions.
NonBinarySequenceError
¶ Error if sequence does not contain only 2 distinct values