Hoeffding’s Inequality

If we pick 20 random numbers from [0..1] using a random number generator with any distribution, e.g. with Python's numpy.random.random(20), then the probability that the sample average (of the 20 numbers) deviates by more than 1/4 from the theoretical mean is at most about 0.16. In our case t=1/4, n=20, if we […]
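The 0.16 figure can be checked numerically. The excerpt is cut off before it states the formula, so the sketch below assumes the standard two-sided Hoeffding bound for independent values in [a, b], namely 2·exp(−2nt²/(b−a)²). The function names and the uniform generator used for the simulation are illustrative choices, not taken from the post.

```python
import math
import random


def hoeffding_bound(n, t, a=0.0, b=1.0):
    """Two-sided Hoeffding bound on P(|sample mean - true mean| >= t)
    for n independent values that all lie in [a, b]."""
    return 2.0 * math.exp(-2.0 * n * t * t / (b - a) ** 2)


def empirical_deviation_rate(n, t, trials=20_000, seed=0):
    """Fraction of simulated samples whose mean deviates from 0.5 by more than t.
    Assumption: the values are uniform on [0, 1), so the true mean is 0.5."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - 0.5) > t:
            hits += 1
    return hits / trials


bound = hoeffding_bound(n=20, t=0.25)
observed = empirical_deviation_rate(n=20, t=0.25)
print(f"Hoeffding bound: {bound:.4f}")
print(f"Observed rate for uniform samples: {observed:.4f}")
```

For the uniform distribution the observed rate is far below the bound. That is expected, because the bound has to hold for every distribution on [0, 1], not just the uniform one.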

Generator of True Random Numbers Based on Quantum Fluctuations of Vacuum

All random numbers generated by software are in essence pseudo-random, because a deterministic machine can’t produce truly random numbers. At the physics laboratory behind https://qrng.anu.edu.au/, quantum fluctuations of the vacuum are continuously measured to supply true random numbers to anyone in the world. To get true random numbers from Python you need to install the ‘quantumrandom’ module: […]

What’s the probability for two randomly selected integers to be relatively prime?

What’s the probability for two randomly selected integers to be relatively prime (coprime)? The answer is 6/π² ≈ 0.6; a boring proof can be found here. Practically speaking, if you choose two integers at random, the probability that they are relatively prime is about 60%. Slava: 23+ years’ programming and theoretical experience in the computer […]
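The 60% figure for coprime integers is easy to check empirically. The sketch below is a Monte Carlo estimate, assuming that "randomly selected" means uniform over 1..upper for a large upper limit (the 6/π² value is the limit as that range grows). The function name, the range, and the seed are illustrative assumptions.

```python
import math
import random


def estimate_coprime_probability(trials=100_000, upper=10**6, seed=0):
    """Estimate the probability that two integers drawn uniformly
    from 1..upper are coprime (gcd equal to 1)."""
    rng = random.Random(seed)
    coprime = 0
    for _ in range(trials):
        a = rng.randint(1, upper)
        b = rng.randint(1, upper)
        if math.gcd(a, b) == 1:
            coprime += 1
    return coprime / trials


theoretical = 6 / math.pi ** 2
print(f"Theoretical 6/pi^2: {theoretical:.4f}")
print(f"Monte Carlo estimate: {estimate_coprime_probability():.4f}")
```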

Exercise in Combinatorics for Team Leader

Let’s imagine you are a team leader (for me it’s really difficult, I’ve never been a manager). Your team consists of 5 different engineers (if you are a good leader, you must see that the engineers are mutually different, otherwise you ought to resign). You have just received 10 different tasks to accomplish in a […]

Detection of Anomalies (Outliers) in Data

Content: Z-score Method; Tukey’s Method; Outlier Ratio; Appendix: Coefficient of Variation (CV).

Z-score method: If the data is normally distributed, we can use the z-score method (or the three-sigma rule) to detect outliers (values more than 3 standard deviations from the mean in either direction are considered anomalies). The following Python snippet demonstrates how to detect anomalies […]
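The post's own snippet is cut off in this excerpt. As a stand-in, here is a minimal sketch of the three-sigma rule using only the standard library. The function name, the use of the population standard deviation, and the sample data are assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the values whose z-score exceeds the threshold in absolute value.
    Assumption: the data is roughly normal, as the z-score method requires."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical measurements: 19 values near 10 and one obvious anomaly.
data = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
        11, 9, 10, 12, 10, 9, 11, 10, 10, 100]
print(zscore_outliers(data))
```

Note that in very small samples no value can reach a z-score of 3, because the outlier itself inflates the standard deviation. The rule only becomes useful with enough data points.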

Software Testing and Statistics (simplified cases from real practice)

1) We often test programs that fail or get stuck occasionally (e.g. due to a race condition). Let’s suppose that after a specific fix the failure rate of the program has dropped. How can we be sure, with high confidence, that the fix indeed reduces the failure rate and that the improvement is not the result of […]

Use Case from Video Compression: Significance Testing of Pearson Correlation Coefficient

The purpose of this paper is to show that a correlation between parameters can sometimes arise by chance rather than from causality. Definition: the Pearson correlation coefficient (usually called simply the ‘correlation coefficient’) measures the linear relationship between two variables. It is worth noting that the Pearson correlation coefficient reflects the extent of a linear relationship (not an exponential or quadratic one). The range of the […]
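To illustrate the point that the coefficient captures only linear dependence, the sketch below computes the Pearson coefficient from its textbook definition (covariance divided by the product of the standard deviations). The data is made up for the example: a perfectly linear relation gives 1, while a symmetric quadratic relation gives 0 even though y is fully determined by x.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]   # y = 2x + 1
quadratic = [v * v for v in x]    # y = x^2, symmetric around 0

print(pearson(x, linear))
print(pearson(x, quadratic))
```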

How Many Calls of Random Generator to Get Sequence of Different Numbers?

Problem: We have a random generator producing uniformly distributed numbers in the range 1..10. How many times, on average, must one call the random generator to get a sequence of three mutually different numbers? Solution [The problem is similar to the well-known Coupon Collector’s Problem] Let’s suppose that the random generator is a sampling […]
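The solution is truncated here, so the following is only a sketch under one reading of the problem, suggested by the coupon-collector hint: keep calling the generator until three distinct values have been seen. Under that assumption, when i distinct values have already been seen, the wait for a new one is geometric with mean 10/(10 − i), and the expected number of calls is the sum of these means. The function names are illustrative.

```python
import random
from fractions import Fraction


def expected_calls(range_size, distinct_needed):
    """Expected number of uniform draws from 1..range_size until
    distinct_needed different values have appeared (coupon-collector style)."""
    return sum(Fraction(range_size, range_size - i) for i in range(distinct_needed))


def simulate_calls(range_size, distinct_needed, trials=100_000, seed=0):
    """Monte Carlo check of the same quantity."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        while len(seen) < distinct_needed:
            seen.add(rng.randint(1, range_size))
            total += 1
    return total / trials


exact = expected_calls(10, 3)  # 1 + 10/9 + 10/8
print(f"Exact: {exact} = {float(exact):.4f}")
print(f"Simulated: {simulate_calls(10, 3):.4f}")
```

If the problem instead asks for three consecutive outputs that are mutually different, the answer differs and this sketch does not apply.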

SW Testing and Population Statistics

We often test programs that fail or get stuck (e.g. due to race conditions). Let’s suppose that after a fix the failure rate is reduced. How can we be sure, with high confidence, that the fix indeed reduces the failure rate? This question is equivalent to a very famous statistics problem: coin tossing. […]
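The excerpt stops before the analysis, so what follows is one possible way to make the coin-tossing analogy concrete, not necessarily the post's own method. Treat each test run as a toss of a biased coin that "fails" with the pre-fix probability p0. If the fix changed nothing, the chance of seeing k or fewer failures in n runs is a binomial tail probability, and a small value is evidence that the fix helped. The names p0, n and k and the numbers in the example are hypothetical.

```python
from math import comb


def prob_at_most_k_failures(k, n, p0):
    """P(X <= k) for X ~ Binomial(n, p0): the chance of observing k or fewer
    failures in n independent runs if the failure rate were still p0."""
    return sum(comb(n, i) * p0 ** i * (1 - p0) ** (n - i) for i in range(k + 1))


# Hypothetical scenario: the program failed in 10% of runs before the fix,
# and 50 runs after the fix produced no failures at all.
p_value = prob_at_most_k_failures(k=0, n=50, p0=0.10)
print(f"Chance of this result if the fix did nothing: {p_value:.4f}")
```

This assumes the runs are independent and that the pre-fix failure rate is known reasonably well. If p0 is itself only an estimate from a limited number of runs, a two-sample comparison would be more appropriate.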

Probability Theory and SW Verification

Use cases from real problems in simplified form. You are going to verify a software product by random tests, i.e. you activate a random number generator producing pseudo-random numbers (“pseudo” is not a typo, a deterministic machine can’t produce truly random numbers), say in the range [1..100]. You pick the first 10 random numbers […]