Hoeffding’s Inequality
If we pick 20 random numbers from the interval [0..1] by means of a random number generator with any distribution, e.g. with Python's numpy.random.random(20), then the probability that the average of the sample (20 numbers) deviates by more than 1/4 from the theoretical mean is at most about 0.16. In our case t=1/4, n=20, if we […]
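As a rough check of the numbers quoted above, the sketch below evaluates the two-sided Hoeffding bound 2·exp(-2nt²) for n=20, t=1/4 and compares it with a simulation. The function names, the trial count, the seed and the choice of a uniform generator (theoretical mean 0.5) are assumptions made for illustration; they are not taken from the original post.

```python
import math

import numpy as np


def hoeffding_bound(n, t):
    """Two-sided Hoeffding bound for the mean of n independent values in [0, 1]."""
    return 2.0 * math.exp(-2.0 * n * t * t)


def empirical_deviation_rate(n, t, trials=100_000, seed=0):
    """Fraction of simulated samples whose mean deviates from 0.5 by more than t.

    Assumption: numbers are drawn uniformly from [0, 1), so the true mean is 0.5.
    """
    rng = np.random.default_rng(seed)
    samples = rng.random((trials, n))
    deviations = np.abs(samples.mean(axis=1) - 0.5)
    return float(np.mean(deviations > t))


bound = hoeffding_bound(20, 0.25)
rate = empirical_deviation_rate(20, 0.25)
print(f"Hoeffding bound: {bound:.4f}")
print(f"Empirical rate:  {rate:.6f}")
```

For a uniform generator the observed rate is far below the bound, which is expected: Hoeffding's inequality has to hold for every distribution on [0, 1], so it is conservative for any particular one.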
Generator of True Random Numbers Based on Quantum Fluctuations of Vacuum
All random numbers generated by software are in essence pseudo-random, because a deterministic machine can’t produce truly random numbers. In the physics laboratory at https://qrng.anu.edu.au/ , quantum fluctuations of the vacuum are measured continuously to supply true random numbers to anyone in the world. To get true random numbers from Python you need to install the ‘quantumrandom’ module: […]
What’s the probability for two randomly selected integers to be relatively prime?
What’s the probability for two randomly selected integers to be relatively prime (coprime)? The answer is 6/π² ≈ 0.6 (the proof is rather boring). Practically speaking, if you choose two integers at random, the probability that they are relatively prime is about 60%. Slava: 23+ years’ programming and theoretical experience in the computer […]
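The 6/π² figure can be checked numerically. The sketch below is a Monte Carlo estimate under stated assumptions: "randomly selected" is taken to mean uniform on 1..10⁶, and the function name, trial count and seed are hypothetical choices for illustration.

```python
import math
import random


def coprime_fraction(trials=200_000, upper=10**6, seed=0):
    """Estimate the share of coprime pairs among integers drawn uniformly from 1..upper."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = rng.randint(1, upper)
        b = rng.randint(1, upper)
        if math.gcd(a, b) == 1:  # coprime means the greatest common divisor is 1
            hits += 1
    return hits / trials


estimate = coprime_fraction()
limit = 6 / math.pi ** 2
print(f"Monte Carlo estimate: {estimate:.4f}")
print(f"6 / pi^2:             {limit:.4f}")
```

The two values should agree to about two decimal places; the limit 6/π² strictly applies as the upper bound of the range grows without limit.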
Exercise in Combinatorics for Team Leader
Let’s imagine you are a team leader (for me that’s really difficult, I’ve never been a manager). Your team consists of 5 different engineers (if you are a good leader, you must see that the engineers are mutually different, otherwise you ought to resign). You have just received 10 different tasks to accomplish in a […]
Detection of Anomalies (Outliers) in Data
Content: Z-score Method; Tukey’s Method; Outlier Ratio; Appendix: Coefficient of Variation (CV). Z-score method: If the data are normally distributed, we can use the z-score method (the three-sigma rule) to detect outliers: values more than 3 standard deviations from the mean, in either direction, are considered anomalies. The following Python snippet demonstrates how to detect anomalies […]
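A minimal sketch of the three-sigma rule described above, assuming NumPy and synthetic data. It is an illustration only, not the snippet from the original post; the function name `zscore_outliers`, the threshold parameter and the test data are hypothetical.

```python
import numpy as np


def zscore_outliers(values, threshold=3.0):
    """Return a boolean mask marking values whose |z-score| exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return np.zeros(values.shape, dtype=bool)
    z = (values - values.mean()) / std
    return np.abs(z) > threshold


# Synthetic data (an assumption): 1000 normal values plus two injected anomalies.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(50.0, 5.0, 1000), [120.0, -30.0]])
mask = zscore_outliers(data)
print("Detected outliers:", data[mask])
```

Note that the mean and standard deviation are themselves inflated by the outliers, so the rule works best on reasonably large samples with few anomalies.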
Software Testing and Statistics (simplified cases from real practice)
1) We often test programs that fail or get stuck occasionally (e.g. due to a race condition). Let’s suppose that after a specific fix the failure rate of the program has been reduced. How can we be highly confident that the fix really reduced the failure rate and that the improvement is not the result of […]
Use Case from Video Compression: Significance Testing of Pearson Correlation Coefficient
The purpose of this paper is to show that a correlation between parameters can sometimes arise by chance rather than from causality. Definition: the Pearson correlation coefficient (usually called simply the ‘correlation coefficient’) measures the linear relationship between two variables. It’s worth noting that the Pearson correlation coefficient reflects the extent of a linear relationship (not an exponential or quadratic one). The range of the […]
How Many Calls of Random Generator to Get Sequence of Different Numbers?
Problem: We have a random generator producing uniformly distributed numbers in the range 1..10. How many times, on average, must one call the random generator to get a sequence of three mutually different numbers? Solution [The problem is similar to the well-known Coupon Collector’s Problem] Let’s suppose that the random generator is a sampling […]
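One way to read the problem, in the spirit of the Coupon Collector’s Problem, is: keep calling the generator until three distinct values have been seen. Under that assumed interpretation (the original solution is truncated, so this may differ from the author’s), the expected number of calls is 10/10 + 10/9 + 10/8 = 121/36 ≈ 3.36. The sketch below computes this exactly and checks it by simulation; the function names, trial count and seed are hypothetical.

```python
import random
from fractions import Fraction


def expected_calls(k, n):
    """Expected draws (with replacement) from 1..n until k distinct values are seen."""
    # After i distinct values, a new one appears with probability (n - i) / n,
    # so the waiting time for the next new value averages n / (n - i) draws.
    return sum(Fraction(n, n - i) for i in range(k))


def simulate_calls(k, n, trials=100_000, seed=0):
    """Average number of draws observed over many simulated runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        while len(seen) < k:
            seen.add(rng.randint(1, n))
            total += 1
    return total / trials


exact = expected_calls(3, 10)
simulated = simulate_calls(3, 10)
print(f"Exact expectation: {exact} = {float(exact):.4f}")
print(f"Simulated average: {simulated:.4f}")
```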
SW Testing and Population Statistics
We often test programs that fail or get stuck (e.g. due to race conditions). Let’s suppose that after a fix the failure rate is reduced. How can we be highly confident that the fix really reduces the failure rate? This question is equivalent to a very famous statistics problem: coin tossing. […]
Probability Theory and SW Verification
Use cases from real problems in simplified form. You are going to verify a software product by random tests, i.e. you activate a random number generator producing pseudo-random numbers (“pseudo” is not a typo, a deterministic machine can’t produce truly random numbers), say in the range [1..100]. You pick the first 10 random numbers […]