Related yet different, here's how... A quick note on the "preliminary terrors" of notation: e is Euler's number; you'll find e on your calculator, or as the EXP() function in Excel. The parameter is conventionally written as λ (pronounced "lambda"). Poisson: the number of events that occur in an interval of time. Exponential: the time taken between 2... Continue Reading →
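To make the contrast concrete, here is a minimal sketch using only the standard library. The rate λ = 3 events per interval is a made-up value for illustration, and the function names are hypothetical rather than taken from the post.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events in one interval, given average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def exponential_cdf(t, lam):
    """Probability that the wait until the next event is at most t intervals."""
    return 1.0 - math.exp(-lam * t)

lam = 3.0  # hypothetical average number of events per interval

p_two_events = poisson_pmf(2, lam)       # exactly 2 events in one interval
p_wait_half = exponential_cdf(0.5, lam)  # next event arrives within half an interval

print(f"P(2 events in one interval) = {p_two_events:.4f}")
print(f"P(wait <= 0.5 intervals)    = {p_wait_half:.4f}")
```

The two views agree with each other: the chance of seeing no events in half an interval (Poisson with rate λ × 0.5) equals the chance that the waiting time exceeds half an interval (one minus the exponential CDF).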
Law of total probability – worked examples
According to Wikipedia the law of total probability "expresses the total probability of an outcome which can be realized via several distinct events". We can also think of this as the marginal probability: irrespective of what road we took to get to this outcome, what is the total likelihood of the outcome occurring? Example 1... Continue Reading →
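Taking the "road" picture literally, a small sketch of the calculation might look like this. The two roads and all of the probabilities are invented for illustration; they are not the post's worked example.

```python
def total_probability(branches):
    """Sum P(outcome | road) * P(road) over mutually exclusive, exhaustive roads."""
    total_road_p = sum(p_road for p_road, _ in branches)
    if abs(total_road_p - 1.0) > 1e-9:
        raise ValueError("road probabilities must sum to 1")
    return sum(p_road * p_outcome_given_road for p_road, p_outcome_given_road in branches)

# Hypothetical numbers: (P(road), P(arrive on time | road))
roads = [
    (0.6, 0.9),  # highway
    (0.4, 0.5),  # back road
]

p_on_time = total_probability(roads)
print(f"P(arrive on time) = {p_on_time:.2f}")
```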
Expected value refresher
The expected value is the probability-weighted average of all possible outcomes; it is not necessarily the single most likely outcome. Assign each potential result a probability. The expected value is the sum of all the potential results × their respective probabilities: ∑ (potential_result1 × probability1, …, potential_resultn × probabilityn). Consider the simplest example possible, the coin flip. You'll be paid R10 if you pick tails, but... Continue Reading →
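The sum-of-products rule can be sketched in a few lines. The example uses a fair six-sided die rather than the post's coin flip, since the coin-flip payoffs are cut off in this excerpt; the function name is hypothetical.

```python
def expected_value(outcomes):
    """Sum of result * probability over (result, probability) pairs."""
    total_p = sum(p for _, p in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(result * p for result, p in outcomes)

# A fair six-sided die: each face has probability 1/6.
die = [(face, 1 / 6) for face in range(1, 7)]
ev_die = expected_value(die)

print(f"Expected value of one roll = {ev_die:.2f}")
```

The result is 3.5, a value the die can never actually show, which is a useful reminder that the expected value is a long-run average.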
Polynomial regression
Polynomial regression is considered a special case of linear regression where higher order powers (x², x³, etc.) of an independent variable are included. It's appropriate where your data may best be fitted to some sort of curve rather than a simple straight line. The polynomial module of numpy is easily used to explore fitting the best... Continue Reading →
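As a rough sketch of what that exploration could look like, the snippet below fits a quadratic with `numpy.polynomial.polynomial.polyfit`. The data is synthetic (generated from a known curve so the fit can be checked) and is not taken from the post.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Synthetic data generated from y = 1 + 2x + 3x^2 (an assumption for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 3.0 * x**2

# Least-squares fit of a degree-2 polynomial; coefficients come back lowest order first.
coeffs = P.polyfit(x, y, deg=2)

# Evaluate the fitted curve at the original x values.
y_hat = P.polyval(x, coeffs)

print("coefficients (c0, c1, c2):", np.round(coeffs, 6))
```

With real, noisy data it is worth comparing a few degrees, since a higher degree will always fit the sample more closely but may generalise worse.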
Co-variance, Correlation & Linear Regression
Typically we have 2 sets of values and we want to find out if these 2 sets of values are related, and if so how, and by how much? Could height be indicative of weight? Could hours of practice be related to how many errors are made in a mathematical test paper? Co-variance is a... Continue Reading →
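For the practice-versus-errors question, a minimal sketch of sample covariance and Pearson correlation is shown below. The five data points are invented for illustration and the helper names are hypothetical.

```python
import math

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: hours of practice vs. errors on a test paper.
hours = [1, 2, 3, 4, 5]
errors = [10, 8, 6, 4, 2]

cov_he = covariance(hours, errors)
r_he = correlation(hours, errors)

print(f"covariance = {cov_he:.2f}, correlation = {r_he:.2f}")
```

The covariance gives the direction of the relationship in the units of the data, while the correlation rescales it to the range -1 to 1 so the strength can be compared across data sets.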
T testing – a worked example
A simple one-sample T-test. This variant on hypothesis testing is used when you have limitations, specifically: the population standard deviation (σ) is unknown and your sample size (n) is < 30. The fundamentals: the formula is a variant of what we've seen thus far, where x̄ = your sample mean, μ = a hypothesized population mean,... Continue Reading →
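The formula itself is cut off in this excerpt, so the sketch below assumes the standard one-sample statistic, t = (x̄ − μ) / (s / √n), where s is the sample standard deviation. The sample values and the hypothesized mean are made up for illustration.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    x_bar = statistics.mean(sample)   # sample mean
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
    t = (x_bar - mu) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical small sample and hypothesized population mean.
sample = [5.1, 4.9, 5.3, 5.2, 5.0]
t_stat, dof = one_sample_t(sample, mu=5.0)

print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The t statistic is then compared against a critical value from a t-table for the chosen significance level and the degrees of freedom.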
Proportion testing
Using everything we've learned so far about the central limit theorem, the z-score, and hypothesis testing, we can now also perform proportion testing! There are just a few new concepts to add into the mix. The preliminary terrors (notation & terminology): p = the proportion of items that falls into H0; q = the... Continue Reading →
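As a hedged sketch of where this leads, the snippet below computes the usual one-proportion z statistic, z = (p̂ − p) / √(p·q / n). It assumes q = 1 − p, which is the common convention but is not confirmed by the truncated excerpt; the counts are invented.

```python
import math

def proportion_z(successes, n, p_null):
    """z statistic for a one-proportion test against the H0 proportion p_null."""
    p_hat = successes / n                     # observed sample proportion
    q_null = 1.0 - p_null                     # assumption: q is the complement of p
    standard_error = math.sqrt(p_null * q_null / n)
    return (p_hat - p_null) / standard_error

# Hypothetical example: 60 "successes" in 100 trials, testing H0: p = 0.5.
z_prop = proportion_z(successes=60, n=100, p_null=0.5)

print(f"z = {z_prop:.2f}")
```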
Hypothesis testing basics
A simple example of hypothesis testing is where we know what "normal" is, and we want to evaluate whether some sample conforms to our understanding of "normal", or is so unusual that it's indicative of an actual shift in behaviour or pattern. Make your hypothesis statement: If I… (do this to an independent variable)… then (this will... Continue Reading →
Preliminary terrors of statistics
The "preliminary terrors", of course, being the notation, as Silvanus P. Thompson so aptly described them :). Pronunciation: μ sounds like "mew"; σ sounds like "sigma"; x̄ sounds like "x-bar". The population: we can think of this as the complete set of "things", whatever the "things" are that are under consideration - for example... Continue Reading →
Central limit theorem – a worked example
Formal definition: Provided the samples are large enough, the sampling distribution of the sample mean will be approximately normally distributed, regardless of the shape of the population's distribution. Basically: With enough samples this also happens: Which ultimately allows us to calculate the Z-score: And using a Z-table, this allows us to find the probability of a value being <= x Here is... Continue Reading →
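The Z-score step can be sketched with the standard library, using `statistics.NormalDist` in place of a printed Z-table. The snippet assumes the usual form for a sample mean, z = (x̄ − μ) / (σ / √n); the population parameters and sample values are hypothetical.

```python
import math
from statistics import NormalDist

def z_score_of_mean(x_bar, mu, sigma, n):
    """Z-score of a sample mean, using the standard error sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

# Hypothetical population (mu = 100, sigma = 15) and a sample of 36 with mean 105.
z = z_score_of_mean(x_bar=105.0, mu=100.0, sigma=15.0, n=36)

# Standard normal CDF: the probability of a sample mean <= x_bar.
p_at_most = NormalDist().cdf(z)

print(f"z = {z:.2f}, P(sample mean <= 105) = {p_at_most:.4f}")
```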