Polynomial regression is considered a special case of linear regression where higher-order powers (x², x³, etc.) of an independent variable are included. It's appropriate where your data may best be fitted to some sort of curve rather than a simple straight line. The polynomial module of numpy is easily used to explore fitting the best... Continue Reading →
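As a rough illustration of the idea, here is a minimal sketch that fits a second-degree polynomial with `numpy.polynomial.polynomial.polyfit`. The data points, the fixed offsets standing in for noise, and the helper name `fit_polynomial` are all made up for this example; they are not taken from the original post.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical data: points on the curve y = 1 + 2x + 3x^2, nudged by small fixed offsets.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
offsets = np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1])
y = 1 + 2 * x + 3 * x**2 + offsets


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; coefficients are ordered from x^0 upwards."""
    return P.polyfit(x, y, degree)


coefs = fit_polynomial(x, y, 2)
fitted = P.polyval(x, coefs)  # evaluate the fitted curve at the original x values
residuals = y - fitted

print("coefficients (c0, c1, c2):", np.round(coefs, 3))
print("largest absolute residual:", np.abs(residuals).max())
```

Note that this module returns coefficients lowest power first, which is the opposite order to the older `np.polyfit`. Trying a few values of `degree` and comparing the residuals is one simple way to explore which curve suits the data.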

# Co-variance, Correlation & Linear Regression

Typically we have two sets of values and we want to find out whether they are related and, if so, how and by how much. Could height be indicative of weight? Could hours of practice be related to how many errors are made in a mathematical test paper? Co-variance is a... Continue Reading →
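To make the height and weight question concrete, here is a small pure-Python sketch of sample co-variance, Pearson correlation, and a least-squares line. The height and weight figures and all function names are hypothetical, chosen only for illustration, and the sample (n − 1) form of co-variance is an assumption about which variant is wanted.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_covariance(xs, ys):
    """Average product of deviations from the means, using the n - 1 divisor."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson r: co-variance scaled by both standard deviations, so it lies in [-1, 1]."""
    var_x = sample_covariance(xs, xs)
    var_y = sample_covariance(ys, ys)
    return sample_covariance(xs, ys) / math.sqrt(var_x * var_y)


def linear_regression(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    slope = sample_covariance(xs, ys) / sample_covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


# Hypothetical measurements, for illustration only.
heights_cm = [150, 160, 165, 172, 180, 188]
weights_kg = [52, 58, 63, 70, 77, 85]

r = correlation(heights_cm, weights_kg)
slope, intercept = linear_regression(heights_cm, weights_kg)
print(f"covariance = {sample_covariance(heights_cm, weights_kg):.2f}")
print(f"correlation r = {r:.3f}")
print(f"weight ≈ {slope:.3f} * height + {intercept:.2f}")
```

Co-variance tells you the direction of the relationship, correlation tells you its strength on a fixed scale, and the regression line tells you how much one value changes per unit of the other.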

# T testing – a worked example

A simple one-sample T-test. This variant on hypothesis testing is used when you have limitations, specifically: the population standard deviation (σ) is unknown and your sample size (n) is <30. The fundamentals: the formula is a variant of what we've seen thus far, where x̄ = your sample mean, μ = a hypothesized population mean,... Continue Reading →
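The excerpt cuts off before the formula itself, so the sketch below assumes the standard one-sample statistic, t = (x̄ − μ) / (s / √n), with s the sample standard deviation and n − 1 degrees of freedom. The scores and the function name `one_sample_t` are hypothetical.

```python
import math
import statistics


def one_sample_t(sample, mu):
    """Return the t statistic and degrees of freedom for a one-sample t-test."""
    n = len(sample)
    x_bar = statistics.mean(sample)         # sample mean
    s = statistics.stdev(sample)            # sample standard deviation (n - 1 divisor)
    standard_error = s / math.sqrt(n)
    t = (x_bar - mu) / standard_error
    return t, n - 1


# Hypothetical small sample (n < 30) and hypothesized population mean.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat, dof = one_sample_t(scores, mu=4)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")  # t = 1.323 with 7 degrees of freedom
```

You would then compare the statistic against a critical value from a t-table for the same degrees of freedom. For 7 degrees of freedom and a two-tailed test at α = 0.05 that value is about 2.365, so this made-up sample would not lead you to reject the hypothesized mean.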

# Proportion testing

Using everything we've learned so far about the central limit theorem, the z-score, and hypothesis testing, we can now also perform proportion testing! There are just a few new concepts to add into the mix. The preliminary terrors (notation & terminology): p = the proportion of items that fall into H0; q = the... Continue Reading →
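Here is a minimal sketch of a one-sample proportion z-test. It assumes q = 1 − p (the excerpt is cut off before q is defined), uses the usual statistic z = (p̂ − p) / √(pq / n), and reports a two-tailed p-value. The counts and the function name `proportion_z_test` are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-tailed z-test of an observed proportion against the H0 proportion p0."""
    p_hat = successes / n                    # observed sample proportion
    q0 = 1 - p0                              # assumed: q is the complement of p
    standard_error = math.sqrt(p0 * q0 / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical example: 60 "successes" in 100 trials, tested against p0 = 0.5.
z, p_value = proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, two-tailed p-value = {p_value:.4f}")
```

The normal approximation behind this test is usually only trusted when both n·p and n·q are reasonably large; a common rule of thumb is at least 5 or 10 of each.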

# Hypothesis testing basics

A simple example of hypothesis testing is where we know what "normal" is, and we want to evaluate whether some sample conforms to our understanding of "normal", or is so unusual that it's indicative of an actual shift in behaviour or pattern. Make your hypothesis statement: If I… (do this to an independent variable)… then (this will... Continue Reading →
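One way to picture the "is this sample unusual?" decision is a z-test where the population mean and standard deviation are taken as known. This is only a sketch under that assumption; the numbers, the 5% significance level, and the function name `z_test` are hypothetical and not from the original post.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, n, pop_mean, pop_sd, alpha=0.05):
    """Two-tailed z-test; returns (z, critical value, whether to reject H0)."""
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return z, critical, abs(z) > critical


# Hypothetical: "normal" is mean 100 with sd 15; a sample of 36 averages 106.
z, critical, reject = z_test(sample_mean=106, n=36, pop_mean=100, pop_sd=15)
print(f"z = {z:.2f}, critical value = ±{critical:.2f}")
print("reject H0" if reject else "fail to reject H0")
```

If |z| lands beyond the critical value, the sample is unusual enough that we reject the idea that nothing has changed; otherwise we fail to reject it, which is not the same as proving it true.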

# Preliminary terrors of statistics

The "preliminary terrors", of course, being the notation, as Silvanus P. Thompson so aptly described them :). Pronunciation: μ sounds like "mew", σ sounds like "sigma", x̄ sounds like "x-bar". The population: we can think of this as the complete set of "things", whatever the "things" are that are under consideration, for example... Continue Reading →

# Central limit theorem – a worked example

Formal definition: Provided the samples taken are large enough, the sampling distribution of the sample mean will be approximately normally distributed, regardless of the population's own distribution. Basically: With enough samples this also happens: Which ultimately allows us to calculate the Z-score: And using a Z-table, this allows us to find the probability of a value being <= x. Here is... Continue Reading →
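The formulas did not survive in this excerpt, so the sketch below assumes the standard results: the mean of the sample means approaches μ, their standard deviation approaches σ / √n, and the Z-score for a sample mean is z = (x̄ − μ) / (σ / √n). It simulates sample means drawn from a deliberately non-normal (uniform) population; the population, the sample sizes, and the function names are all hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(42)  # fixed seed so the simulation is repeatable

# Hypothetical, clearly non-normal population: uniform on [0, 1).
POP_MEAN = 0.5
POP_SD = math.sqrt(1 / 12)


def sample_means(sample_size, n_samples):
    """Draw n_samples samples of the given size and return each sample's mean."""
    return [
        statistics.fmean([random.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


def prob_sample_mean_at_most(x, pop_mean, pop_sd, n):
    """Z-score of a sample mean x, and P(sample mean <= x) under the normal approximation."""
    z = (x - pop_mean) / (pop_sd / math.sqrt(n))
    return z, NormalDist().cdf(z)  # NormalDist().cdf plays the role of the Z-table


means = sample_means(sample_size=40, n_samples=5000)
standard_error = POP_SD / math.sqrt(40)

print(f"mean of sample means: {statistics.fmean(means):.4f} (population mean {POP_MEAN})")
print(f"sd of sample means:   {statistics.stdev(means):.4f} (sigma / sqrt(n) = {standard_error:.4f})")

z, p = prob_sample_mean_at_most(0.55, POP_MEAN, POP_SD, 40)
print(f"z = {z:.2f}, P(sample mean <= 0.55) ≈ {p:.4f}")
```

Even though individual values from this population are spread flat across the range, the sample means cluster tightly around μ, which is what lets the Z-score and Z-table be used on them.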