statistics – Sho't left to data science

Limit theorems explained

Mar 6, 2025

—

by

Before we dive into the theorems let’s tackle a concept one often sees in statistics: the notion of independent, identically distributed (iid) random variables. Whether we’re drawing a sample from a population or conducting a series of experiments like coin flips, we can assess whether iid holds true or not as follows: Independent? Here we…

Poisson vs Exponential distributions

Aug 9, 2018

—

by

shotlefttodatascience

in postcards

These distributions are related yet different – here’s a comparison that hopefully clears up any confusion! Poisson: the number of events that occur in an interval of time; for example… the number of Metrorail trains that arrive at the platform in an hour. Exponential: the time taken between 2 events occurring; for example… the time between one…

Law of total probability – worked examples

Aug 5, 2018

—

by

shotlefttodatascience

in statistics

According to Wikipedia, the law of total probability “expresses the total probability of an outcome which can be realized via several distinct events”. We can also think of this as the marginal probability: irrespective of what road we took to get to this outcome, what is the total likelihood of the outcome occurring? Example 1…
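To make the "several roads to one outcome" idea concrete, here is a minimal sketch of the usual statement of the law, P(A) = Σ P(A | Bᵢ) · P(Bᵢ), where the Bᵢ are mutually exclusive and cover every possibility. The function name and the commuting numbers are made up for illustration and are not taken from the post's own examples.

```python
def total_probability(priors, conditionals):
    """Return P(A) = sum of P(A | B_i) * P(B_i) over a partition B_1..B_n."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 (the B_i must cover every case)")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, conditionals))


# Hypothetical numbers: two routes to work, each with its own chance of being late.
p_route = [0.7, 0.3]             # P(route 1), P(route 2)
p_late_given_route = [0.1, 0.4]  # P(late | route 1), P(late | route 2)

p_late = total_probability(p_route, p_late_given_route)
print(round(p_late, 2))
```

Whichever road was taken, the overall chance of being late is the route probabilities weighted by the conditional chances, added together.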

Expected value refresher

Aug 4, 2018

—

by

shotlefttodatascience

in statistics

The expected value of a random quantity is its long-run average outcome, which is not necessarily its most likely one. Assign each potential result a probability. The expected value is the sum of all the potential results × their respective probabilities: (potential_result1 × probability1) + … + (potential_resultn × probabilityn). Consider the simplest example possible, the coin flip. You’ll be paid R10 if you pick tails, but…
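As a rough sketch of that sum, the snippet below computes the expected value of a fair six-sided die. The die is a stand-in chosen for illustration (the coin-flip payouts are not fully given here), and `expected_value` is a hypothetical helper name.

```python
def expected_value(outcomes, probabilities):
    """Probability-weighted sum of the outcomes."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(outcomes, probabilities))


# Illustrative assumption: a fair die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

ev = expected_value(faces, probs)
print(ev)
```

The result sits midway between 3 and 4, a value the die can never actually show, which is why the expected value is better read as an average over many rolls than as a prediction for a single one.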

Polynomial regression

May 4, 2018

—

by

shotlefttodatascience

in machine learning, mathematics, python

Polynomial regression is considered a special case of linear regression where higher order powers (x², x³, etc.) of an independent variable are included. It’s appropriate where your data may best be fitted to some sort of curve rather than a simple straight line. The polynomial module of numpy is easily used to explore fitting the best…
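A minimal sketch of what such a fit can look like, assuming `numpy.polynomial.Polynomial.fit` is the entry point (the post may well use a different call). The data are made up and noise-free, so the fit should recover the coefficients used to generate them.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Made-up, noise-free data from y = 1 + 2x + 3x^2 (an assumption for illustration).
x = np.linspace(-2.0, 2.0, 21)
y = 1.0 + 2.0 * x + 3.0 * x**2

# Least-squares fit of a degree-2 polynomial.
fitted = Polynomial.fit(x, y, deg=2)

# fit() works in a scaled domain; convert() maps the coefficients back to
# ordinary powers of x, ordered lowest degree first: [c0, c1, c2].
coefs = fitted.convert().coef
print(coefs)
```

With real, noisy data the recovered coefficients would only approximate the underlying curve, and raising `deg` too far starts fitting the noise rather than the trend.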

Co-variance, Correlation & Linear Regression

May 3, 2018

—

by

shotlefttodatascience

in machine learning, mathematics, python, statistics

Typically we have 2 sets of values and we want to find out if these 2 sets of values are related, and if so how, and by how much? Could height be indicative of weight? Could hours of practice be related to how many errors are made in a mathematical test paper? Co-variance is a…
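To sketch how "related, and by how much" can be quantified, the snippet below computes the sample covariance and the Pearson correlation in plain Python. The height and weight figures are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to lie between -1 and 1."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Invented data: heights in cm and weights in kg.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 69, 77, 90]

print(covariance(heights, weights))
print(correlation(heights, weights))
```

Covariance carries the units of both variables, so its size is hard to interpret on its own; dividing by the two standard deviations gives a unit-free correlation that is easier to compare across data sets.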

T testing – a worked example

Feb 8, 2018

—

by

shotlefttodatascience

in statistics

A simple one-sample T-test. This variant on hypothesis testing is used when you have limitations, specifically: the population standard deviation (σ) is unknown and your sample size (n) is <30. The fundamentals: the formula is a variant of what we’ve seen thus far, where x̄ = your sample mean, μ = a hypothesized population mean,…
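The formula is cut off above, so the sketch below assumes the standard one-sample t statistic, t = (x̄ − μ) / (s / √n) with n − 1 degrees of freedom, where s is the sample standard deviation. The sample values and the helper name `one_sample_t` are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev() divides by n - 1
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1


# Made-up measurements, tested against a hypothesized population mean of 10.0.
sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3]
t_stat, dof = one_sample_t(sample, 10.0)
print(t_stat, dof)
```

The resulting t value would then be compared against a critical value from the t distribution for the chosen significance level and `dof` degrees of freedom.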

Proportion testing

Feb 6, 2018

—

by

shotlefttodatascience

in statistics

Using everything we’ve learned so far about the central limit theorem, the z-score, and hypothesis testing, we can now also perform proportion testing! There are just a few new concepts to add into the mix: The preliminary terrors – notation & terminology p = the proportion of items that falls into H0 q = the…

Hypothesis testing basics

Feb 5, 2018

—

by

shotlefttodatascience

in mathematics, statistics

A simple example of hypothesis testing is where we know what “normal” is, and we want to evaluate whether some sample conforms to our understanding of “normal”, or is so unusual that it’s indicative of an actual shift in behaviour or pattern. Make your hypothesis statement If I…(do this to an independent variable)….then (this will…

Preliminary terrors of statistics

Feb 2, 2018

—

by

shotlefttodatascience

in mathematics

The “preliminary terrors”, of course, being the notation, as Silvanus P. Thompson so aptly described them :). Pronunciation: μ sounds like “mew”, σ sounds like “sigma”, x̄ sounds like “x-bar”. The population: we can think of this as the complete set of “things”, whatever the “things” are that are under consideration – for example…

Tag: statistics