I picked the Center for Policing Equity challenge on Kaggle for three reasons: I love maps, and I love the idea that data scientists can significantly improve our world, in addition to improving the bottom lines of big corporates. And this is exactly the type of messy data one would get in the real world so... Continue Reading →

# Pandas dataframe styling – cool!

I always like to visualize data and see the detail if possible so it was with great joy that I stumbled across DataFrame.style this morning. Here is an example of how it helps us to visualize some Titanic survival rates by sex and passenger class: The Pandas documentation itself is pretty comprehensive, but if you're looking... Continue Reading →

Regex can do amazing things in data cleanups; it is basically a must-use tool. But it is also tricky to keep in your head if you don't use it frequently... Here are 3 great reference and test resources that can help: https://docs.python.org/3/howto/regex.html, https://regexr.com/ and https://regexone.com/
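To make the cleanup idea concrete, here is a minimal sketch using Python's built-in `re` module. The helper names `normalise_whitespace` and `extract_digits` and the sample strings are made up for illustration; they are not taken from any of the resources above.

```python
import re


def normalise_whitespace(text: str) -> str:
    """Collapse any run of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def extract_digits(text: str) -> str:
    """Drop every non-digit character, e.g. to standardise phone numbers."""
    return re.sub(r"\D", "", text)


# Hypothetical messy inputs, purely for illustration
messy_place = "  Cape   Town \t South\nAfrica "
messy_phone = "(021) 555-0199"

print(normalise_whitespace(messy_place))
print(extract_digits(messy_phone))
```

Both helpers lean on `re.sub`, which replaces every match of a pattern; `\s+` matches one or more whitespace characters and `\D` matches any single non-digit.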

I've just discovered the awesome Brandon Rohrer and his blog while trying to find an intelligible article on Bayesian inference. What a goldmine - this guy is a born educator! Thank you for sharing your knowledge - it is well-appreciated!

# SQL CheatSheet

I've just worked through Imtiaz Ahmad's Master SQL for Data Science on Udemy and it was a thoroughly enjoyable, morale-boosting experience! He builds on each concept so you never feel left behind or perplexed at how he arrived at a solution, and as promised there are a gazillion exercises so by the time you're done you feel like... Continue Reading →

# Poisson vs Exponential distributions

Related yet different, here's how... A quick note on the "preliminary terrors" of notation: e is Euler's number (you'll find the e on your calculator or the EXP() function in Excel). The parameter is conventionally written as λ (pronounced lambda). Poisson: number of events that occur in an interval of time. Exponential: time taken between 2... Continue Reading →
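One way to see how the two distributions relate is a quick numeric check. The sketch below uses the standard textbook formulas (Poisson probability of k events, λ^k e^(−λ) / k!, and exponential probability of waiting at most t, 1 − e^(−λt)); the rate of 3 events per hour is an assumed number for illustration only.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events in one unit interval at rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def exponential_cdf(t: float, lam: float) -> float:
    """Probability that the wait until the next event is at most t."""
    return 1.0 - math.exp(-lam * t)


lam = 3.0  # assumed rate: 3 events per hour (illustrative only)

# "No events in one hour" (Poisson) is the same statement as
# "the wait for the next event exceeds one hour" (exponential).
p_zero_events = poisson_pmf(0, lam)
p_wait_over_hour = 1.0 - exponential_cdf(1.0, lam)

print(p_zero_events, p_wait_over_hour)
```

The two printed values agree because both reduce to e^(−λ): the same λ describes the count of events per interval and the waiting time between them.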

# A Scrum fan is born… cheatsheet

I've just finished listening to The Art of Doing Twice the Work in Half the Time on Audible and I feel like a real fan already - I can't wait to test-drive it in a team situation! As a stalwart of Corporate IT, I've only ever worked according to the "waterfall" methodology and I'm unpleasantly... Continue Reading →

# Law of total probability – worked examples

According to Wikipedia, the law of total probability "expresses the total probability of an outcome which can be realized via several distinct events". We can also think of this as the marginal probability: irrespective of what road we took to get to this outcome, what is the total likelihood of the outcome occurring? Example 1... Continue Reading →
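As a rough sketch of the "which road did we take" idea, the law can be written as P(A) = Σ P(A | Bᵢ) · P(Bᵢ) over mutually exclusive roads Bᵢ that cover every possibility. The three roads and all the probabilities below are invented for illustration and are not the worked examples from the post.

```python
def total_probability(priors, conditionals):
    """Sum P(A | B_i) * P(B_i) over a partition B_1..B_n of the sample space."""
    if len(priors) != len(conditionals):
        raise ValueError("need one conditional probability per prior")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 (the roads must cover every case)")
    return sum(p * c for p, c in zip(priors, conditionals))


# Hypothetical numbers: chance of taking each road ...
p_road = [0.5, 0.3, 0.2]
# ... and chance of arriving on time given that road
p_on_time_given_road = [0.9, 0.6, 0.4]

p_on_time = total_probability(p_road, p_on_time_given_road)
print(round(p_on_time, 2))
```

With these assumed inputs the sum is 0.5 × 0.9 + 0.3 × 0.6 + 0.2 × 0.4 = 0.71, the overall chance of arriving on time regardless of the road taken.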

# Expected value refresher

The expected value of a random variable is its long-run average outcome, not necessarily its most likely one. Assign each potential result a probability. The expected value is the sum of all the potential results multiplied by their respective probabilities: ∑ (potential_result_i × probability_i), for i = 1 … n. Consider the simplest example possible, the coin flip. You'll be paid R10 if you pick tails, but... Continue Reading →
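The weighted sum above can be sketched in a few lines of Python. Because the coin-flip payoffs are cut off in this excerpt, the example below swaps in a fair six-sided die instead; that choice, and the helper name `expected_value`, are assumptions made for illustration.

```python
def expected_value(outcomes, probabilities):
    """Probability-weighted sum of the outcomes."""
    if len(outcomes) != len(probabilities):
        raise ValueError("need one probability per outcome")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(outcomes, probabilities))


# Assumed example: a fair six-sided die, each face with probability 1/6
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

ev = expected_value(faces, probs)
print(round(ev, 2))
```

The result for the die is 3.5, a value that can never actually be rolled, which is a handy reminder that the expected value is an average over many repetitions rather than a prediction of any single result.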