An example use case is this: you have a list of customers who have bought the various products that you sell. You want to know where the overlaps are, for example: How many customers who bought the Blue Widget also bought the Green Widget? Or what percentage of customers who bought the Blue Widget also... Continue Reading →
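To make the overlap question concrete, here is a minimal sketch in plain Python. The purchase records, customer IDs and the helper names `buyers_by_product` and `overlap` are made up for illustration; they are not taken from the full post, which may well solve this differently.

```python
# Hypothetical purchase records: (customer_id, product) pairs.
purchases = [
    ("c1", "Blue Widget"),
    ("c1", "Green Widget"),
    ("c2", "Blue Widget"),
    ("c3", "Green Widget"),
    ("c4", "Blue Widget"),
    ("c4", "Green Widget"),
    ("c4", "Red Widget"),
]


def buyers_by_product(records):
    """Map each product to the set of customers who bought it."""
    buyers = {}
    for customer, product in records:
        buyers.setdefault(product, set()).add(customer)
    return buyers


def overlap(buyers, product_a, product_b):
    """Return (count, share) of product_a buyers who also bought product_b."""
    a = buyers.get(product_a, set())
    b = buyers.get(product_b, set())
    both = a & b  # set intersection: customers who bought both products
    share = len(both) / len(a) if a else 0.0
    return len(both), share


buyers = buyers_by_product(purchases)
count, share = overlap(buyers, "Blue Widget", "Green Widget")
print(f"{count} Blue Widget buyers also bought the Green Widget ({share:.0%})")
```

Using sets means a customer who bought the same product twice is only counted once, which is usually what you want for an overlap question.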

# Calculus rules to live by

Calculus is a big topic, but by and large, there are quite specific aspects of calculus that come into machine learning and in particular deep learning algorithms. This article is not intended to explain how and why things are as they are; rather it's my own personal cheat sheet for when I need to remember... Continue Reading →

# PyTorch Lightning – Regression Example

I find there are a lot of tutorials and toy examples on convolutional neural networks - so many ways to skin an MNIST cat! - but not so many on other types of scenarios. So I've decided to put together a quick sample notebook on regression using the bike-share dataset. After learning the basics of neural... Continue Reading →

# Data structures for deep learning

I recently completed the Udacity Deep Learning Nanodegree (highly worth doing by the way), which focuses on implementing a variety of deep learning architectures using PyTorch. At the outset, it's pretty fundamental to understand the data structures you'll be encountering as inputs to and outputs from your neural network architecture. What I noticed was that... Continue Reading →

# How to – KMeans clustering

Clustering is a type of unsupervised learning. We humans would think of it as 'categorization' perhaps. For example, if I gave you a bag of red, blue and white balls and asked you to sort them (without telling you how) you would probably naturally gravitate towards sorting them by colour as this would be the... Continue Reading →

# How to – Principal Component Analysis

I always like to understand concepts well before I use them (which is good because it's the right thing to do, but bad because it slows me down a lot!), so it was with great excitement that I came across Matt Brems' article A One-Stop Shop for Principal Component Analysis recently. If you read this... Continue Reading →

# If you buy one book in 2020…

...make it Grokking Deep Learning by Andrew Trask! This gem of a book breaks deep learning down to its smallest component parts and then builds up your understanding from there. It's the equivalent of stripping your car down to nuts and bolts and then re-building it: at the end, you will know to a certainty... Continue Reading →

# Avoiding for loops in Pandas

There will be times when you are tempted to loop through rows or columns in Pandas to achieve your results - and the lesson I keep learning is Don't do it! Every time I'm tempted to write a for loop with Pandas data I find myself clock watching and cursing... 9 times out of 10 there... Continue Reading →
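As a small illustration of the idea, here is a sketch comparing a row-by-row loop with a vectorized column operation. The DataFrame and its `price`, `quantity` and `total` columns are invented for this example and are not from the full post.

```python
import pandas as pd

# Hypothetical order data, made up for illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Loop version: works, but iterrows() builds a Series per row,
# which gets slow on large frames.
totals_loop = []
for _, row in df.iterrows():
    totals_loop.append(row["price"] * row["quantity"])

# Vectorized version: one operation on whole columns, no Python loop.
df["total"] = df["price"] * df["quantity"]

print(df)
```

Both approaches give the same totals here; the difference only starts to show once the frame has many rows.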

Regex can do amazing things for data cleanup, and it is practically a must-use. It is also tricky to keep in your head if you don't use it frequently. Here are 3 great reference and test resources that can help: https://docs.python.org/3/howto/regex.html https://regexr.com/ https://regexone.com/
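For a flavour of the kind of cleanup meant here, below is a minimal sketch using Python's built-in `re` module. The helper names `clean_phone` and `normalize_whitespace` and the sample strings are hypothetical, chosen only to show the pattern.

```python
import re


def clean_phone(raw):
    """Strip everything that is not a digit from a phone-like string."""
    return re.sub(r"\D", "", raw)


def normalize_whitespace(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


print(clean_phone("(555) 123-4567"))
print(normalize_whitespace("  Blue   Widget \n"))
```

Patterns like these are easy to try out interactively in one of the testers linked above before they go into a cleanup script.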