Visualizing data overlaps


Here is a typical use case: you have a list of customers and the products each of them has bought from you. You want to know where the overlaps are, for example:

  • How many customers who bought the Blue Widget also bought the Green Widget?
  • Or what percentage of customers who bought the Blue Widget also bought the Green Widget?
  • … Or vice versa?

You have a potentially large number of products and a gazillion customers who bought them. Really this is just a bit of a data manipulation problem, followed by a nice Seaborn heatmap visualization. It took me a while to figure out how to get the manipulation part right, hence the post, as an aid to the memory of my future self ♥.
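To make the idea concrete, here is a minimal sketch of one way to do the manipulation with pandas. It is not the code from the notebook. The toy data, the column names `customer` and `product`, and the helper names `overlap_tables` and `plot_overlap` are all made up for illustration. The trick it relies on is building a customer-by-product 0/1 table and multiplying it by its own transpose, which yields the number of shared customers for every pair of products.

```python
import pandas as pd

# Hypothetical toy data: one row per (customer, product) purchase.
purchases = pd.DataFrame(
    {
        "customer": ["a", "a", "b", "b", "c", "d", "d", "d"],
        "product": [
            "Blue Widget", "Green Widget",
            "Blue Widget", "Red Widget",
            "Green Widget",
            "Blue Widget", "Green Widget", "Red Widget",
        ],
    }
)


def overlap_tables(df, customer_col="customer", product_col="product"):
    """Return (counts, pct) product-by-product overlap tables."""
    # Customer x product table of 0/1 flags; clip so that repeat
    # purchases of the same product still count as a single 1.
    bought = pd.crosstab(df[customer_col], df[product_col]).clip(upper=1)

    # counts.loc[p, q] = number of customers who bought both p and q.
    counts = bought.T.dot(bought)

    # Number of distinct customers per product (the diagonal of counts).
    totals = bought.sum(axis=0)

    # pct.loc[p, q] = percentage of p's customers who also bought q.
    pct = counts.div(totals, axis=0) * 100
    return counts, pct


def plot_overlap(pct):
    """Draw the percentage table as an annotated Seaborn heatmap."""
    # Imported here so the tables above can be built without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(pct, annot=True, fmt=".0f", cmap="Blues", vmin=0, vmax=100)
    ax.set_ylabel("Customers who bought ...")
    ax.set_xlabel("... also bought")
    plt.tight_layout()
    return ax


counts, pct = overlap_tables(purchases)
```

Reading across a row of `pct` answers "what percentage of customers who bought this product also bought that one?", and reading down the matching column gives the vice versa. The `counts` table is symmetric, while `pct` generally is not. Calling `plot_overlap(pct)` draws the heatmap. With a very large number of products the annotations get crowded, so `annot=False` may be the better choice there.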

As a matter of interest, I first thought of using UpSetPlot to do this. It looks very nice and worked well on a toy example. Unfortunately, for my use case I ran into a limitation on the number of records I have, and the combinations were too numerous, so I had to come up with an alternative.

You can head on over to the notebook here.
