Here is an example use case: you have a list of customers who have bought the various products that you sell, and you want to know where the overlaps are:
- How many customers who bought the Blue Widget also bought the Green Widget?
- Or what percentage of customers who bought the Blue Widget also bought the Green Widget?
- … Or vice versa?
You have a potentially large number of products and a gazillion customers who bought them. Really this is just a data manipulation problem (it took me a while to figure out how to get that part right, hence this post, as an aid to the memory of my future self ♥), followed by a nice Seaborn heatmap visualization.
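To make the data manipulation step concrete, here is a minimal sketch of one way to do it with pandas. It is not necessarily what the notebook does: the toy `purchases` table, its column names (`customer`, `product`), and the variable names are all made up for illustration. The idea is to build a customer-by-product indicator matrix and multiply it by its own transpose to get the overlap counts.

```python
import pandas as pd

# Hypothetical toy data: one row per (customer, product) purchase.
purchases = pd.DataFrame({
    "customer": ["a", "a", "b", "b", "c", "d", "d", "d"],
    "product": ["Blue", "Green", "Blue", "Red", "Blue", "Green", "Red", "Blue"],
})

# Customer x product indicator matrix: 1 if the customer bought the product.
# clip(upper=1) guards against repeat purchases inflating the counts.
bought = pd.crosstab(purchases["customer"], purchases["product"]).clip(upper=1)

# Product x product overlap counts: entry (i, j) is the number of customers
# who bought both product i and product j. The diagonal holds the number of
# buyers of each product.
counts = bought.T @ bought

# Row-wise percentages: of the customers who bought the row product,
# what percentage also bought the column product?
pct = counts.div(bought.sum(), axis=0) * 100
```

Note that `pct` is not symmetric: the row for Blue answers "of Blue buyers, how many also bought Green?", while the row for Green answers the vice versa question. Either table can then be passed to `seaborn.heatmap`, for example `sns.heatmap(pct, annot=True)`.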
As a matter of interest, I first thought of using UpSetPlot for this. It looks very nice and worked well on a toy example, but for my use case I ran into a limitation on the number of records, and the combinations were too numerous, so I had to come up with an alternative.
You can head on over to the notebook here.