I always like to understand concepts well before I use them (good because it’s the right thing to do, bad because it slows me down a lot!), so it was with great excitement that I recently came across Matt Brems’ article A One-Stop Shop for Principal Component Analysis.
If you read that article, I promise you will get it :). Once I’d read it, I was inspired to create a notebook for future reference, so that I could see the theory in action, understand some use cases, and learn how to implement them. The main purpose of this notebook is to:
- look at some examples to illustrate the difference between feature selection and feature extraction, using a simple scenario with the iris dataset taken from the sklearn documentation on PCA
- and then to look at a practical application of PCA from the world of NLP (visualizing high-dimensional vectors in 3D space) using GloVe word embeddings
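To preview the selection-vs-extraction distinction before the full examples, here is a minimal sketch on the iris dataset: feature selection keeps a subset of the original columns, while feature extraction (PCA) builds new columns as combinations of all of them. The use of `SelectKBest` with an ANOVA F-test is my own choice of selector for illustration, not something prescribed by the sklearn PCA docs:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features

# Feature selection: keep 2 of the original 4 columns,
# scored by an ANOVA F-test against the class labels
selected = SelectKBest(f_classif, k=2).fit_transform(X, y)

# Feature extraction: construct 2 new columns, each a
# linear combination of all 4 original features
extracted = PCA(n_components=2).fit_transform(X)

print(selected.shape, extracted.shape)  # both (150, 2)
```

Both outputs have the same shape, but the selected columns are still interpretable measurements (e.g. petal length), whereas the extracted components are abstract directions of maximal variance.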
Remember, whether we’re talking feature selection or feature extraction, our goal is to take a dataset that has many variables and reduce it to a dataset that has fewer variables but remains strongly representative of our data.
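That "remains strongly representative" claim is something PCA lets us quantify directly: `explained_variance_ratio_` reports the fraction of the original variance each new component retains. A quick sketch on iris (two components keep well over 95% of the variance):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# Reduce 4 features to 2 components and check how much
# of the original variance those 2 components retain
pca = PCA(n_components=2).fit(X)
retained = pca.explained_variance_ratio_.sum()

print(f"variance retained: {retained:.3f}")
```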