There will be times when you are tempted to loop through rows or columns in Pandas to achieve your results, and the lesson I keep learning is: don't do it! Every time I'm tempted to write a for loop over Pandas data, I find myself clock-watching and cursing... 9 times out of 10 there... Continue Reading →
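The excerpt is cut off before it names an alternative, so the following is only a minimal sketch of one common substitute for a row loop: a vectorized column operation. The column names and values are hypothetical and chosen purely for illustration.

```python
import pandas as pd

# Hypothetical data, made up for this illustration only.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Row-by-row loop: gives the right answer, but tends to be slow on large frames.
totals_loop = []
for _, row in df.iterrows():
    totals_loop.append(row["price"] * row["quantity"])

# Vectorized equivalent: a single operation over whole columns.
df["total"] = df["price"] * df["quantity"]
```

Both approaches produce the same numbers here; the vectorized form simply hands the whole calculation to Pandas instead of stepping through rows in Python.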
Pandas dataframe styling – cool!
I always like to visualize data and see the detail if possible so it was with great joy that I stumbled across DataFrame.style this morning. Here is an example of how it helps us to visualize some Titanic survival rates by sex and passenger class: The Pandas documentation itself is pretty comprehensive, but if you're looking... Continue Reading →
Cheat sheet – Pandas, data manipulation
A few things I found I used over and over again the last few weeks - the basics... How it works - Pandas, data manipulation
Cheat sheet – Pandas, data selection
A quick cheat sheet on basic data selection functions - as an aid to memory until memory has become second nature 🙂 How it works - Pandas, data selection
OK, so I know size isn't everything, and it may even be that there are far more lines than there should be, BUT tonight I am feeling in a celebratory mood nonetheless. Six months ago I had never coded anything, and tonight I've completed my 790-line "masterpiece" (don't laugh!) which actually... Continue Reading →
Pandas – get “these” based on “those”
Here is a common scenario for me: I have 2 related dataframes, and I want to select only the values from the one dataframe based on criteria from the other dataframe. For example, below we have people (including country) and brands (related to the people) and we want to select "only brands related to people who... Continue Reading →
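The scenario above can be sketched with a boolean filter plus `Series.isin`. The excerpt is truncated before it states the actual criterion, so the country filter below, along with all names and values, is an assumption made for illustration and not necessarily the approach the full post takes.

```python
import pandas as pd

# Hypothetical people and brands, invented for this sketch.
people = pd.DataFrame(
    {"person": ["Ann", "Ben", "Cara"], "country": ["UK", "ZA", "UK"]}
)
brands = pd.DataFrame(
    {
        "person": ["Ann", "Ben", "Cara", "Ann"],
        "brand": ["Acme", "Bolt", "Crest", "Dune"],
    }
)

# Step 1: pick "those" (people matching an assumed criterion).
uk_people = people.loc[people["country"] == "UK", "person"]

# Step 2: get "these" (brands whose person is in that selection).
uk_brands = brands[brands["person"].isin(uk_people)]
```

A merge between the two frames followed by a filter would give the same rows; `isin` is shown because it leaves the columns of `brands` untouched.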
A date with a difficult customer
Being an SAP consultant, I'm working with SAP data in Pandas quite a lot. It's common practice to set a "forever" date of 31.12.9999 in SAP and the system handles it, however Pandas has a limitation in this regard where the date cannot readily go beyond the year 2262 which caused me a few headaches... Continue Reading →
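Two possible workarounds for the "forever" date are sketched below, under stated assumptions: either treat 31.12.9999 as a sentinel and parse it as a missing date, or represent it as a `pd.Period`, which covers a much wider range than nanosecond timestamps. The sample values are hypothetical, and the full post may well solve this differently.

```python
import pandas as pd

# Hypothetical SAP-style validity dates in dd.mm.yyyy format.
valid_to = pd.Series(["31.12.2024", "31.12.9999"])

# Option 1: treat the "forever" date as a sentinel and parse it as NaT.
is_forever = valid_to == "31.12.9999"
parsed = pd.to_datetime(valid_to.mask(is_forever), format="%d.%m.%Y")

# Option 2: keep the real value as a daily Period instead of a Timestamp.
forever = pd.Period("9999-12-31", freq="D")
```

Option 1 keeps the column as ordinary datetimes, with the boolean `is_forever` recording which rows were open-ended. Option 2 preserves the year 9999 itself, at the cost of working with periods rather than timestamps.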
A Multi-index conundrum
A Pandas Series can have more than one index level. I discovered this fact during the week when I found the .groupby() method, which was going to be very useful to me in getting a count of values. I started with data similar to this and my objective was to get a summary of numbers... Continue Reading →
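As a small illustration of how this arises, grouping by two columns and counting returns a Series whose index has two levels. The data below is hypothetical and is not the data from the post.

```python
import pandas as pd

# Hypothetical data, invented for this sketch.
df = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South", "South"],
        "product": ["A", "B", "A", "A", "B"],
    }
)

# Grouping by two columns yields a Series indexed by (region, product).
counts = df.groupby(["region", "product"]).size()

# Look up one combination by passing a tuple of index values.
n_south_a = counts.loc[("South", "A")]

# Flatten the two index levels back into ordinary columns.
flat = counts.reset_index(name="count")
```

Calling `reset_index` is often the simplest way out when the multi-level index gets in the way of later steps.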
If you enjoyed VLOOKUP…
I've always enjoyed VLOOKUP in Excel so it was with great joy that I learned some new methods in Pandas this week. So much more flexible & powerful - what a pleasure! .map() For example, you have a list of names in one series and a list of clothing sizes in a second series: To... Continue Reading →
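A minimal sketch of a VLOOKUP-style lookup with `.map()`, assuming the sizes Series is indexed by name. The names and sizes are made up for illustration.

```python
import pandas as pd

# Hypothetical lookup table: clothing size per person, indexed by name.
sizes = pd.Series(["S", "M", "L"], index=["Ann", "Ben", "Cara"])

# The names we want to look up; "Dan" is deliberately missing from sizes.
names = pd.Series(["Cara", "Ann", "Dan"])

# Each name is replaced by the matching value in sizes; misses become NaN.
looked_up = names.map(sizes)
```

Unlike VLOOKUP's error values, a name with no match simply comes back as a missing value, which can then be handled with `fillna` or `dropna`.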
I love this Pandas help page! I feel like I can do anything with data suddenly... Learn to merge, join, concatenate data
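For a quick taste of what that help page covers, here is a short sketch of `merge` and `concat` on two hypothetical frames. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column.
left = pd.DataFrame({"key": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [20, 30, 40]})

# merge: combine columns from both frames where the keys match.
merged = left.merge(right, on="key", how="inner")

# concat: stack frames on top of each other and renumber the rows.
stacked = pd.concat([left, left], ignore_index=True)
```

Changing `how` to `"left"`, `"right"`, or `"outer"` controls which unmatched keys are kept in the merged result.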