Author: shotlefttodatascience
-
Pandas – get “these” based on “those”
Here is a common scenario for me: I have two related dataframes, and I want to select only the values from one dataframe based on criteria from the other dataframe. For example, below we have people (including country) and brands (related to the people) and we want to select “only brands related to people who…
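A minimal sketch of this kind of selection, assuming two small made-up dataframes linked by a hypothetical `person` column and an illustrative criterion (people in one particular country). The names and data here are assumptions for illustration, not the ones used in the post:

```python
import pandas as pd

# Illustrative data: people with a country, and brands linked to people
people = pd.DataFrame({
    "person": ["Ann", "Ben", "Cat"],
    "country": ["ZA", "UK", "ZA"],
})
brands = pd.DataFrame({
    "person": ["Ann", "Ann", "Ben", "Cat"],
    "brand": ["Nike", "Puma", "Adidas", "Nike"],
})

# Step 1: apply the criterion to the first dataframe ("those")
za_people = people.loc[people["country"] == "ZA", "person"]

# Step 2: keep only the rows of the second dataframe that relate to them ("these")
za_brands = brands[brands["person"].isin(za_people)]
print(za_brands)
```

`.isin()` keeps the filter as a boolean mask, so the result retains the shape and columns of `brands`. A merge would also work if columns from `people` are needed in the result.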
-
Is it normal to get lost in your own code?
Is it normal to get lost in your own code? This week I am busy with a “real life” project (yay!) – to help me classify SAP users according to system usage and ultimately license type. The totally weird thing I’m finding is that I figure out the right way to code something so it…
-
Break it down
Coding thought for the day: if it’s not working, break it down to its smallest component parts and you will soon find out why.
-
The miracle of Stackoverflow
Is there anything Stackoverflow can’t answer? What a brilliant resource!
-
A date with a difficult customer
Being an SAP consultant, I’m working with SAP data in Pandas quite a lot. It’s common practice in SAP to set a “forever” date of 31.12.9999, and the system handles it. Pandas, however, has a limitation here: its default nanosecond-resolution timestamps cannot go beyond the year 2262, which caused me a few headaches…
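Two possible ways around the limit, sketched under assumptions: the dates arrive as DD.MM.YYYY strings, and the sample values and the helper `to_day_period` are made up for illustration. This is not necessarily the approach taken in the post:

```python
import pandas as pd

# Illustrative SAP-style validity dates in DD.MM.YYYY format
raw = pd.Series(["01.01.2020", "31.12.9999"])

# Approach 1: treat the sentinel as "no end date" and blank it out before parsing,
# so the out-of-range value never reaches the datetime parser
is_forever = raw == "31.12.9999"
valid_to = pd.to_datetime(raw.mask(is_forever), format="%d.%m.%Y")

# Approach 2: keep the literal date by using daily Periods,
# which can represent years far beyond 2262
def to_day_period(text):
    day, month, year = (int(part) for part in text.split("."))
    return pd.Period(year=year, month=month, day=day, freq="D")

valid_to_period = raw.map(to_day_period)

print(valid_to)
print(valid_to_period)
```

The first approach loses the literal 9999 value but keeps ordinary datetime behaviour, with `NaT` meaning "open-ended". The second keeps the value at the cost of working with Period objects.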
-
A Multi-index conundrum
A Pandas series can have more than one index level (a MultiIndex). I discovered this during the week when I found the .groupby() method, which was going to be very useful to me in getting a count of values. I started with data similar to this and my objective was to get a summary of numbers…
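As a rough illustration of how a multi-level index appears after a group-by count, here is a sketch using made-up `region` and `product` columns (assumptions, not the data from the post):

```python
import pandas as pd

# Illustrative data: one row per sale
df = pd.DataFrame({
    "region": ["N", "N", "S", "S", "S"],
    "product": ["a", "b", "a", "a", "b"],
})

# Grouping by two columns and counting gives a Series with a two-level index
counts = df.groupby(["region", "product"]).size()
print(counts.index.nlevels)

# A single value is addressed with a tuple, one entry per index level
south_a = counts.loc[("S", "a")]

# reset_index() flattens the levels back into ordinary columns
flat = counts.reset_index(name="count")
print(flat)
```

Flattening with `reset_index()` is often the easiest route when the counts need to be merged or exported afterwards.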
-
If you enjoyed VLOOKUP…
I’ve always enjoyed VLOOKUP in Excel, so it was with great joy that I learned some new methods in Pandas this week. So much more flexible & powerful – what a pleasure! .map(): for example, you have a list of names in one series and a list of clothing sizes in a second series: To…
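A small sketch of a VLOOKUP-style lookup with `.map()`, assuming the sizes series is indexed by name. The names and sizes are invented for illustration:

```python
import pandas as pd

# Illustrative data: the names to look up, and a lookup series keyed by name
names = pd.Series(["Ann", "Ben", "Cat", "Ann", "Dan"])
sizes = pd.Series(["S", "L", "M"], index=["Ann", "Ben", "Cat"])

# .map() looks each name up in the index of `sizes`;
# names with no match (here "Dan") come back as missing values
looked_up = names.map(sizes)
print(looked_up)
```

Unlike VLOOKUP, a missing key gives a missing value rather than an error, which can then be filled with `.fillna()` if a default is wanted.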
-
Central limit theorem – a worked example
Remember our formal definition: the CLT states that, provided the samples are large enough, the sampling distribution of the sample mean will be approximately normal, regardless of the population distribution. In mathematical terms we therefore say that the mean of the sample means is equal to the population mean: With enough samples this also happens – the sample standard deviation…
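The statement can be checked with a quick simulation. This sketch assumes a skewed (exponential) population and arbitrary choices of sample size and number of samples. It is an illustration rather than the worked example from the post:

```python
import numpy as np

rng = np.random.default_rng(42)

# A clearly non-normal population (assumption: exponential with mean 2)
population = rng.exponential(scale=2.0, size=100_000)

n = 50               # size of each sample
num_samples = 5_000  # how many samples are drawn

# Draw the samples (with replacement) and take the mean of each one
sample_means = rng.choice(population, size=(num_samples, n)).mean(axis=1)

pop_mean = population.mean()
pop_std = population.std()
mean_of_means = sample_means.mean()
std_of_means = sample_means.std()

print(f"population mean      : {pop_mean:.3f}")
print(f"mean of sample means : {mean_of_means:.3f}")
print(f"population std / sqrt(n)  : {pop_std / np.sqrt(n):.3f}")
print(f"std of the sample means   : {std_of_means:.3f}")
```

The mean of the sample means should land close to the population mean, and the spread of the sample means close to the population standard deviation divided by the square root of n. A histogram of `sample_means` would show the familiar bell shape even though the population is skewed.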
-
Merge, join and concatenate
I love this Pandas help page! I feel like I can do anything with data suddenly… Learn to merge, join, concatenate data
