There will be times when you are tempted to loop through rows or columns in Pandas to achieve your results, and the lesson I keep learning is: don't do it! Every time I'm tempted to write a for loop with Pandas data I find myself clock-watching and cursing... 9 times out of 10 there... Continue Reading →
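As a rough illustration of the point in the excerpt above, here is a minimal sketch comparing a row-by-row loop with a whole-column operation. The DataFrame, its column names (`price`, `quantity`, `total`) and its values are made up for this example and do not come from the original post.

```python
import pandas as pd

# Hypothetical toy data, invented for this sketch
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# The tempting way: visit each row in a Python-level loop
totals_loop = []
for _, row in df.iterrows():
    totals_loop.append(row["price"] * row["quantity"])

# The vectorized way: one operation over whole columns
df["total"] = df["price"] * df["quantity"]

print(df)
```

Both approaches give the same numbers here; on a real dataset with many rows the column-wise version is typically much faster, because the work happens inside Pandas rather than in the Python loop.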
I picked the Center for Policing Equity challenge on Kaggle for three reasons: I love maps and I love the idea that data scientists can significantly improve our world, in addition to improving the bottom lines of big corporates. And this is exactly the type of messy data one would get in the real world so... Continue Reading →
I always like to visualize data and see the detail if possible so it was with great joy that I stumbled across DataFrame.style this morning. Here is an example of how it helps us to visualize some Titanic survival rates by sex and passenger class: The Pandas documentation itself is pretty comprehensive, but if you're looking... Continue Reading →
Alexander Pope is famously quoted as saying: A little learning is a dangerous thing; drink deep, or taste not the Pierian spring: there shallow draughts intoxicate the brain, and drinking largely sobers us again. I've been thinking about these words the past few days as I worked on my latest challenge: a text classifier using... Continue Reading →
I signed up for this 7-day challenge to test my knowledge, and it's been an absolute delight! As a newbie, when I find myself on StackOverflow reading discussions about "the most Pythonic way" to do something, I usually feel a bit left out... I'll just be happy if I can do it any darned way... Continue Reading →
This week I'm literally feeling like a magician! My first real classifier attempt: with a month's worth of emails to the Service Desk, and sklearn.naive_bayes, I can tell with 96% certainty which incidents should be assigned to Team A and which to Team B. MAGIC!
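For anyone curious what such a classifier might look like, here is a minimal sketch. The excerpt only names `sklearn.naive_bayes`, so the choice of `MultinomialNB` with a `CountVectorizer` is an assumption, and the six example emails are invented stand-ins for the real Service Desk data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical emails and team labels, invented for this sketch
texts = [
    "password reset needed",
    "cannot login password expired",
    "reset my account password",
    "printer out of paper",
    "printer jam on floor two",
    "replace printer toner",
]
labels = ["Team A", "Team A", "Team A", "Team B", "Team B", "Team B"]

# Turn each email into word counts, then fit a Naive Bayes model on them
# (MultinomialNB is an assumed choice; the post does not say which class was used)
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Classify two unseen emails
predictions = model.predict(["forgot password login", "printer toner empty"])
print(list(predictions))
```

With real data you would hold back some of the emails for testing (for example with `sklearn.model_selection.train_test_split`) before trusting any accuracy figure.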
As a newbie, I've been receiving files via email, copying them to my Jupyter Notebook folder, running my script, and emailing the resulting outputs back to my customer. As a prospective data scientist I've been feeling positively embarrassed about this ridiculously low-tech process! Thanks to my colleagues Shaun and Christine, I've been set onto the path... Continue Reading →
It's all very well downloading complex datasets from Kaggle and similar sources to play with - they're amazing for learners because the data is always less clean than you would have hoped, more complex than you anticipated, and every bit as interesting as promised. BUT if you're learning a new concept it's easier to have... Continue Reading →