Author: shotlefttodatascience

  • Why Linux / command line?

    Today I’m ploughing my way through the edX Introduction to Linux course and found myself wondering “whyyyy??”. It’s all quite entertaining and nifty, but how will I be using this later? Naturally I googled (using bing!) and found this wonderful online resource: www.datascienceatthecommandline.com. After reading the section entitled “A Real-world Use Case” involving data for New York…

  • Getting results vs Understanding

    Alexander Pope is famously quoted as saying: “A little learning is a dangerous thing; drink deep, or taste not the Pierian spring: there shallow draughts intoxicate the brain, and drinking largely sobers us again.” I’ve been thinking about these words the past few days as I worked on my latest challenge: a text classifier using…

  • Thoughts on keeping on going

    I had an email today from a reader (“K”) who tried the Learn Python Challenge on Kaggle and (as does happen!) got to about Day 4 and then abandoned ship and went in search of more new stuff… The question is how to keep on going – when you get stuck, bored, de-motivated or perhaps even…

  • Learn Python Challenge on Kaggle

    I signed up for this 7-day challenge to test my knowledge, and it’s been an absolute delight! As a newbie, when I find myself on StackOverflow reading discussions about “the most Pythonic way” to do something, I usually feel a bit left out… I’ll just be happy if I can do it any darned way…

  • DSI – Beyond Excited!

    So stoked to be attending Data Science Intensive later this year – “first of its kind in Africa”: 8 weeks solving real-world problems, I can hardly wait :).

  • My magician’s wand works!

    This week I’m literally feeling like a magician! My first real classifier attempt: with a month’s worth of emails to the Service Desk, and sklearn.naive_bayes, I can tell with 96% certainty which incidents should be assigned to Team A and which to Team B. MAGIC!

  • Crontab – how to

    As a newbie, I’ve been receiving files via email, copying them to my Jupyter Notebook folder, running my script, and emailing the resulting outputs back to my customer. As a prospective data scientist I’ve been feeling positively embarrassed about this ridiculously low-tech process! Thanks to my colleagues Shaun and Christine, I’ve been set onto the path…

  • Small simple datasets for practising

    It’s all very well downloading complex datasets from Kaggle and similar sources to play with – they’re amazing for learners because the data is always less clean than you would have hoped, more complex than you anticipated, and every bit as interesting as promised. BUT if you’re learning a new concept it’s easier to have…

  • Adding labels to districts in GeoPandas

    Once you have your districts drawn up nicely, using the polygons from your shapefile, it would be useful to be able to label them – but of course you need to be able to tell GeoPandas where to place these labels via co-ordinates or points – and in your shapefile you only have polygons which…

  • GeoPandas – a detailed example

    “Dear World, please send me more geographical data to plot so I can keep on using GeoPandas… Love from Sho’t Left.” I can’t believe how much fun this library is! So my goal was to find a way to map assessment ratings by region, showing the overall result for the region, as well as the…
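
    A note on the district-labelling entry above: the excerpt stops before showing how a label point is derived from a polygon, so what follows is only a rough sketch of the general idea, not the method used in the post. It computes a polygon's centroid in plain Python with the shoelace formula. The function name `polygon_centroid` and the district names are made up for illustration. GeoPandas itself exposes a `centroid` attribute and a `representative_point()` method on a GeoSeries, which would normally be used instead of hand-rolled arithmetic.

```python
def polygon_centroid(vertices):
    """Return the (x, y) centroid of a simple polygon.

    `vertices` is a list of (x, y) tuples in order around the boundary;
    the first point does not need to be repeated at the end.
    """
    area2 = 0.0  # twice the signed area of the polygon
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # The centroid formula divides by 6 * area, which equals 3 * area2.
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical districts, each described by one simple polygon.
districts = {
    "District A": [(0, 0), (4, 0), (4, 2), (0, 2)],
    "District B": [(4, 0), (6, 0), (6, 2), (4, 2)],
}

# One label position per district, ready to pass to a plotting call.
label_points = {name: polygon_centroid(ring) for name, ring in districts.items()}
```

    One caveat with this approach: for a concave or C-shaped district the centroid can fall outside the polygon, which is why a point guaranteed to lie inside the shape is often a safer choice for label placement.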