Author: shotlefttodatascience
-
Computational literary analysis
Introduction: The inspiration for this project (which I completed in early 2022) was a call for applications to UC Berkeley that I came across on the topic of NLP for computational literary analysis, and specifically how one might develop computational models for the plot of a novel. The brief suggests that the concept of ‘plot’…
-
Improving customer satisfaction using Bayesian networks
Specification. Background: A typical IT support environment is governed by service level agreements (SLAs) that define expected levels of service in terms of a variety of metrics such as ‘minimum first response time’ or ‘maximum resolution time’ for each ticket logged. However, adherence to these basic standards does not necessarily result in customer satisfaction. An…
-
Predicting churn with PySpark
I decided to tackle the Expresso churn prediction challenge on the Zindi platform during the Big data analysis module of my degree for a couple of reasons: The full project can be viewed in my GitHub repo: The Expresso brief According to Zindi, “Expresso is an African telecommunications company that provides customers…
-
Using human-in-the-loop techniques
Many machine-learning tasks rely on the availability of a labelled dataset for training and tuning. But how do we go about evaluation when the dataset we have is not labelled? This is exactly the situation I found myself facing during my final MSc project. I chose to experiment with building a knowledge graph from news…
-
My thesis in a podcast
I knew that, once completed, people might ask me about the research I did for my thesis, or be interested to know more. At the same time, academic writing can be very dry, so I was pondering more accessible ways to present the material. I came across Hannah Fry’s Google DeepMind podcast where she interviews…
-
Studying data science through the University of London
I came across the University of London’s MSc Data Science program towards the end of 2020. At the time my Dad was fighting off Covid – and because I had also been exposed, I was quarantined with him and my Mom for three weeks while we waited to see whether we might also have contracted…
-
Dealing with impostor syndrome
I’m afraid that I do not write this article from the standpoint of having cracked the problem! But I do have some thoughts, which I’m jotting down here – notes to my future self when impostor syndrome rears its ugly head, as it surely will! Impostor syndrome is a psychological phenomenon where, despite often overwhelming…
-
Can I get there from here?
This started out as an experimental journey… I had already transitioned from music to teaching to SAP consulting. Would it be possible for the next leg of the journey to take me into the world of data science? At the outset I was not at all sure, but in the spirit of taking a sho’t left, I made a start, just…
-
The data science “antilibrary”
I first came across the notion of the “antilibrary” in Maria Popova’s beautiful post reflecting on “Why Unread Books Are More Valuable to Our Lives than Read Ones”. The term was coined by Nassim Nicholas Taleb (author of The Black Swan), who suggests that as your knowledge grows, so too should your accumulation of unread…
