Emerging from Data Science Intensive


In September & October I was fortunate enough to attend the Data Science Intensive Program (DSI) in Cape Town. In a word: WOW!

The program brought together 16 students from 7 African countries for 8 very intensive weeks with an ambitious goal:

To ensure that anyone who completes the DSI is able to contribute significant value to any data science team in the world.

That sounds almost crazy, right? I mean how can you do that in 8 weeks? And yet… done it was! Here are some of the key reasons that made it possible – which I also consider a great “lessons learned” list for working in the real world:

The power of teams

Every two weeks we worked on delivering a specific project in teams of four. Being from diverse backgrounds (mathematicians, statisticians, astronomers, programmers, business people…) each team member came with their own strengths and weaknesses.

I learned at least as much from my peers as I did from our lecturers! And because each team experience was different, I also got a great sense of what works AND what doesn’t work.

My best team experience was where everyone bought into Scrum and shared information regularly with a real teaching mindset (“I just found an awesome way to code this, would you like to see?” or “I’m totally stuck, I don’t understand what I’m doing!”). By the last day the whole really was greater than the sum of the parts: we were in the zone and it showed in our results!

By contrast, I also saw what can happen when team members function in isolation: again, the results showed, and I personally learned a valuable lesson about the need to speak up early in the face of discomfort instead of adopting a “wait-and-see” approach. It also showed me that you may be clever, but if you work alone you don’t necessarily do as well as a highly motivated group.

Quality information

The DSI was not about sitting in lectures all day; it was about action. However, at the start of each 2-week sprint we were given lectures on key topics to get us going – and these were high-quality lectures delivered by people at the top of their fields, with the essence of each topic distilled clearly. They were absolutely invaluable.

And here’s a silly example of how I know it worked: before DSI I came across Chris Albon’s Machine Learning Flash Cards and thought “Wow, these are awesome, but I don’t understand most of the topics on them to begin with – maybe one day…”. After DSI I went back and had a look and thought “Wow, these are awesome, and a great reminder of everything I’ve learned, I’ll buy them!”

The key lesson I’ve learned from this is to begin any project by getting a good understanding of the territory before just diving in. There is a lot of publicly available information, but finding one or two really excellent in-depth articles and spending time on them can be a great starting point.

A light, supportive touch

We had full-time access to wonderful tutors throughout the course so we felt very supported. But at the same time we were left to work independently and encouraged to find our own solutions before asking for help.

I was left with a very real sense of being able to cope with feeling out of my depth, and being able to push through technical, intellectual and psychological barriers. After DSI I can truly say “I’m not afraid”!

What I also loved was that the support we were given took into account our technical AND emotional needs: mental health, support systems, and good coping skills are indispensable when working continually on the outer edges of your comfort zone. It needs to be a joyous experience, not a draining one!

The layered approach to learning

Deep learning was, of course, a big focus – and not the easiest topic to understand! Our first project in this space (detecting pneumonia in chest X-rays) felt beyond challenging! I’d lie awake at night, tensors transforming continuously behind my exhausted sleepless eyeballs, the math of the YOLO algorithm running on instant replay but refusing to turn into code!

On the second pass (toxic language detection) it miraculously all seemed to make so much more sense!

I’m now busy with the 3rd pass (PyTorch Scholarship Challenge) and the transformations and math are really starting to seem both very clever and rather intuitive.

I think the main lesson I’m going to take from this is: don’t get stuck, move forward and then circle back if necessary. It’s actually amazing how, even when you’re not actively trying to learn, the subconscious is working away on processing information and it really DOES seem easier the second or third time around.

Don’t be afraid to look stupid!

Before I went to DSI only a handful of people had ever seen my code and I was completely used to (and comfortable with!) working in isolation. I remember my horror when, one afternoon, our visiting lecturer from Netflix (yes, actually!) came to sit next to me and watch me trying to do a natural language processing tutorial. I started sweating despite the cold day, and I suddenly seemed to forget basic Pandas syntax that I was really quite familiar with! Yet he persevered, pointing out where I could put stuff in functions and optimize – and it turned out to be a really nice experience once I calmed down a bit!

We were constantly exhorted to ASK the stupid question. One of our visiting lecturers from the University of Essex had a way of asking intermittently “Any doubt?” and I find this little phrase running constantly through my head now… you learn something and it’s basically fine, but there’s often a small voice in your head going “Yes, but…” That’s the time to ask the stupid question: find a colleague, head to Stack Overflow, try something new to test a theory…

Be systematic

Machine learning is essentially conducting many experiments to establish exactly what will work the best. Start early and track what has been tried, and with what results. It will save an enormous amount of time later on – otherwise you WILL end up re-testing things you already tried because you can’t remember what happened! Version control with Git and GitHub is a brilliant way to do this, even for personal projects.
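To make the idea of tracking concrete, here is a minimal sketch of an experiment log in Python. It is only an illustration under stated assumptions, not the approach used on DSI: the function names (log_experiment, already_tried, best_experiment) and the choice of a JSON Lines file are hypothetical. Each experiment is appended as one line to a plain text file, which can then be committed to version control alongside the code.

```python
import json
import time
from pathlib import Path

# Hypothetical helpers for illustration only; the names and the
# JSON Lines file format are assumptions, not something from DSI.


def log_experiment(log_path, name, params, metrics):
    """Append one experiment record to a JSON Lines file and return it."""
    record = {
        "name": name,
        "params": params,
        "metrics": metrics,
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
    }
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def load_experiments(log_path):
    """Read every record back; a missing file means nothing was tried yet."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def already_tried(log_path, params):
    """True if an experiment with exactly these parameters is already logged."""
    return any(rec["params"] == params for rec in load_experiments(log_path))


def best_experiment(log_path, metric):
    """Return the record with the highest value of `metric`, or None."""
    scored = [r for r in load_experiments(log_path) if metric in r["metrics"]]
    return max(scored, key=lambda r: r["metrics"][metric], default=None)
```

Calling already_tried before launching a run is the part that guards against re-testing something you have forgotten about, and best_experiment gives a quick answer to “what has worked best so far?”.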

Minimum viable product

I love this tweet:

“fun fact: MVP actually stands for ‘broken crap you built in a day’”

But really, MVP has its place and I’ve grown to appreciate the concept – especially as a perfectionist!

On DSI we were asked to deliver, and deliver fast. In addition to the 2-week sprints, we had to make our first Kaggle submissions by Day 3 (having spent most of Day 1 and Day 2 in lecture mode) – and we did, miraculously! And once done with the challenge we were typically assigned the “DSI twist”, which could be anything from being given 24 hours to pitch a new product to being given 48 hours to complete an entirely different challenge and demo it. Under pressure, we worked late and stressed out, but the focus was intense and we got it done! The much-discussed Pareto principle held up in practice: 80% of the results come from 20% of the effort.

People still make the world go round!

I’m not a particularly gregarious person. The idea of spending time with a large group of strange people is actually quite scary to me! But to my colleagues on DSI, I can only say: you reminded me how important and wonderful human relationships are ♡. When we struggled it was good to struggle together, when we had bad days we knew we were still learning together – even if a little uncomfortably, and when we triumphed it was exhilarating to congratulate each other on a job well done!


Thank you for the privilege of being a part of it all!


