Today I finally received my MSc certificate from the University of London 🎉. By way of celebration I thought I would make my final (unabridged) project publicly available. There are 3 main components:

The full project repo, which contains all the code that was used to build a knowledge graph from 2081 articles published by News24 on the topic of the Zondo Commission.

As per the agreement with Media24 (which owns the articles), a sample of 30 of the unlocked articles is supplied with the code in Parquet format so that all the code is demonstrably reproducible.

The final thesis, which includes the full literature review, aims and objectives, methodology, and results.

If you prefer a lighter summary, I’d recommend the short podcast, which summarizes the highlights very nicely! One of the most interesting outcomes of this research was how human-in-the-loop techniques can add value at each stage of the process, since one of the bigger challenges with knowledge graph construction is that there is no labelled data against which to evaluate outcomes. This short video focuses on this aspect and may also be of interest!
