Detecting Crime Waves in an Extensive Text Corpus of (Online) Crime News: TheGuardian.co.uk as a Test Case
The large-scale digitization of crime-related news articles offers a wide range of possibilities for research. At the same time, it demands new methods and approaches that may challenge criminologists. In this article we demonstrate how the available, but often unstructured, data can be used and how its potential can be opened up through data mining. More specifically, we demonstrate how to detect the main crime-related topics in the extensive text corpus of TheGuardian.co.uk and how to spot potential crime waves in these news messages. Using complete, state-of-the-art algorithms was not our primary aim. Nevertheless, our results proved promising: we inventoried 21 main topics in the news articles and detected the 5 types of waves proposed by Geiß (2018), plus one additional type of wave.