Topic segmentation

Context

A business news radio station that produces approximately 10 hours of original broadcast content per day.

Problem

The station wanted to offer its audience a curated playlist of news content so that each listener could select preferred topics and receive a tailored supply of the latest news segments.

None of the audio content was labeled, and no one kept track of when each topic was being discussed. This process needed to be automated.

Solution

We trained an AI model to label the news segments. This was a complex task because it required the model to process both audio and natural language.

The first step was audio feature detection, followed by the detection of semantic overlap between adjacent parts of the broadcast. This process was then refined in a semi-supervised way. Finally, we applied topic modeling, which allowed us to classify the resulting segments.
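To make the semantic overlap step more concrete, here is a minimal sketch and not the production pipeline. It assumes the audio has already been transcribed into sentences, and it uses a simple bag-of-words cosine similarity between neighboring windows of sentences as a stand-in for whatever overlap measure was actually used. The function names, the window size, and the threshold are all illustrative assumptions. Audio feature detection, the semi-supervised refinement, and topic modeling are not shown.

```python
import math
import re
from collections import Counter


def bag_of_words(sentences):
    """Count lowercase word tokens across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z']+", sentence.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_boundaries(sentences, window=2, threshold=0.1):
    """Return each index i where a new segment is assumed to start.

    A boundary is placed at i when the `window` sentences before i and
    the `window` sentences from i onward share almost no vocabulary.
    Both `window` and `threshold` are illustrative values (assumptions).
    """
    boundaries = []
    for i in range(window, len(sentences) - window + 1):
        left = bag_of_words(sentences[i - window:i])
        right = bag_of_words(sentences[i:i + window])
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries


# Hypothetical transcript: two sentences about oil, then two about rates.
transcript = [
    "Oil prices rose sharply on Monday.",
    "Analysts expect oil prices to keep rising.",
    "The central bank left interest rates unchanged.",
    "Interest rates may fall later this year, the bank said.",
]
boundaries = find_boundaries(transcript)
```

In this toy transcript the two halves share no words, so the only candidate position (index 2) falls below the threshold and is reported as a boundary. Real broadcast transcripts are noisier, which is presumably why the overlap signal was combined with audio features and refined afterwards.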

Results

  • Detect topics in segments with 90% accuracy
  • Cut audio segments with 80% accuracy
  • Provide relevant user content in 70% of all cases