About

This project integrated a video scene change algorithm with text segmentation and summarisation techniques to develop an automatic news summarisation and extraction system.

Television broadcast news was captured as video and audio, together with the accompanying subtitles as text. News stories were identified, extracted from the video, and summarised into short paragraphs, reducing the information to a more manageable size. Individual news video clips could then be retrieved through a combination of video and text, using a reverse-indexed (inverted index) search engine. The system presented distilled information: a summarised version of the original text, with key words highlighted within the content.
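
As a rough illustration of the reverse-indexed retrieval described above, the following Python sketch maps each key word to the set of stories that contain it. It is not taken from the project's implementation: the function names, story IDs, and sample key words are hypothetical, and matching on every query term is an assumption.

```python
from collections import defaultdict


def build_inverted_index(stories):
    """Map each lower-cased key word to the set of story IDs that contain it."""
    index = defaultdict(set)
    for story_id, key_words in stories.items():
        for word in key_words:
            index[word.lower()].add(story_id)
    return index


def lookup(index, query):
    """Return the IDs of stories that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's stories, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample data: story ID -> key words extracted from that story.
stories = {
    "story-1": ["London", "Parliament", "budget"],
    "story-2": ["London", "flood"],
}
index = build_inverted_index(stories)
```

With this structure, a lookup touches only the entries for the query terms rather than scanning every story, which is the usual reason for indexing by word instead of by document.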

Application

The application was designed to be straightforward to use. It opened with the latest available news, divided into individual stories. Each story was displayed in a panel with three key components:

  • Video key frames across the top, with a link to play the video.
  • Key words listed on the left, categorised into organisations, people, locations, and dates.
  • A central panel containing a text summary of the story, its date, and a link to the full article.

At the top of the main page, drop-down boxes allowed users to select and display news from previous days. Below the date selection area, a link directed users to the search page, where they could query the news archive by text and date range. The text query searched the key words extracted from each story.
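
The search page behaviour described above, a text query over extracted key words combined with a date range, could be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original code: the `Story` record, the `search` function, the inclusive date bounds, and the rule that every query term must match are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Story:
    story_id: str
    broadcast_date: date
    key_words: frozenset  # lower-cased key words extracted from the story


def search(archive, text, start, end):
    """Return IDs of stories broadcast within [start, end] whose key words
    include every term of the text query (assumed AND semantics)."""
    terms = {term.lower() for term in text.split()}
    return [
        story.story_id
        for story in archive
        if start <= story.broadcast_date <= end and terms <= story.key_words
    ]


# Hypothetical archive entries, used only to exercise the sketch.
archive = [
    Story("story-1", date(2004, 3, 1), frozenset({"london", "budget"})),
    Story("story-2", date(2004, 3, 5), frozenset({"london", "flood"})),
    Story("story-3", date(2004, 4, 2), frozenset({"budget"})),
]
```

Note that in this sketch an empty text query matches every story in the date range, which mirrors browsing by date alone.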