Vector embedding techniques for media applications
News media is a powerful source of information that shapes social perception and behaviour. The volume of news grows daily, ranging from traditional high-quality journalism to less reliable content on social media. Media monitoring and analysis must be done in real time: classifying articles by content, attaching multiple categories of metadata, summarising news sources, performing analysis, and reporting. Recent advances in natural language processing, in particular large pre-trained language models, make it possible to build automated tools that accurately process and categorise texts in various ways (e.g., by topic and sentiment) and generate summaries from multiple sources. However, even the best existing tools need to be improved to cope with the complexity of news category hierarchies, the metadata structures used in the news industry, adaptation to specific user needs, and coverage of multiple languages. In this applied project, carried out in collaboration with Kliping, d.o.o., we are developing advanced multilingual tools for analysing news and social media content that will help automate these processes while increasing society’s ability to understand the rapid flow of available information.
