Upgrading the Newsroom

Automated image selection system for news articles

The system retrieves relevant images for a news article by fusing the article's textual sources (caption, body, headline, lead) with a hierarchical attention mechanism. It uses subword embeddings and self-attention to better encode entities and capture important keywords in the text. The model is trained in a weakly supervised manner on a large-scale multimodal, multilingual dataset of over 500k German and French news article-image pairs.
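The two-level fusion described above can be illustrated with a minimal sketch. It is not the project's implementation: it replaces the learned self-attention with simple dot-product attention pooling against fixed query vectors, and all names (`attention_pool`, `fuse_article`, `rank_images`, `token_query`, `source_query`) as well as the toy 2-dimensional embeddings are hypothetical. The first level pools token vectors within each source, the second level pools the resulting source vectors into one article vector, and images are then ranked by cosine similarity to that vector.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weighted average of `vectors`; weights are the softmax of dot(vector, query)."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def fuse_article(sources, token_query, source_query):
    """Level 1: pool tokens inside each source. Level 2: pool across sources."""
    names = list(sources)
    source_vecs = [attention_pool(sources[n], token_query)[0] for n in names]
    article_vec, weights = attention_pool(source_vecs, source_query)
    return article_vec, dict(zip(names, weights))


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def rank_images(article_vec, image_vecs):
    """Indices of `image_vecs`, most similar to the article first."""
    return sorted(
        range(len(image_vecs)),
        key=lambda i: cosine(article_vec, image_vecs[i]),
        reverse=True,
    )


# Toy example: each source is a list of (hypothetical) token embeddings.
sources = {
    "headline": [[1.0, 0.0], [0.8, 0.2]],
    "lead": [[0.9, 0.1]],
    "body": [[0.0, 1.0], [0.1, 0.9], [0.5, 0.5]],
    "caption": [[1.0, 0.0]],
}
token_query = [1.0, 0.0]   # assumed fixed here; it would be learned in a real model
source_query = [1.0, 0.0]  # likewise an assumption for illustration

article_vec, source_weights = fuse_article(sources, token_query, source_query)
image_vecs = [[0.0, 1.0], [1.0, 0.1]]  # hypothetical image embeddings
ranking = rank_images(article_vec, image_vecs)
```

In this sketch `source_weights` shows how much each textual source contributes to the fused article representation, which is the role the source-level attention plays in a hierarchical design.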

Adversarial, Machine Learning, Natural Language
Key facts
Maturity
Support
C4DT: Inactive
Lab: Unknown

Distributed Information Systems Laboratory

Prof. Karl Aberer

Research in the Distributed Information Systems Laboratory focuses on producing reliable information from the vast amount of data that is available on the Internet – a key challenge in today’s information society.
They are developing methods and systems that turn unstructured, heterogeneous and untrusted data into meaningful, reliable and understandable information.
This is done in the context of concrete information processing tasks, such as data and knowledge integration, information retrieval, filtering and extraction, document understanding, and trust and credibility assessment.
Given that tackling these problems usually depends on the needs of the user and at the same time requires processing large amounts of data, they explore methods that integrate human knowledge with state-of-the-art machine learning.
They apply the results of their work in concrete application domains, such as Media, Humanitarian Action and Knowledge Management in enterprises.

This page was last edited on 2024-03-19.