The project proposes using structured negotiations as a dynamic benchmark for evaluating language model (LM) agents. The negotiation framework specifies a game setting, a set of issues to negotiate, and optional preference weights; game complexity can be scaled by increasing the number of issues, mixing issue types, and assigning non-uniform preferences. The benchmark jointly evaluates performance metrics (utility and completion rate) and alignment metrics (faithfulness and instruction-following) in both self-play and cross-play settings.
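As a minimal sketch of how these components might fit together, the Python below models a game as a setting, a list of issues, and optional preference weights, with a weighted utility over an agreement. All names (`Issue`, `NegotiationGame`, `utility`, the example scenario and payoffs) are illustrative assumptions, not the project's actual API.

```python
from dataclasses import dataclass

# Hypothetical sketch of the framework's components: a game setting,
# issues to negotiate, and optional preference weights. Names and
# payoffs are illustrative, not taken from the project itself.

@dataclass
class Issue:
    name: str
    # Payoff of each option from one party's perspective.
    options: dict[str, float]

@dataclass
class NegotiationGame:
    setting: str                        # scenario description shown to the agents
    issues: list[Issue]
    weights: list[float] | None = None  # optional non-uniform preference weights

    def utility(self, agreement: dict[str, str]) -> float:
        """Weighted utility of an agreement mapping each issue to a chosen option."""
        w = self.weights or [1.0] * len(self.issues)
        return sum(
            wi * issue.options[agreement[issue.name]]
            for wi, issue in zip(w, self.issues)
        ) / sum(w)

# Complexity scales by adding issues, mixing issue types, and
# assigning non-uniform weights, as described above.
game = NegotiationGame(
    setting="Two companies negotiate the terms of a joint venture.",
    issues=[
        Issue("profit_split", {"60/40": 1.0, "50/50": 0.5, "40/60": 0.0}),
        Issue("launch_date", {"Q1": 0.0, "Q2": 0.5, "Q3": 1.0}),
    ],
    weights=[0.7, 0.3],
)
print(game.utility({"profit_split": "50/50", "launch_date": "Q2"}))  # 0.5
```

Under this framing, self-play would instantiate both negotiating agents from the same model and cross-play from different models, with utility and completion rate computed from the resulting agreements.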