This project proposes a framework for inserting entities into Wikipedia articles across multiple languages. It processes Wikipedia dumps to extract training data and trains models for entity insertion. The key components, each illustrated with a sketch below, are:

1. A data processing pipeline that extracts the relevant data from Wikipedia dumps.
2. Modeling code for training entity insertion models with either a ranking loss or a pointwise loss.
3. Benchmarking code that evaluates the trained models against baselines such as BM25, EntQA, and GPT language models.
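As a rough illustration of what the data processing step might involve (not the project's actual pipeline), the sketch below pulls (page, linked entity, anchor text) triples out of a Wikipedia XML dump using the `mwxml` and `mwparserfromhell` libraries. Each existing wikilink is a natural training signal for entity insertion: the linked entity together with the context it was inserted into. The dump filename is a placeholder.

```python
# Hypothetical sketch of dump processing; not the project's actual pipeline.
import mwxml
import mwparserfromhell

def extract_links(dump_path):
    """Yield (source page title, linked entity, anchor text) triples."""
    dump = mwxml.Dump.from_file(open(dump_path, "rb"))
    for page in dump:
        for revision in page:  # a *-pages-articles dump has one revision per page
            wikicode = mwparserfromhell.parse(revision.text or "")
            for link in wikicode.filter_wikilinks():
                # The anchor text defaults to the link target when no
                # display text is given, i.e. [[Target]] vs [[Target|text]].
                yield page.title, str(link.title), str(link.text or link.title)

# Placeholder dump path; print the first extracted triple.
for src, entity, anchor in extract_links("enwiki-latest-pages-articles.xml"):
    print(src, "->", entity, f"(anchor: {anchor})")
    break
```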
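The two training objectives can be sketched in PyTorch as follows. This is a minimal illustration, assuming the model emits a scalar score per (entity, candidate position) pair; the function names are illustrative, not taken from the codebase.

```python
import torch
import torch.nn.functional as F

def pointwise_loss(scores, labels):
    """Pointwise objective: score each candidate position independently
    as a binary classification (1 = entity belongs here, 0 = it does not)."""
    return F.binary_cross_entropy_with_logits(scores, labels.float())

def ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Ranking objective: push the score of the correct insertion position
    above each negative candidate by at least `margin`."""
    target = torch.ones_like(pos_scores)  # positives should rank higher
    return F.margin_ranking_loss(pos_scores, neg_scores, target, margin=margin)

# Example: one gold position scored against three negative candidates.
pos = torch.tensor([2.3])
neg = torch.tensor([1.9, 0.4, 2.1])
loss = ranking_loss(pos.expand_as(neg), neg)
```

The ranking formulation only compares candidates against each other, which matches the task (pick the best insertion position within one article), while the pointwise formulation requires calibrated absolute scores.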
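The BM25 baseline can be approximated with the `rank_bm25` package: rank an article's candidate text spans by lexical overlap with the entity's name or description, and predict the highest-scoring span as the insertion position. The spans and entity text below are placeholders.

```python
# Hedged sketch of a BM25 baseline; candidates and entity text are placeholders.
from rank_bm25 import BM25Okapi

candidates = [
    "The city lies on the banks of the Rhine .",
    "Its economy is dominated by the chemical industry .",
    "The football club was founded in 1899 .",
]
entity_text = "Rhine river in Western Europe"

tokenized = [c.lower().split() for c in candidates]
bm25 = BM25Okapi(tokenized)
scores = bm25.get_scores(entity_text.lower().split())

# The highest-scoring span is the predicted insertion position.
best = max(range(len(candidates)), key=lambda i: scores[i])
print(candidates[best], scores[best])
```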