SynthIE exploits an asymmetry between the two directions of information extraction: for a large language model, generating fluent text from a given set of (subject, relation, object) triples is much easier than extracting the triples from text. It therefore uses an LLM to verbalize Wikidata triples, producing a large synthetic corpus of (text, triple-set) pairs for training information-extraction models. The resulting dataset bootstraps high-quality IE models without expensive manual annotation, and models trained on it outperform models trained on existing human-labeled corpora.
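A minimal sketch of this data-generation loop, assuming a simple linearization format and a stand-in for the LLM call (both the `[s]/[r]/[o]` markers and the `generate_text` stub are illustrative assumptions, not SynthIE's exact specification):

```python
def linearize(triples):
    """Render (subject, relation, object) triples as a flat target string
    the extraction model is trained to produce."""
    return " ".join(f"[s] {s} [r] {r} [o] {o}" for s, r, o in triples)

def generate_text(triples):
    """Stand-in for the LLM call that verbalizes the triples as fluent text.
    In the real pipeline this would prompt a large language model."""
    return " ".join(f"{s} {r} {o}." for s, r, o in triples)

def make_training_pair(triples):
    # The synthetic pair trains the model in the *reverse* direction:
    # input is the generated text, target is the triple set to recover.
    return {"input": generate_text(triples), "target": linearize(triples)}

pair = make_training_pair([("Paris", "capital of", "France")])
```

Running the easy direction (triples to text) at scale yields supervision for the hard direction (text to triples) without human annotators.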
This page was last edited on 2024-04-16.