Researchers at the Technical University of Darmstadt’s Ubiquitous Knowledge Processing Lab have developed a technique for improving the performance of smaller language models (SLMs) on extractive question answering. Led by Rachneet Sachdeva, Martin Tutek, and Iryna Gurevych, the team used large language models (LLMs) for data augmentation, generating counterfactual (CF) instances, i.e. minimally edited inputs, which notably improved out-of-domain (OOD) performance when added to the training data. Their study, “CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration,” indicates that diversity among CF instances, in both form and content, is key to mitigating spurious correlations and addressing data distribution disparities.
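To make the idea concrete, here is a minimal Python sketch of LLM-based counterfactual augmentation for extractive QA. Everything in it is illustrative rather than taken from the paper: `QAExample`, `llm_generate`, the prompt wording, and the expected reply format are all hypothetical stand-ins, and the actual CATfOOD pipeline differs in its details.

```python
# Minimal sketch of counterfactual (CF) data augmentation for extractive QA.
# `llm_generate` is a hypothetical stand-in for any LLM completion call, and
# the prompt/reply formats are illustrative, not the paper's exact setup.

from dataclasses import dataclass


@dataclass
class QAExample:
    context: str
    question: str
    answer: str  # answer span taken from `context`


def llm_generate(prompt: str) -> str:
    """Hypothetical LLM call; replace with your provider's client."""
    raise NotImplementedError


def make_counterfactual(example: QAExample) -> QAExample:
    """Ask an LLM for a minimally edited context that changes the answer.

    Keeping the edit minimal (and the question fixed) means the model
    trained on the augmented data cannot rely on surface shortcuts.
    """
    prompt = (
        "Minimally edit the context so the answer to the question changes.\n"
        f"Context: {example.context}\n"
        f"Question: {example.question}\n"
        f"Original answer: {example.answer}\n"
        "Reply with two lines: 'CONTEXT: <edited context>' and "
        "'ANSWER: <new answer span>'."
    )
    reply = llm_generate(prompt)
    # Parse the assumed two-line reply format (illustrative only).
    context_line, answer_line = reply.splitlines()[:2]
    new_context = context_line.removeprefix("CONTEXT: ")
    new_answer = answer_line.removeprefix("ANSWER: ")
    return QAExample(new_context, example.question, new_answer)


def augment(dataset: list[QAExample]) -> list[QAExample]:
    # Train the SLM on the original and counterfactual instances together.
    return dataset + [make_counterfactual(ex) for ex in dataset]
```

The key design point, per the study's finding, is that the CF instances should be diverse in both surface form and content; generating only one narrow style of edit would leave many spurious correlations intact.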