Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges
Published in arXiv, 2024
In this work, we benchmark several leading data contamination detection approaches on state-of-the-art LLMs to provide insight into the strengths and limitations of each approach.