https://www.scribd.com/document/1013175958/When-Summaries-Lie-A-Case-study-of-Models-That-Summarize-Well-but-Fail-to-Admit-Ignorance-147755
AI hallucination benchmarking has emerged as a critical dimension for evaluating large language models, moving beyond traditional metrics such as perplexity and BLEU scores.