GPT-4o Named the "Most Sycophantic Model": New Benchmark from Stanford and Oxford Finds All Large Language Models Curry Favor with Humans
via cnBeta.COM Chinese Industry News Site - Telegram Channel
Researchers from institutions including Stanford University and the University of Oxford have proposed ELEPHANT, a new benchmark for measuring sycophantic behavior in models, and used it to evaluate eight mainstream foreign models, including GPT-4o, Gemini 1.5 Flash, and Claude 3.7 Sonnet. The results showed that GPT-4o was the "most sycophantic model", while Gemini 1.5 Flash behaved the most normally. More interestingly, the researchers also found that models can amplify biased behaviors present in the dataset. What exactly happened? Let's take a look.

A new benchmark for measuring sycophantic behavior

Right from the start, the paper pointed out the existing...