3 months ago | Hugging Face Daily Papers | NileChat: Towards Linguistically Diverse and Culturally Aware LLMs for Local Communities
3 months ago | AI News CN (Telegram) - English Translation | GPT-4o elected as the "Most Sycophantic Model". New Benchmark from Stanford and Oxford: All Large Language Models Are Currying Favor with Humans
3 months ago | Hugging Face Daily Papers | Hard Negative Mining for Domain-Specific Retrieval in Enterprise Systems
3 months ago | Hugging Face Daily Papers | Architectural Backdoors for Within-Batch Data Stealing and Model Inference Manipulation
3 months ago | AI News CN (Telegram) - English Translation | Microsoft's May update package has skyrocketed to 4.3GB: Whether you use it or not, it forces 3GB of AI files on you
3 months ago | Hugging Face Daily Papers | InstructPart: Task-Oriented Part Segmentation with Instruction Reasoning
3 months ago | Hugging Face Daily Papers | TAGS: A Test-Time Generalist-Specialist Framework with Retrieval-Augmented Reasoning and Verification
3 months ago | Hugging Face Daily Papers | Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence
3 months ago | Hugging Face Daily Papers | Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms
3 months ago | Hugging Face Daily Papers | BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases
3 months ago | Hugging Face Daily Papers | First Finish Search: Efficient Test-Time Scaling in Large Language Models
3 months ago | Hugging Face Daily Papers | TokBench: Evaluating Your Visual Tokenizer before Visual Generation
3 months ago | Hugging Face Daily Papers | VideoGameBench: Can Vision-Language Models complete popular video games?