3 months ago | Hugging Face Daily Papers | DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization?
3 months ago | AI News CN (Telegram) - English Translation | Testing found that ChatGPT reflected human decision-making biases in nearly half of the scenarios.
3 months ago | Hugging Face Daily Papers | The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
3 months ago | Hugging Face Daily Papers | C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing
3 months ago | Hugging Face Daily Papers | Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction
3 months ago | Hugging Face Daily Papers | VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
3 months ago | Hugging Face Daily Papers | CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy
3 months ago | Hugging Face Daily Papers | VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning
3 months ago | Hugging Face Daily Papers | Scaling Laws for Native Multimodal Models
3 months ago | Hugging Face Daily Papers | SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
3 months ago | Hugging Face Daily Papers | VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
3 months ago | Hugging Face Daily Papers | SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
3 months ago | Hugging Face Daily Papers | ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness