4 days ago | Hugging Face Daily Papers | VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Videos
4 days ago | Hugging Face Daily Papers | FreeTimeGS: Free Gaussians at Anytime and Anywhere for Dynamic Scene Reconstruction
4 days ago | Hugging Face Daily Papers | SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs
4 days ago | Hugging Face Daily Papers | MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
4 days ago | Hugging Face Daily Papers | AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs
4 days ago | Hugging Face Daily Papers | Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting
4 days ago | Hugging Face Daily Papers | SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
4 days ago | Hugging Face Daily Papers | EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
4 days ago | AI News CN (Telegram) - English Translation | Reddit sues Anthropic for breach of contract and unfair competition
4 days ago | Hugging Face Daily Papers | Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning
4 days ago | AI News CN (Telegram) - English Translation | Google has postponed the launch of its "Ask Your Photos" AI search feature.