21 days ago | Hugging Face Daily Papers | Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint
21 days ago | Hugging Face Daily Papers | LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
21 days ago | Hugging Face Daily Papers | DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning
21 days ago | Hugging Face Daily Papers | Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
21 days ago | Hugging Face Daily Papers | ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS
21 days ago | Hugging Face Daily Papers | AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views
21 days ago | Hugging Face Daily Papers | VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos
21 days ago | Hugging Face Daily Papers | GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
21 days ago | Hugging Face Daily Papers | VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models