12 days ago | Hugging Face Daily Papers | Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
12 days ago | Hugging Face Daily Papers | SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents
12 days ago | Hugging Face Daily Papers | VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning
12 days ago | AI News CN (Telegram) - English Translation | Nvidia and AMD's "new down-spec AI chips" are about to come out.
12 days ago | AI News CN (Telegram) - English Translation | Nvidia expects the sales restrictions on the H20 to cause an $8 billion loss in Q2.
12 days ago | AI News CN (Telegram) - English Translation | 🤖 AI Express: Latest Developments of DeepSeek, Telegram, Odyssey, etc.
12 days ago | AI News CN (Telegram) - English Translation | Liang Wenfeng remained silent and just kept doing "minor updates".
12 days ago | AI News CN (Telegram) - English Translation | The pioneering mental health chatbot Woebot is about to shut down.
12 days ago | Hugging Face Daily Papers | KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
12 days ago | Hugging Face Daily Papers | Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
12 days ago | Hugging Face Daily Papers | UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning