Microsoft Releases Its First 1-bit Model with Over 2 Billion Parameters
On the 17th, it was reported that Microsoft had released BitNet b1.58, a family of 1-bit LLMs with 2 billion parameters, this week. The company claims the new model is more memory-efficient and energy-efficient than mainstream Transformer LLMs and is suited to running on CPUs or smaller hardware platforms. It is the first open-source native 1-bit LLM at the 2-billion-parameter scale, trained on a 4-trillion-token corpus with a context length of 4096 tokens.

In the research team's comparison tests, the BitNet b1.58 3B/3.9B versions occupy 2.22GB and 2.38GB of memory respectively, far less than LLaMA-3B's 7.89GB. In latency, BitNet b1.58 3B/3.9B come in at 1.87ms and 2.11ms respectively, beating LLaMA-3B's 5.07ms. Both BitNet b1.58 versions also surpass LLaMA-3B in perplexity (PPL) and zero-shot accuracy.
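The "1.58" in the model's name comes from its ternary weights: each weight takes one of the values {-1, 0, +1}, which carries log2(3) ≈ 1.58 bits of information, far less than the 16 bits per weight in mainstream FP16/BF16 Transformers. A minimal NumPy sketch of the absmean quantization scheme described in the BitNet b1.58 paper (the function name and array shapes are illustrative, not from Microsoft's code):

```python
import numpy as np

def absmean_ternary_quantize(W, eps=1e-6):
    """Round a weight matrix to ternary values {-1, 0, +1}.

    Weights are first scaled by their mean absolute value (the
    "absmean" scale factor), then rounded and clipped to [-1, 1].
    Returns the ternary matrix and the scale needed to dequantize.
    """
    gamma = np.abs(W).mean()                      # absmean scale factor
    Wq = np.clip(np.round(W / (gamma + eps)), -1, 1)
    return Wq.astype(np.int8), gamma

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)).astype(np.float32)    # toy weight matrix
Wq, gamma = absmean_ternary_quantize(W)
print(Wq)                                          # every entry is -1, 0, or +1
```

Storing each weight in ~1.58 bits instead of 16 is what drives the memory and latency figures below, since matrix multiplies against ternary weights reduce to additions and subtractions.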
—— ithome tw, Microsoft's open-source address
via Fengxiangqi Reference Express - Telegram Channel