Microsoft Introduces Mu: A Lightweight On-Device Language Model for Windows Settings

Published on June 26

Microsoft Introduces Mu: Mu is a new small language model designed to run locally on NPUs, initially deployed in Windows Settings on Copilot+ PCs. It lets users change system settings using natural language and reduces reliance on cloud-based processing.
Architecture Details: Mu is a 330-million-parameter encoder-decoder transformer optimized for edge devices. The encoder processes the input once and the decoder reuses that encoded representation at each generation step, which lowers latency compared to decoder-only models.
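To make the "encode once, reuse while decoding" idea concrete, here is a minimal toy sketch in Python. It is not Mu's implementation: the class `ToyEncoderDecoder`, its counters, and the stand-in arithmetic are all hypothetical and exist only to show the call pattern, in which the encoder runs a single time and every decoding step reads the same cached result.

```python
from dataclasses import dataclass


@dataclass
class ToyEncoderDecoder:
    """Hypothetical stand-in for an encoder-decoder model (not Mu itself)."""
    encoder_calls: int = 0
    decoder_steps: int = 0

    def encode(self, tokens):
        # Stand-in for a real encoder: one number per input token.
        self.encoder_calls += 1
        return [float(len(t)) for t in tokens]

    def decode_step(self, memory, generated):
        # Stand-in for cross-attention: the step reads the fixed encoder
        # output ("memory") instead of re-processing the raw input.
        self.decoder_steps += 1
        return f"tok{len(generated)}:{int(sum(memory))}"

    def generate(self, tokens, max_new_tokens):
        memory = self.encode(tokens)  # computed once per request
        out = []
        for _ in range(max_new_tokens):
            out.append(self.decode_step(memory, out))
        return out


model = ToyEncoderDecoder()
output = model.generate(["turn", "on", "night", "light"], max_new_tokens=3)
```

After the call, `model.encoder_calls` is 1 while `model.decoder_steps` equals the number of generated tokens, which is the property the architecture description relies on.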
Performance on Qualcomm Hexagon NPU: Mu achieves a 47% reduction in first-token latency and nearly five times faster decoding than similar-sized decoder-only models. It also benefits from rotary positional embeddings, grouped-query attention, dual LayerNorm, and model quantization techniques.
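Of the techniques listed above, grouped-query attention is easy to illustrate with simple arithmetic: several query heads share one key/value head, so fewer keys and values need to be stored. The sketch below shows that general idea only. The head counts, sequence length, and head dimension are illustrative assumptions, not Mu's actual configuration, and both function names are hypothetical.

```python
def kv_head_for(query_head, num_query_heads, num_kv_heads):
    """Return the index of the shared key/value head used by a query head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


def kv_cache_entries(seq_len, num_kv_heads, head_dim):
    """Count stored numbers for keys plus values (hence the factor of 2)."""
    return 2 * seq_len * num_kv_heads * head_dim


# Assumed example sizes: 8 query heads sharing 2 key/value heads.
mapping = [kv_head_for(h, 8, 2) for h in range(8)]
full = kv_cache_entries(seq_len=128, num_kv_heads=8, head_dim=64)
grouped = kv_cache_entries(seq_len=128, num_kv_heads=2, head_dim=64)
```

With these assumed sizes, the grouped layout stores a quarter of the key/value entries that one key/value head per query head would need.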
Fine-tuning for Windows Settings Agent: The model was fine-tuned on over 3.6 million examples covering hundreds of adjustable settings, using synthetic data generation, noise injection, prompt tuning, and low-rank adaptation (LoRA). Typical response times are under 500 milliseconds.
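Low-rank adaptation, one of the fine-tuning methods named above, can be sketched in a few lines. The general technique freezes a weight matrix W and trains two small matrices A (rank x d_in) and B (d_out x rank), so the effective weight becomes W + (alpha / rank) * B A. The example below uses plain Python lists and made-up sizes; it illustrates the technique in general and makes no claim about how it was applied to Mu.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters in the two low-rank factors A and B."""
    return rank * (d_in + d_out)


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / rank) * B @ A, with A: rank x d_in, B: d_out x rank."""
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Assumed sizes for illustration: a 1024 x 1024 layer adapted at rank 8.
full_params = 1024 * 1024
adapter_params = lora_param_count(1024, 1024, rank=8)
```

At these assumed sizes the adapter trains 16,384 parameters instead of 1,048,576, which is why the method suits small, task-specific updates.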
Availability and Fallback System: Mu is currently available to Windows Insiders in the Dev Channel on Copilot+ devices. When input is unclear, the system falls back to showing regular search results.
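One simple way to picture the fallback behavior is as a routing step in front of the model. The sketch below is purely hypothetical: the function `route_query`, the `predict` callback, the confidence score, and the 0.8 threshold are assumptions made for illustration, since the article does not describe how Windows Settings decides that input is unclear.

```python
def route_query(query, predict, threshold=0.8):
    """Apply a predicted setting only when the (assumed) confidence is high.

    `predict` is a hypothetical callable returning (setting_or_None, confidence).
    """
    setting, confidence = predict(query)
    if setting is None or confidence < threshold:
        # Unclear input: fall back to ordinary search results.
        return ("search", query)
    return ("apply", setting)


def stub_predict(query):
    """Toy predictor used only to exercise the routing logic."""
    if "night light" in query:
        return ("display.night_light=on", 0.95)
    return (None, 0.0)


clear = route_query("turn on night light", stub_predict)
unclear = route_query("light", stub_predict)
```

Here the first query is routed to the settings action and the second falls back to search, mirroring the behavior described above.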
Industry Reactions:
- Michał Choiński: If Mu performs consistently, it could redefine the desktop AI experience.
- Muhammad Akif: If it maintains performance, it could shift the AI narrative from "cloud-first" to "device-smart".
- George Draco: A big leap for on-device AI. Offline speed with contextual memory changes how we think about productivity tools.
Future Plans: Microsoft plans to expand support to more settings categories and improve performance on short queries as Mu becomes a foundation for broader on-device AI capabilities.
