Microsoft Introduces Mu: A new small language model designed to run locally on neural processing units (NPUs), initially deployed in Windows Settings on Copilot+ PCs. It lets users control system settings using natural language and reduces reliance on cloud-based processing.
- Architecture Details: A 330 million parameter encoder-decoder transformer optimized for edge devices. The encoder processes the input once and the decoder reuses that encoded representation, which lowers latency relative to decoder-only models.
- Performance on Qualcomm Hexagon NPU: Achieves a 47% reduction in first-token latency and nearly five times faster decoding compared to similar-sized decoder-only models. Benefits from rotary positional embeddings, grouped-query attention, dual LayerNorm, and model quantization techniques.
- Fine-tuning for Windows Settings Agent: Fine-tuned on over 3.6 million examples across hundreds of adjustable settings using synthetic data generation, noise injection, prompt tuning, and low-rank adaptation. Typical response times are under 500 milliseconds.
- Availability and Fallback System: Currently available to Windows Insiders in the Dev Channel on Copilot+ devices. When input is unclear, the agent falls back to showing regular search results.
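The fallback behavior described above can be pictured as a simple routing decision. The following is a minimal sketch under stated assumptions, not Microsoft's implementation: the `AgentResult` type, the `route_query` function, the confidence score, the 0.8 threshold, and the two-word minimum are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentResult:
    """Hypothetical output of an on-device settings model."""
    setting: Optional[str]  # e.g. "display.brightness"; None if nothing matched
    confidence: float       # assumed score in the range [0, 1]


def route_query(query: str, result: AgentResult, threshold: float = 0.8) -> str:
    """Return "agent" to act on the model's answer, or "search" to fall back."""
    # Assumption: single-word queries are treated as too ambiguous to act on.
    if len(query.split()) < 2:
        return "search"
    # Assumption: no matched setting or a low score counts as unclear input.
    if result.setting is None or result.confidence < threshold:
        return "search"
    return "agent"
```

In this sketch, a call such as `route_query("increase screen brightness", AgentResult("display.brightness", 0.93))` is routed to the agent, while a one-word query or a low-confidence result is routed to regular search.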
Industry Reactions:
- Michał Choiński: If Mu performs consistently, it could redefine the desktop AI experience.
- Muhammad Akif: If it maintains performance, it could shift the AI narrative from "cloud-first" to "device-smart".
- George Draco: A big leap for on-device AI. Offline speed combined with contextual memory changes how we think about productivity tools.
- Future Plans: Microsoft plans to expand support to more settings categories and improve performance on short queries as Mu becomes a foundation for broader on-device AI capabilities.