Is Microsoft’s New Mu For You?

Microsoft this week announced a new generative AI (genAI) system called Mu, and it’s a true glimpse into the future of how we’ll use everything from PCs to toasters.
Mu lets people control their computers using plain language. For example, you can type or say, “turn on dark mode” or “make my mouse pointer bigger,” and the computer will do it. The first place Mu appears is in the Windows 11 Settings app. You say or type how you want a specific setting to change, and the genAI tool figures out what you want and makes the change for you.
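To make that concrete, here is a deliberately toy sketch of the pattern: a plain-language request gets mapped to a structured settings change. The function name, setting keys, and rules below are invented for illustration; Microsoft hasn’t published Mu’s actual interface, and the real model is learned rather than rule-based.

```python
# Hypothetical illustration only: a rule-based stand-in for the on-device
# model that turns a plain-language request into a structured settings action.
# None of these names come from Microsoft.

def interpret_request(text: str) -> dict:
    """Map a natural-language request to a settings action (toy example)."""
    text = text.lower()
    if "dark mode" in text:
        return {"setting": "system.theme", "value": "dark"}
    if "pointer" in text and ("bigger" in text or "larger" in text):
        return {"setting": "accessibility.pointer_size", "value": "large"}
    return {"setting": None, "value": None}  # fall back to ordinary search

print(interpret_request("turn on dark mode"))
# {'setting': 'system.theme', 'value': 'dark'}
```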
Crucially, this isn’t a large language model (LLM) running in the cloud. Mu is a small language model (SLM) with a comparatively paltry 330 million parameters, built to run on a specialized AI chip called a neural processing unit, or NPU. (This chip is found in the latest Copilot+ PCs from Microsoft, Dell, HP, Lenovo, Samsung, and Acer. These new PCs started shipping in June 2024 and are the only computers that can use Mu and other advanced AI features in Windows 11.)
In other words, Mu runs entirely on the PC and keeps working even when the machine is disconnected from the internet.
Microsoft Copilot+ PCs can run Mu because they have an NPU that can handle at least 40 trillion operations per second. Microsoft collaborated with Qualcomm, AMD, and Intel to ensure Mu runs smoothly on their NPUs, which are now standard in all Copilot+ PCs.
Mu uses a transformer encoder-decoder design, which means it splits the work into two parts. The encoder takes your words and turns them into a compressed form. The decoder takes that form and produces the correct command or answer.
This split makes the design more efficient than a comparable decoder-only model for short, focused tasks such as changing settings, because the input only has to be encoded once. Mu has 32 encoder layers and 12 decoder layers, a balance chosen to fit the NPU’s memory and speed limits. The model uses rotary positional embeddings to track word order, dual LayerNorm to keep it stable, and grouped-query attention to cut memory use. Together, these choices let Mu process more than 100 tokens per second and respond in less than 500 milliseconds.
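Here is a back-of-envelope sketch of a model with that shape. The layer counts (32 encoder, 12 decoder) come from the article; the hidden width, feed-forward multiplier, and vocabulary size are assumed values, not published Mu figures, chosen only to show how such a configuration lands near the 330-million-parameter mark.

```python
# Rough parameter-count estimate for an encoder-decoder model of Mu's shape.
# Assumed values are marked; this is not Microsoft's published configuration.

from dataclasses import dataclass

@dataclass
class MuLikeConfig:
    encoder_layers: int = 32
    decoder_layers: int = 12
    d_model: int = 768        # assumed hidden width
    ffn_mult: int = 4         # assumed feed-forward expansion
    vocab_size: int = 32_000  # assumed vocabulary size

def rough_param_count(cfg: MuLikeConfig) -> int:
    """Approximate weight count; ignores biases and layer norms, and the
    savings grouped-query attention gives on the K/V projections."""
    d = cfg.d_model
    attn = 4 * d * d                        # Q, K, V, and output projections
    ffn = 2 * cfg.ffn_mult * d * d          # up- and down-projections
    encoder = cfg.encoder_layers * (attn + ffn)
    decoder = cfg.decoder_layers * (2 * attn + ffn)  # self- plus cross-attention
    embeddings = cfg.vocab_size * d
    return encoder + decoder + embeddings

print(f"~{rough_param_count(MuLikeConfig()) / 1e6:.0f}M weights")  # ~364M
```

With these assumed numbers the estimate comes out around 364 million, the same ballpark as the 330 million Microsoft reports; the notable design choice is that most of the depth sits in the encoder while the decoder stays shallow and fast.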
Compared with LLM-based chatbots like OpenAI’s ChatGPT, Mu is super fast.
Microsoft trained Mu on 3.6 million examples focused on Windows settings and related tasks. The training happened on Azure using NVIDIA A100 GPUs. After training, Microsoft fine-tuned Mu and used quantization to shrink its memory needs, so it would run well on NPUs from all three chipmakers. As a result, Mu is about one-tenth the size of Microsoft’s Phi-3.5-mini model, but performs almost as well for the tasks it was built to do.
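Quantization is essentially a bytes-per-weight trade. The short sketch below shows why it matters for a 330-million-parameter model; the precision options are generic examples, not a statement of what format Microsoft actually shipped.

```python
# Memory footprint of 330M weights at different precisions. The parameter
# count comes from the article; the precisions listed are generic examples.

PARAMS = 330_000_000
BYTES_PER_WEIGHT = {"float32": 4, "float16": 2, "int8": 1, "int4": 0.5}

for precision, nbytes in BYTES_PER_WEIGHT.items():
    megabytes = PARAMS * nbytes / 1e6
    print(f"{precision:>7}: ~{megabytes:,.0f} MB of weights")
# float32: ~1,320 MB ... int4: ~165 MB -- small enough to share an NPU's
# memory budget with everything else the PC is doing.
```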
Mu is truly groundbreaking because it is the first SLM built to let users control system settings using natural language, running entirely on a mainstream shipping device. Apple’s iPhones, iPads, and Macs all have a Neural Engine NPU and run on-device AI for features like Siri and Apple Intelligence. But Apple does not have a small language model as deeply integrated with system settings as Mu. Siri and Apple Intelligence can change some settings, but not with the same range or flexibility.
Samsung’s Galaxy S25 and other recent flagship phones feature a custom NPU and Galaxy AI, which can perform various device control and personal assistant tasks. However, they too lack an SLM for comprehensive system settings control.
Google’s Chromebook Plus devices have an NPU and support on-device AI, but they don’t use an SLM for system settings in the way Mu does.
By processing data directly on the device, Mu keeps personal information private and responds instantly. This shift also makes it easier to comply with privacy laws in places like Europe and the US since no data leaves your computer.
The industry is moving in this direction for obvious reasons. SLMs are now powerful enough to handle focused tasks on par with larger cloud-based models. They are cheaper to run, use less energy, and can be tailored for specific jobs or languages.
Note that NPUs are not rare. They’re currently available in new phones, tablets, and even home appliances. These chips are designed to run neural networks efficiently and with low power, making it possible to offer smart features that work anywhere, even without a reliable internet connection.
Most importantly, SLMs running on NPUs are a BFD — not just for PCs, phones, and tablets, but for everything. As the power and capabilities go up and the costs come down, we can expect car dashboards, thermostats, washing machines, tractors, and everything else (including toasters) to eschew nested menus for user control in favor of voice-controlled settings.
You’ll walk into the kitchen and tell the toaster to toast your bagel lightly in about 20 minutes before telling the coffee maker to make you a flat white. After breakfast, you’ll go into your home office and remotely control all manner of IoT devices and other objects by talking to an SLM dedicated to each device.
Note that these SLMs for device control will also work directly with LLMs for information and other actions, like writing code, building websites and apps, and facilitating all your business communications. That SLM you’ll be talking to will mainly live and execute locally on your smart glasses.
You may never own or use a Copilot+ PC. But you will definitely use something like Mu every day for most of your professional and personal life on many devices. It’s a true glimpse of the future of how we interact with machines.