Hands-on impressions using a Snapdragon X Elite laptop

Local LLMs have exploded in popularity, and with new hardware like the Snapdragon X Elite, it’s finally practical to run powerful AI models entirely on your machine—fast, private, offline, and inexpensive.

In this guide, I’ll walk you through three of the most popular tools for running local models:

  • Ollama

  • LM Studio

  • Microsoft Foundry

For each, I’ll cover:

  • How to install it

  • Minimum hardware requirements

  • Rough model availability

  • Pros and cons

  • My personal performance testing results

All tests were done on:


🧑‍💻 My Test Hardware

Microsoft Surface Laptop, 7th Edition

Component               Details
CPU                     Snapdragon X Elite (X1E80100), 12 cores @ 3.40 GHz
RAM                     32 GB
OS                      Windows 11 Home, Build 26200
GPU / AI Acceleration   DirectX 12 / NPU support
Notes                   ARM-based architecture

This hardware is extremely efficient for local inference—especially for optimized models.


1. Ollama

✔️ “The easiest way to run local LLMs… period.”

Ollama has become the go-to local model runner due to its simplicity. It's lightweight, command-line driven, and supports hundreds of models.


How to Install Ollama (Windows ARM)

  1. Go to: https://ollama.com/download

  2. Download the Windows ARM installer

  3. Run and follow the prompts

  4. Open PowerShell and test:

        ollama run llama3.1

That’s literally it.
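If you prefer scripting over the interactive prompt, Ollama also serves a local REST API on http://localhost:11434 while it’s running. Here’s a minimal PowerShell sketch of a one-shot, non-streaming request (the prompt text is just an example):

    # Ask the local Ollama server for a single completion (non-streaming)
    $body = @{
        model  = "llama3.1"
        prompt = "Explain what an NPU is in one sentence."
        stream = $false
    } | ConvertTo-Json

    $reply = Invoke-RestMethod -Uri "http://localhost:11434/api/generate" `
        -Method Post -Body $body -ContentType "application/json"

    # The generated text comes back in the 'response' field
    $reply.response

This same local endpoint is what most third-party UIs and agent frameworks talk to when they sit on top of Ollama.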


Minimum Hardware Recommendations

Component   Minimum
RAM         8–16 GB
CPU         4–8 cores
Disk        4–20 GB free for models
GPU         Optional (CPU-only is supported)

Ollama is extremely efficient on both Intel/AMD and ARM.


Model Availability

  • ~200+ public models

  • Many variants: Llama, Phi, Mistral, Gemma, Yi, Qwen, CodeGemma, etc.

  • Supports GGUF models

  • Can load custom local models
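That last point is worth showing: a custom GGUF can be registered through a small Modelfile. A minimal sketch, assuming you already have a GGUF file on disk (the filename and model name below are placeholders):

    # Write a one-line Modelfile pointing at a local GGUF (path is a placeholder)
    Set-Content -Path Modelfile -Value "FROM ./my-model-q4_K_M.gguf"

    # Register the model under a name of your choosing, then run it
    ollama create my-model -f Modelfile
    ollama run my-model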


Pros

  • ✔ Ridiculously easy to install

  • ✔ Command-line interface is great for automation

  • ✔ Large community and ecosystem

  • ✔ Supports embeddings and system-level model management

  • ✔ Great for developers, scripting, and agents

Cons

  • ✖ No polished UI (unless you install third-party UIs)

  • ✖ Slower than LM Studio and Foundry in my tests

  • ✖ Some models load slower initially


My Testing Results (Snapdragon X Elite)

  • Easy to install and get running

  • Slower response time than LM Studio and Foundry

  • Stable and reliable, but not the fastest

  • Perfect for scripting, agents, and dev workflows
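Since everything is shell-driven, Ollama slots neatly into scripts and scheduled jobs. Two quick examples (the file name is a placeholder):

    # One-shot, non-interactive prompt (handy inside scripts and CI jobs)
    ollama run llama3.1 "Summarize the trade-offs of running LLMs locally in two sentences."

    # Or pipe a file's contents in as the prompt
    Get-Content .\notes.txt | ollama run llama3.1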


2. LM Studio

✔️ “The most polished local LLM experience.”

LM Studio is a desktop app with a great UI, a built-in model browser, and a fast runtime.


How to Install LM Studio

  1. Visit: https://lmstudio.ai/

  2. Download the Windows installer (ARM support is now available)

  3. Install and open the app

  4. Browse models and click Download + Run

LM Studio handles everything for you.


Minimum Hardware Recommendations

Component   Minimum
RAM         16 GB
CPU         8 cores or better
Disk        10–50 GB, depending on models
GPU/NPU     Optional but beneficial

Model Availability

  • ~600+ models searchable in-app

  • HuggingFace integration

  • Auto-optimized versions for your architecture

  • Great model tagging and recommendations


Pros

  • ✔ Best UI out of the three

  • ✔ Very fast responses on ARM/CPU

  • ✔ Easy for beginners

  • ✔ Built-in chat, system prompts, memory, logs

  • ✔ Auto-downloads correct GGUF format

Cons

  • ✖ Larger install footprint

  • ✖ More RAM-intensive

  • ✖ Less automated for scripting/agents


My Testing Results (Snapdragon X Elite)

  • Easy to install and use

  • More polished interface than Ollama

  • Very fast responses, matching Foundry as the fastest in my tests

  • Feels like ChatGPT running locally

  • Great candidate for daily use


3. Microsoft Foundry

✔️ “A lightning-fast, developer-focused local LLM runtime.”

Microsoft’s local LLM runtime (officially Foundry Local) is now one of the easiest and fastest ways to run models—especially on AI-enabled ARM hardware.

It’s command-line driven and built for developers.


How to Install Foundry (Windows PowerShell)

Open PowerShell and run:

    winget install Microsoft.FoundryLocal

Then test:

    foundry model run phi-4

That’s it. It installs in seconds.
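A few other verbs I leaned on while testing; the exact subcommands may vary slightly between Foundry releases, so check foundry --help on your build:

    # Browse the catalog of models available to run locally
    foundry model list

    # Check whether the local inference service is running and where it listens
    foundry service status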


Minimum Hardware Recommendations

Component   Minimum
RAM         16 GB
CPU         8 cores
GPU/NPU     Optional, but highly optimized for NPU + ARM
Disk        4+ GB

Model Availability

The catalog is smaller than Ollama’s or LM Studio’s, but the models it does offer are heavily optimized for the local hardware.


Pros

  • ✔ Fastest local inference in my testing

  • ✔ Simple one-line installation

  • ✔ Native ARM + NPU acceleration

  • ✔ Command-line friendly

  • ✔ Small footprint

Cons

  • ✖ No GUI

  • ✖ More developer-centric

  • ✖ Smaller model catalog


My Testing Results (Snapdragon X Elite)

  • Simple to install via one PowerShell command

  • Command-line based

  • Extremely fast responses

  • Best performance per watt on ARM/NPU

  • Great for automation and agents
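For automation, the Foundry service exposes an OpenAI-compatible endpoint on the local machine. The sketch below is hedged: the port is a placeholder (use whatever foundry service status reports), and the model id may differ from the catalog alias once loaded.

    # Placeholder endpoint: substitute the one reported by `foundry service status`
    $endpoint = "http://localhost:5273/v1/chat/completions"

    $body = @{
        model    = "phi-4"   # model id may differ on your machine
        messages = @(@{ role = "user"; content = "Give me one fun fact about NPUs." })
    } | ConvertTo-Json -Depth 5

    $reply = Invoke-RestMethod -Uri $endpoint -Method Post -Body $body -ContentType "application/json"
    $reply.choices[0].message.content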

Final Comparison Table

Feature           Ollama              LM Studio                 Foundry
Ease of Install   ★★★★★               ★★★★★                     ★★★★★
UI Quality        ★★☆☆☆               ★★★★★                     ★☆☆☆☆
Speed             ★★★☆☆               ★★★★★                     ★★★★★
Model Variety     ★★★★☆               ★★★★★                     ★★☆☆☆
Developer Tools   ★★★★★               ★★★☆☆                     ★★★★★
Best Use Case     Agents, scripting   Daily chat, general use   High-speed dev + NPU


Conclusion

If you're exploring local AI, any of these tools will work, but they each shine in different ways:

  • Ollama → Best for developers who want a simple, script-friendly tool

  • LM Studio → Best for everyday use with a polished interface

  • Foundry → Best performance on ARM/NPU and ideal for automation

Using the Snapdragon X Elite with 32 GB RAM, all three worked extremely well, with LM Studio and Foundry delivering the fastest response times.

