The Strix Halo platform finally makes local inference of 70B-class models viable on a mini-PC.

Why unified memory changes everything

Discrete-GPU inference forces the model to fit in VRAM, or to shuttle weights across the PCIe bus on every pass. Strix Halo's unified memory pool, up to 128 GB shared between CPU and GPU, removes that constraint: the GPU addresses the whole pool directly, so model data never crosses PCIe. With 96 GB allocated to the GPU, Llama 3 70B at Q4 (roughly 40 GB of weights) fits entirely in memory with room for a large KV cache.
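
To see why 96 GB is comfortable, here is a back-of-the-envelope sketch. The layer and head counts are the published Llama 3 70B figures; the bytes-per-weight for Q4 and the fp16 KV cache are simplifying assumptions, and Ollama's real allocation also includes compute buffers:

```python
# Rough memory budget for Llama 3 70B (Q4) on a 96 GB unified pool.
# Assumptions: ~4.5 bits/weight for a Q4_K-style quant, fp16 KV cache.
PARAMS = 70e9
BYTES_PER_WEIGHT = 0.57            # ~4.5 bits per weight (assumed)
N_LAYERS, N_KV_HEADS, HEAD_DIM = 80, 8, 128  # published Llama 3 70B config
KV_BYTES = 2                       # fp16 entries

weights_gb = PARAMS * BYTES_PER_WEIGHT / 1e9
kv_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES  # K + V
for ctx in (8_192, 32_768, 131_072):
    kv_gb = ctx * kv_per_token / 1e9
    print(f"ctx={ctx:>7}: weights {weights_gb:.0f} GB + KV {kv_gb:.1f} GB "
          f"= {weights_gb + kv_gb:.0f} GB of 96 GB")
```

Even at a 128K-token context, the total lands around 83 GB, which is where the "context to spare" claim comes from.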

LLM throughput

ASUS NUC Pro 14, Ollama throughput

Model              tokens/sec
Llama 3 8B (Q4)          68.4
Llama 3 70B (Q4)         13.2
Phi-3 Mini              142.1
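
If you want to reproduce these figures, Ollama's local HTTP API reports eval_count (generated tokens) and eval_duration (nanoseconds) in each /api/generate response, which is enough to compute throughput. A minimal sketch follows; the prompt and the exact model tags are assumptions, not the harness used for the table:

```python
# Measure generation throughput against a local Ollama server.
import json
import urllib.request

def tokens_per_sec(model: str, prompt: str = "Explain unified memory.") -> float:
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt,
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # eval_duration is in nanoseconds and covers generation only,
    # so model load time does not skew the result.
    return body["eval_count"] / (body["eval_duration"] / 1e9)

for model in ("llama3:8b", "llama3:70b", "phi3:mini"):
    print(f"{model}: {tokens_per_sec(model):.1f} tok/s")
```

A single prompt is noisy; numbers like those in the table would come from averaging several runs after a warm-up request has loaded the model.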

Verdict

If you want to run 70B-class models locally without a dual-RTX rig, the Strix Halo platform is the most pragmatic answer available in 2026.