This page aims for the useful middle ground between “yes, technically possible” and “yes, you would actually want to use it”. It uses conservative, approximate engineering judgement to estimate which kinds of local AI workloads make sense on a Windows PC, a Linux PC, a Mac, and a Raspberry Pi 5 with 16 GB of RAM.
It is an explainer and an estimator. The memory bands are indicative, not guarantees. A model that loads is not automatically a model that feels good to use.
Given the same NVIDIA GPU, Windows and Linux typically support similar model sizes. Linux tends to have a stronger local AI tooling culture, while Windows is often more convenient for mixed general desktop use.
Unified memory can sometimes allow larger models to load than a small-VRAM discrete GPU would, but speed and software compatibility can differ.
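As a rough illustration of why, here is a back-of-envelope sketch of weight memory for quantized models. The bits-per-weight figure and the headroom factors are assumptions for illustration, not measurements from any specific runtime.

```python
# Back-of-envelope sketch: weight memory for a quantized model vs. available memory.
# The constants (bits per weight, headroom factors) are illustrative assumptions.

def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (ignores KV cache and runtime overhead)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

for params in (7, 13, 34):
    gb = weight_memory_gb(params, bits_per_weight=4.5)  # ~4-bit quantization plus metadata
    fits_8gb_gpu = gb < 8 * 0.9        # leave ~10% headroom on an 8 GB discrete card
    fits_16gb_unified = gb < 16 * 0.7  # unified memory is shared with the OS and apps
    print(f"{params}B: ~{gb:.1f} GB weights | 8 GB GPU: {fits_8gb_gpu} | 16 GB unified: {fits_16gb_unified}")
```

On these assumptions a ~13B model at roughly 4-bit quantization clears a 16 GB unified-memory budget but not an 8 GB discrete card, which is the pattern the paragraph above describes.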
Tiny LLMs, lightweight speech models, and some local transcription can be feasible. Serious image generation is strained; serious video generation is usually a bad fit.
Likely sensible for normal local use with workable speed and some headroom.
Usable with compromises such as shorter context, lower resolution, or patience.
Technically possible, but often slow, awkward, or compromised enough to be a hobby project.
Either very poor on the chosen hardware or simply not a sensible local target.
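The four bands above are a judgement call rather than a formula, but a minimal sketch of the kind of heuristic behind them might look like the following. The thresholds and labels are assumptions chosen to mirror the band descriptions, not the exact rules used by this page.

```python
# Minimal sketch of a band heuristic: compare an estimated memory footprint
# to the memory available on the target machine. Thresholds are illustrative.

def band(estimated_gb: float, available_gb: float) -> str:
    ratio = estimated_gb / available_gb
    if ratio <= 0.5:
        return "sensible"                    # workable speed and some headroom
    if ratio <= 0.75:
        return "usable with compromises"     # shorter context, lower resolution, patience
    if ratio <= 1.0:
        return "hobby project"               # loads, but often slow or awkward
    return "not a sensible local target"

print(band(estimated_gb=6.5, available_gb=16))   # e.g. a ~13B 4-bit model on 16 GB
print(band(estimated_gb=20.0, available_gb=16))  # does not fit at all
```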
Showing heuristic results for the current platform and memory target.
Treat the Pi 5 as a low-power local inference box, not as a desktop AI workstation. It can do useful work, but the envelope is much smaller.
The dataset is intentionally curated rather than exhaustive. It uses representative model families that are commonly discussed for local use, and it keeps the memory bands indicative rather than precise.
The practical bands on this page are deliberately more conservative than raw model-card claims.
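One way to see why the bands run higher than model-card sizes is to pad the raw weight figure for the things a model card does not count. The multipliers below are illustrative assumptions, not the exact padding used by this page.

```python
# Sketch: turn a raw model-card size into a conservative practical band by
# padding for KV cache, runtime buffers, and OS/app headroom.
# All multipliers are illustrative assumptions.

def practical_band_gb(model_card_gb: float,
                      kv_cache_factor: float = 1.2,    # context-dependent KV cache
                      runtime_overhead: float = 1.15,  # framework buffers, fragmentation
                      system_headroom: float = 1.1):   # OS, browser, other apps
    low = model_card_gb * kv_cache_factor
    high = model_card_gb * kv_cache_factor * runtime_overhead * system_headroom
    return low, high

low, high = practical_band_gb(8.0)  # a model card claiming ~8 GB
print(f"practical band: roughly {low:.1f}-{high:.1f} GB")
```

Under these assumptions a model advertised at ~8 GB lands in a practical band of roughly 10 to 12 GB, which is why a machine with nominally enough memory can still feel cramped.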