Available snaps

This page lists the available inference snaps and the hardware optimizations each one provides.

DeepSeek R1

DeepSeek R1 is a reasoning large language model (LLM) intended mainly for chat completions. Input and output are text-based.

This inference snap is optimized for the following hardware:

Arch	Optimization	Description
amd64	Intel GPU	Optimized for Intel integrated and discrete graphics
amd64	Intel NPU	Intel Neural Processing Unit acceleration
amd64	Intel CPU	Intel-specific CPU optimizations
amd64	NVIDIA GPU	CUDA-enabled GPU acceleration
arm64	Ampere Altra/One CPUs	Optimized for Ampere processors

Once the snap is installed, use the list-engines and show-engine commands to explore the available engines.
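
As a rough illustration of that exploration step, the Python sketch below builds and runs the two commands. It assumes that the snap exposes a command named after itself (`deepseek-r1` here) and that show-engine takes an engine name as its argument. Neither detail is stated on this page, so treat both as assumptions. The helpers `engine_command` and `run` are hypothetical names, and `intel-gpu` is only a placeholder engine name.

```python
import shutil
import subprocess

# Assumption: the snap installs a command with the same name as the snap.
SNAP_COMMAND = "deepseek-r1"


def engine_command(subcommand, *args, snap=SNAP_COMMAND):
    """Return the argument list for a snap subcommand such as list-engines."""
    return [snap, subcommand, *args]


def run(argv):
    """Run argv and return its stdout, or None if the command is not on PATH."""
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # List the engines the snap provides.
    print(run(engine_command("list-engines")))
    # Assumption: show-engine accepts an engine name; "intel-gpu" is a placeholder.
    print(run(engine_command("show-engine", "intel-gpu")))
```

Passing `snap="gemma3"` (or another snap's command name, if it follows the same assumed convention) to `engine_command` reuses the sketch for the other snaps on this page.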

Gemma 3

Gemma 3 is a large language model that accepts text and image inputs and produces text-based output.

This inference snap is optimized for the following hardware:

Arch	Optimization	Description
amd64	Generic CPU	Optimized for several x86 CPU variants
amd64	Intel CPU	Optimized for best performance on Intel CPUs
amd64	Intel GPU	Optimized for Intel integrated and discrete graphics
amd64	NVIDIA GPU	CUDA-enabled GPU acceleration
arm64	Generic CPU	Optimized for armv8 and armv9 CPUs
arm64	NVIDIA GPU	CUDA-enabled GPU acceleration on arm64 platforms
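
To check which rows of a table like this one apply to a given machine, the Arch column can be matched against the host architecture. The sketch below is illustrative only. It copies the Gemma 3 rows into a list and maps the names that Python's `platform.machine()` typically reports on Linux (`x86_64`, `aarch64`) to the Debian-style names used in the Arch column. The helper names are hypothetical, and a matching row does not mean the listed hardware is actually present.

```python
import platform

# Rows copied from the Gemma 3 table as (arch, optimization) pairs.
GEMMA3_OPTIMIZATIONS = [
    ("amd64", "Generic CPU"),
    ("amd64", "Intel CPU"),
    ("amd64", "Intel GPU"),
    ("amd64", "NVIDIA GPU"),
    ("arm64", "Generic CPU"),
    ("arm64", "NVIDIA GPU"),
]

# Kernel machine names mapped to the Debian-style names in the Arch column.
MACHINE_TO_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def snap_arch(machine=None):
    """Return the Arch column value for a machine name, or None if unknown."""
    machine = (machine or platform.machine()).lower()
    return MACHINE_TO_ARCH.get(machine)


def optimizations_for(arch, rows=GEMMA3_OPTIMIZATIONS):
    """Return the optimizations listed for the given architecture."""
    return [opt for row_arch, opt in rows if row_arch == arch]


if __name__ == "__main__":
    # Show the rows that match the machine running this script.
    print(optimizations_for(snap_arch()))
```

For example, `optimizations_for("arm64")` returns `['Generic CPU', 'NVIDIA GPU']`, and an architecture that is not in the table yields an empty list.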

Once the snap is installed, use the list-engines and show-engine commands to explore the available engines.

Nemotron 3 Nano

Nemotron 3 Nano is an LLM designed for both reasoning and non-reasoning tasks. The input and output are text-based.

The inference snap for Nemotron 3 Nano is optimized for the following hardware:

Arch	Optimization	Description
amd64	Generic CPU	Optimized for several x86 CPU variants
amd64	NVIDIA GPU	CUDA-enabled GPU acceleration on x86 platforms
arm64	Generic CPU	Optimized for armv8 and armv9 CPUs
arm64	NVIDIA GPU	CUDA-enabled GPU acceleration on arm64 platforms

Once the snap is installed, use the list-engines and show-engine commands to explore the available engines.

Qwen VL

Qwen VL is a vision language model that processes both visual and textual data. The input can combine an image and text, and the output is text-based.

The inference snap for Qwen 2.5 VL is optimized for the following hardware:

Arch	Optimization	Description
amd64	Intel GPU	Optimized for Intel integrated and discrete graphics
amd64	Intel NPU	Intel Neural Processing Unit acceleration
amd64	Intel CPU	Intel-specific CPU optimizations
amd64	NVIDIA GPU	CUDA-enabled GPU acceleration
arm64	Ampere Altra/One CPUs	Optimized for Ampere processors

Once the snap is installed, use the list-engines and show-engine commands to explore the available engines.