Model Depot - ONNX
Collection
Leading models packaged in ONNX format, optimized for use with AI PCs
20 items
llama-3.2-3b-instruct-onnx is an int4-quantized ONNX version of Llama 3.2 3B Instruct. It provides a small, fast inference implementation optimized for AI PCs using Intel GPU, CPU, and NPU.
llama-3.2-3b-instruct is a new 3B-parameter chat foundation model from Meta.
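
To give a rough sense of why int4 quantization makes a 3B model small, the sketch below estimates weight storage at 16 and 4 bits per weight and runs a toy symmetric int4 round trip. It rests on assumptions not stated in the model card: the parameter count is taken as a nominal 3,000,000,000, per-block scales and any unquantized layers are ignored, and the helper names (`weight_footprint_gib`, `quantize_int4`, `dequantize_int4`) are hypothetical. The quantization scheme shown is a generic textbook one and may differ from what this ONNX package actually uses.

```python
def weight_footprint_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


def quantize_int4(values):
    """Toy symmetric quantization of a list of floats to the int4 range [-8, 7]."""
    # One shared scale for the whole list; fall back to 1.0 if all values are zero.
    scale = max(abs(v) for v in values) / 7 or 1.0
    quantized = [max(-8, min(7, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int4(quantized, scale):
    """Map int4 codes back to approximate float values."""
    return [q * scale for q in quantized]


# Assumption: nominal "3B" parameters; the real count is not given in the card.
NUM_PARAMS = 3_000_000_000

fp16_gib = weight_footprint_gib(NUM_PARAMS, 16)
int4_gib = weight_footprint_gib(NUM_PARAMS, 4)
print(f"fp16 weights: ~{fp16_gib:.2f} GiB")
print(f"int4 weights: ~{int4_gib:.2f} GiB")

# Round trip on a few made-up weights to show the precision that is traded away.
weights = [0.7, -0.35, 0.1, 0.0]
codes, scale = quantize_int4(weights)
restored = dequantize_int4(codes, scale)
print(codes, restored)
```

Under these assumptions the int4 weights take one quarter of the fp16 storage, at the cost of a rounding error of at most about half the scale per weight.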
Base model
meta-llama/Llama-3.2-3B-Instruct