Edit model card

Model Card for Llama2-7b-chat-HF-NF4

Model Details

This model is a NF4 quantized version of the meta-llama/Llama-2-7b-chat-hf model.

  • Developed by: Ted Whooley
  • Library: Transformers, NF4
  • Model type: llama
  • Model name: Llama2-7b-chat-HF-NF4
  • Pipeline tag: text-generation
  • Qunatized by: twhoool02
  • Language(s) (NLP): en
  • License: other
Downloads last month
5
Safetensors
Model size
3.6B params
Tensor type
F32
FP16
U8
Inference Examples
Inference API (serverless) is not available, repository is disabled.

Model tree for twhoool02/Llama2-7b-chat-HF-NF4

Quantized
this model