license: apache-2.0
language:
  - en
base_model:
  - deepseek-ai/deepseek-vl-1.3b-chat
pipeline_tag: image-to-text

Deepseek-VL-1.3b-chat-4bit

Overview

Deepseek-VL-1.3b-chat-4bit is a multimodal model that takes images and text as input and generates text. It is a 4-bit quantized version of deepseek-ai/deepseek-vl-1.3b-chat; quantization reduces the size of the weights on disk and in memory while aiming to preserve the base model's output quality.

Model Details

Model Type: Multimodal Causal Language Model
Base Model Size: 1.3 billion parameters
Quantized Size: Approximately 1.72 GB (reduced from the size of the original checkpoint)
Files Included:
- config.json: Model configuration file.
- model.safetensors: The quantized model weights.
- preprocessor_config.json: Configuration for the preprocessor.
- processor_config.json: Configuration for the processor.
- special_tokens_map.json: Mapping for special tokens used in the tokenizer.
- tokenizer.json: Serialized tokenizer (vocabulary and tokenization rules).
- tokenizer_config.json: Additional tokenizer settings.
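
As a quick sanity check after downloading, the file list above can be compared against a local copy of the repository. This is a minimal sketch, not part of the model's tooling: the helper name `missing_files` and the local path are hypothetical, and only the file names come from the list above.

```python
from pathlib import Path

# File names taken from the "Files Included" list above.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "preprocessor_config.json",
    "processor_config.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "./Deepseek-VL-1.3b-chat-4bit" is a hypothetical local path; adjust as needed.
    print(missing_files("./Deepseek-VL-1.3b-chat-4bit"))
```

An empty list means every listed file was found; any names returned point to an incomplete download.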

Quantization

Quantization reduces model size, and can improve inference speed, by storing weights at lower numerical precision. This model was quantized to 4 bits, so each weight is represented with 4 bits instead of the usual 16 or 32. This results in:

Size Reduction: The model size has been reduced from several gigabytes to approximately 1.72 GB.
Performance: The quantized model is intended to retain most of the base model's accuracy while using less memory, making it suitable for deployment in resource-constrained environments.
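
To make the idea of "4 bits per weight" concrete, the sketch below does two things: it estimates raw weight storage at a given bit width, and it runs a toy symmetric absmax quantization round trip. Both are illustrations under stated assumptions. The quantization scheme shown is a generic one and is not necessarily the method used to produce this checkpoint, and the size estimate ignores scaling metadata and any tensors kept at higher precision, so it will not match the actual file size.

```python
def estimate_weight_bytes(num_params, bits_per_weight):
    """Raw storage for the weights alone (no scales, zero points, or metadata)."""
    return num_params * bits_per_weight // 8


def quantize_absmax_4bit(weights):
    """Map floats to signed 4-bit codes in [-7, 7] using one shared scale.

    Generic symmetric absmax scheme, shown for illustration only.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    # 1.3 billion is the base model size stated in this card.
    params = 1_300_000_000
    for bits in (32, 16, 4):
        print(f"{bits:>2}-bit weights: {estimate_weight_bytes(params, bits) / 1e9:.2f} GB")

    codes, scale = quantize_absmax_4bit([0.7, -0.3, 0.0, 0.1])
    print(codes, dequantize(codes, scale))
```

The round trip shows where quantization error comes from: each weight is snapped to one of a small number of levels, so the recovered values only approximate the originals.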

Installation

To use the Deepseek-VL-1.3b-chat-4bit model, follow these steps:

Install the Required Libraries:

pip install transformers huggingface-hub
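
After installing, it can help to confirm that both packages are visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package names are the ones from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the pip command above.
REQUIRED_PACKAGES = ["transformers", "huggingface-hub"]


def installed_versions(packages=REQUIRED_PACKAGES):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```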