metadata
license: apache-2.0
language:
- en
base_model:
- deepseek-ai/deepseek-vl-1.3b-chat
pipeline_tag: image-to-text
Deepseek-VL-1.3b-chat-4bit
Overview
Deepseek-VL-1.3b-chat-4bit is a multimodal model that combines visual and linguistic processing. It is a 4-bit quantized version of deepseek-ai/deepseek-vl-1.3b-chat; quantization substantially reduces the model's size while aiming to preserve the base model's output quality.
Model Details
- Model Type: Multimodal Causal Language Model
- Base Model Size: 1.3 billion parameters
- Quantized Size: Approximately 1.72 GB (reduced from the original model size)
- Files Included:
  - config.json: Model configuration file.
  - model.safetensors: The quantized model weights.
  - preprocessor_config.json: Configuration for the preprocessor.
  - processor_config.json: Configuration for the processor.
  - special_tokens_map.json: Mapping for special tokens used in the tokenizer.
  - tokenizer.json: Tokenizer configuration.
  - tokenizer_config.json: Additional tokenizer settings.
Quantization
Quantization reduces model size, and can improve inference speed, by storing weights at lower precision. Here the model was quantized to 4 bits, meaning each quantized weight is represented with 4 bits instead of the typical 16 or 32 bits. This results in:
- Size Reduction: The model size has been reduced from several gigabytes to approximately 1.72 GB.
- Performance: The quantized model is intended to retain most of the base model's accuracy while using less memory, making it suitable for deployment in environments with limited resources.
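To make the idea of 4-bit weights concrete, here is a minimal sketch of symmetric absmax quantization in plain Python. It illustrates the general technique only; this card does not state which quantization scheme was used for this model, so the function names (`quantize_4bit`, `dequantize_4bit`, `pack_nibbles`, `packed_size_bytes`) and the choice of a single scale per group are assumptions made for illustration.

```python
def quantize_4bit(weights):
    """Symmetric absmax quantization: map floats to integer codes in [-7, 7]."""
    # One scale per group of weights; fall back to 1.0 if all weights are zero.
    scale = max(abs(w) for w in weights) / 7 or 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_4bit(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


def pack_nibbles(codes):
    """Store two 4-bit codes per byte (codes are offset by 8 to be unsigned)."""
    if len(codes) % 2:
        codes = codes + [0]  # pad odd-length input with a zero code
    return bytes(((a + 8) << 4) | (b + 8) for a, b in zip(codes[::2], codes[1::2]))


def packed_size_bytes(n_params, bits=4):
    """Lower bound on storage if every parameter used `bits` bits."""
    return (n_params * bits + 7) // 8


codes, scale = quantize_4bit([1.75, -0.5, 0.25, 0.0])
print(codes, scale)               # [7, -2, 1, 0] 0.25
print(len(pack_nibbles(codes)))   # 2 bytes for 4 weights
print(packed_size_bytes(1_300_000_000))  # 650000000
```

The last figure is only a lower bound: it ignores the stored scales and any tensors kept at higher precision, so it is not expected to match the 1.72 GB file size exactly.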
Installation
To use the Deepseek-VL-1.3b-chat-4bit model, follow these steps:
- Install the Required Libraries:
pip install transformers huggingface-hub
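Once the libraries are installed, the model files can be fetched and checked against the file list in Model Details. The sketch below uses `snapshot_download` from `huggingface_hub`; the repository id shown in the usage comment is a placeholder, since this card does not state the full repository path, and the helper names `download_model` and `missing_files` are hypothetical.

```python
from pathlib import Path

# File names taken from the "Files Included" list in this model card.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "preprocessor_config.json",
    "processor_config.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_files(model_dir):
    """Return the expected files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def download_model(repo_id, local_dir=None):
    """Download all files of a Hub repository and return the local path."""
    # Imported here so the file check above works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Usage (requires network access; replace the placeholder repository id):
# path = download_model("<namespace>/Deepseek-VL-1.3b-chat-4bit")
# print(missing_files(path))
```

An empty list from `missing_files` means every file named in this card is present locally.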