Update README.md
README.md CHANGED
@@ -17,7 +17,7 @@ This is a multimodal implementation of [Phi2](https://huggingface.co/microsoft/p
 2. Vision Tower: [clip-vit-large-patch14-336](https://huggingface.co/openai/clip-vit-large-patch14-336)
 4. Pretraining Dataset: [LAION-CC-SBU dataset with BLIP captions (200k samples)](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain)
 5. Finetuning Dataset: [Instruct 150k dataset based on COCO](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K)
-6. Finetuned Model: [
+6. Finetuned Model: [Navyabhat/Llava-Phi2](https://huggingface.co/Navyabhat/Llava-Phi2)
 
 
 ### Model Sources
@@ -26,7 +26,7 @@ This is a multimodal implementation of [Phi2](https://huggingface.co/microsoft/p
 
 - **Original Repository:** [Llava-Phi](https://github.com/zhuyiche/llava-phi)
 - **Paper [optional]:** [LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model](https://arxiv.org/pdf/2401.02330)
-- **Demo [optional]:** [Demo Link](https://huggingface.co/spaces/
+- **Demo [optional]:** [Demo Link](https://huggingface.co/spaces/Navyabhat/MultiModal-Phi2)
 
 
 ## How to Get Started with the Model
@@ -47,7 +47,7 @@ pip install -e .
 3. Run the Model
 ```bash
 python llava_phi/eval/run_llava_phi.py --model-path="RaviNaik/Llava-Phi2" \
---image-file="https://huggingface.co/
+--image-file="https://huggingface.co/Navyabhat/Llava-Phi2/resolve/main/people.jpg?download=true" \
 --query="How many people are there in the image?"
 ```
 
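Applied to a fresh checkout, the install and run steps this change touches can be collected into one small script. This is a minimal sketch assuming the `zhuyiche/llava-phi` repo layout named in the card; the `DRY_RUN` guard and the variable names are illustrative additions, not part of the README:

```shell
#!/bin/sh
# Sketch of the card's get-started flow. Script path, model path, image URL,
# and query come from the README diff; DRY_RUN is our hypothetical addition
# so the command can be inspected without downloading model weights.
set -e

MODEL_PATH="RaviNaik/Llava-Phi2"
IMAGE_FILE="https://huggingface.co/Navyabhat/Llava-Phi2/resolve/main/people.jpg?download=true"
QUERY="How many people are there in the image?"

CMD="python llava_phi/eval/run_llava_phi.py --model-path=$MODEL_PATH --image-file=$IMAGE_FILE --query=\"$QUERY\""

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: just print the command for inspection.
  echo "$CMD"
else
  # Full run: clone the repo, install it in editable mode, then run inference.
  git clone https://github.com/zhuyiche/llava-phi
  cd llava-phi
  pip install -e .
  eval "$CMD"
fi
```

By default the script only prints the assembled command; setting `DRY_RUN=0` performs the actual clone, install, and inference run, which downloads the checkpoint on first use.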