llava-llama-3-8b-v1_1-hf is a LLaVA model fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct and CLIP-ViT-Large-patch14-336.

The COCO dataset is used for visual fine-tuning (LoRA with DeepSpeed). The XTuner configs are at: https://github.com/InternLM/xtuner/tree/main/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336

Download the COCO annotations:

wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
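
For readers who prefer to fetch and inspect the annotations from Python rather than with wget, the following is a minimal sketch using only the standard library. The URL is the one given above. The function names (`download_annotations`, `extract_annotations`, `captions_by_image`) are hypothetical helpers introduced here for illustration, and the sketch assumes a captions file in the usual COCO layout (an `images` list with `id` and `file_name`, and an `annotations` list with `image_id` and `caption`). It does not describe how XTuner itself consumes the data.

```python
import json
import urllib.request
import zipfile
from collections import defaultdict
from pathlib import Path

# URL taken from the wget command in this README.
ANNOTATIONS_URL = "http://images.cocodataset.org/annotations/annotations_trainval2017.zip"


def download_annotations(dest_dir, url=ANNOTATIONS_URL):
    """Download the annotations archive into dest_dir, skipping it if already present."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    zip_path = dest_dir / url.rsplit("/", 1)[-1]
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))
    return zip_path


def extract_annotations(zip_path, dest_dir):
    """Unpack the archive and return the paths of the extracted JSON files."""
    dest_dir = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    return sorted(dest_dir / name for name in names if name.endswith(".json"))


def captions_by_image(caption_json_path):
    """Group captions by image file name (assumes the COCO captions layout)."""
    with open(caption_json_path, encoding="utf-8") as f:
        data = json.load(f)
    id_to_file = {img["id"]: img["file_name"] for img in data["images"]}
    grouped = defaultdict(list)
    for ann in data["annotations"]:
        grouped[id_to_file[ann["image_id"]]].append(ann["caption"])
    return dict(grouped)
```

A typical call sequence would be `zip_path = download_annotations("data/coco")` followed by `extract_annotations(zip_path, "data/coco")`, then passing one of the returned captions files to `captions_by_image`. Only captions files have a `caption` field, so other annotation files in the archive would need different handling.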