Update README.md

README.md
Authors: Jainaveen Sundaram, Ravishankar Iyer

### Training details and Evaluation
We follow the two-step training pipeline outlined in the LLaVa1.5 paper, consisting of two phases: (1) a pre-training phase for feature alignment, followed by (2) end-to-end instruction fine-tuning.

The pre-training phase involves 1 epoch on a filtered subset of 595K Conceptual Captions [2], with only the projection layer weights updated. For instruction fine-tuning, we use 1 epoch of the LLaVa-Instruct-150K dataset, with both the projection layer and LLM weights updated.

For model evaluation, please refer to the linked technical report (coming soon!).
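The two-phase freezing schedule above can be sketched as follows; the parameter-name prefixes and the helper function are purely illustrative, not taken from the actual training code:

```python
# Illustrative sketch of which weights are unfrozen in each training phase.
# The prefixes "vision", "projection", and "llm" are hypothetical names.
def trainable_params(phase, param_names):
    """Return the subset of parameters updated in the given training phase."""
    if phase == "pretrain":
        # Phase 1 (feature alignment): only projection layer weights are updated.
        return [n for n in param_names if n.startswith("projection")]
    if phase == "finetune":
        # Phase 2 (instruction fine-tuning): projection layer and LLM weights.
        return [n for n in param_names if n.startswith(("projection", "llm"))]
    raise ValueError(f"unknown phase: {phase!r}")

params = ["vision.encoder.0", "projection.weight", "projection.bias", "llm.block.0"]
print(trainable_params("pretrain", params))  # ['projection.weight', 'projection.bias']
print(trainable_params("finetune", params))
```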
### How to use

Start off by cloning the repository:

```shell
git clone https://huggingface.co/IntelLabs/LlavaOLMoBitnet1B
cd LlavaOLMoBitnet1B
```
Install all the requirements by following the instructions in requirements.txt (typically `pip install -r requirements.txt`).

You are all set! Run inference by calling:

```shell
python llava_olmo.py
```
To pass in your own query, modify the following lines within the file:

```python
# Define Image and Text inputs
text = "Be concise. What are the four major tournaments of the sport shown in the image?"
url = "https://farm3.staticflickr.com/2157/2439959136_d932f4e816_z.jpg"
```
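For instance, to ask about your own image, those same two lines might become (values are purely illustrative):

```python
# Hypothetical replacement query and image URL -- substitute your own.
text = "Be concise. What animal is shown in the image?"
url = "https://example.com/my_image.jpg"
```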
## Model Sources

arXiv link for the technical report coming soon!
## Ethical Considerations

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights.

| Ethical Considerations | Description |
| ----------- | ----------- |
| Data | The model was trained using the LLaVA-v1.5 data mixture as described above. |
| Human life | The model is not intended to inform decisions central to human life or flourishing. |
| Mitigations | No additional risk mitigation strategies were considered during model development. |
| Risks and harms | This model has not been assessed for harm or biases, and should not be used for sensitive applications where it may cause harm. |
| Use cases | - |
## Citation

Coming soon.

## License