Update README.md
README.md CHANGED
@@ -96,6 +96,28 @@ print(generated_text)
# wooden deck. The deck's planks, which are a mix of light and dark brown with ...
```

To make inference more efficient, run with autocast:

```python
# `model`, `processor`, and `inputs` come from the quick-start above.
with torch.autocast(device_type="cuda", enabled=True, dtype=torch.bfloat16):
    output = model.generate_from_batch(
        inputs,
        GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
        tokenizer=processor.tokenizer,
    )
```

We did most of our evaluations in this setting (autocast on, but float32 weights).
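
These snippets pick up where the quick-start earlier in the README leaves off. For reference, a minimal self-contained setup might look like the sketch below; the checkpoint name, image path, and prompt are illustrative assumptions, not part of this change:

```python
# Hypothetical setup sketch mirroring the quick-start earlier in the README.
# The checkpoint name, image, and prompt here are illustrative assumptions.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

repo = "allenai/Molmo-7B-D-0924"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = AutoModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

# Process one image + prompt into the batch dict that generate_from_batch expects.
inputs = processor.process(images=[Image.open("photo.jpg")], text="Describe this image.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}
```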

To further reduce memory requirements, the model can be run with bfloat16 weights:

```python
# Cast the weights and the image inputs to bfloat16 before generating.
model.to(dtype=torch.bfloat16)
inputs["images"] = inputs["images"].to(torch.bfloat16)
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
```

Note that this can sometimes change the output of the model compared to running with float32 weights.
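
A quick way to see whether the casts changed anything is to decode both generations and compare them. This is a minimal sketch, not from this commit; `output_fp32` and `output_bf16` are assumed names for the `output` tensors produced by the two snippets above:

```python
# Compare the float32/autocast generation against the bfloat16-weight generation.
# Generated tokens follow the prompt, as in the README's earlier decoding example.
prompt_len = inputs["input_ids"].size(1)

text_fp32 = processor.tokenizer.decode(output_fp32[0, prompt_len:], skip_special_tokens=True)
text_bf16 = processor.tokenizer.decode(output_bf16[0, prompt_len:], skip_special_tokens=True)

print("outputs identical:", text_fp32 == text_bf16)
```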

## Evaluations

| Model | Average Score on 11 Academic Benchmarks | Human Preference Elo Rating |
|