Update README.md
README.md
CHANGED
@@ -29,7 +29,7 @@ We evaluate the model on [RewardBench](https://github.com/allenai/reward-bench):
 | Model | Score | Chat | Chat Hard | Safety | Reasoning | Prior Sets (0.5 weight) |
 |------------------|-------|-------|-----------|--------|-----------|-------------------------|
 | [Llama 3 Tulu 2 8b UF RM](https://huggingface.co/allenai/llama-3-tulu-2-8b-uf-mean-rm) | 66.3 | 96.6 | 59.4 | 61.4 | 80.7 | |
-| **[Llama 3 Tulu 2 70b UF RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm) (this model)** |
+| **[Llama 3 Tulu 2 70b UF RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm) (this model)** | 65.3 | 89.1 | 52.6 | 64.0 | 88.3 | |