nvidia/Llama-3.1-Nemotron-70B-Instruct · Not quite as good as I hoped.

DonaldSeibert

21 days ago

1. It did not successfully answer the question “how many examples of w show up in this sentence.” It appeared to understand the question but got confused on the answer.
2. It also failed on the parity problem. It wrote some good Python code, but it thought ‘1111’ had an even number of ones.
3. It did get a historical question correct that a lot of models get wrong: I asked if Ferdinand Porsche was a Nazi, and it answered correctly that he was. (I believe that because a lot of the information on Porsche official or fan sites is evasive on this topic, the models that ingest the data tend to be similarly evasive.)

odelalleau

NVIDIA org 21 days ago

but it thought ‘1111’ had an even number of ones

Can you please double check this one? This is indeed an even number (4).

DonaldSeibert

21 days ago

Quite right, looks like I reversed the error. It first said there was an odd number of ones, and when I asked it to check, it changed its mind. Same with 1001.
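
For reference, the expected answers for both strings are easy to check by hand. Below is a minimal sketch of a parity check; the helper name `ones_parity` is my own and this is not the code the model wrote, just one plausible way to verify the two cases mentioned here.

```python
def ones_parity(bits: str) -> str:
    """Return 'even' or 'odd' depending on how many '1' characters are in bits."""
    ones = bits.count("1")  # str.count tallies non-overlapping occurrences
    return "even" if ones % 2 == 0 else "odd"


# The two inputs discussed in this thread.
for bits in ("1111", "1001"):
    print(bits, bits.count("1"), ones_parity(bits))
```

Both strings contain an even number of ones (4 and 2), so a correct first answer would have been "even" in each case.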

kronosprime

17 days ago

it answered #1 fine for me:

A fun task!

Let's count the occurrences of the letter "w" in your sentence:

**Sentence:** "how many examples of 'w' show up in this sentence."

**Count:**

1. **how** - 1 "w"
2. **many** - no "w" (just a distraction with an "m")
3. **examples** - no "w"
4. **of 'w'** - 1 "w" (this one is even quoted!)
5. **show** - 1 "w"
6. **up** - no "w"
7. **in this sentence** - no "w"

**Total count of 'w' occurrences:** 3

So, there are **3 examples of 'w'** in your sentence. How's that for meta-counting?
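
The count above can be checked mechanically. This is a minimal sketch, assuming a case-insensitive count of the single character is what the question intends; `count_letter` is a hypothetical helper name, not something from the model or the model card.

```python
def count_letter(text: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in text."""
    return text.lower().count(letter.lower())


sentence = "how many examples of 'w' show up in this sentence."

# Per-word breakdown, mirroring the list in the model's answer.
for word in sentence.split():
    print(f"{word!r}: {count_letter(word, 'w')}")

print("total:", count_letter(sentence, "w"))
```

The total agrees with the model's answer of 3 ("how", the quoted "w", and "show"), and the same holds for the unquoted wording of the original question.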

DonaldSeibert

13 days ago

Sure, it sometimes works! But it didn't work when I tried it multiple times, so it's clearly not exactly reliable on a type of question that it advertised itself as capable of answering.