Update README.md
README.md CHANGED

```diff
@@ -6,7 +6,7 @@ Our best attempt at reproducing [RankT5 Enc-Softmax](https://arxiv.org/pdf/2210.
 
 1. We use a SPLADE first stage for the negatives vs GTR in the paper
 2. We train using PyTorch vs Flax in the paper
-3.
+3. ~~We use the original t5-3b vs Flan T5-3b in the paper~~ -> Actually the paper also uses t5-3b
 4. The head is not exactly the same: here we add Linear->LayerNorm->Linear and make a mistake by not including a nonlinearity, whereas the original paper uses just a dense layer. Fixing this should improve our performance, since we currently add extra layers without actually using them correctly.
 
 This leads to what seems to be slightly worse performance (42.8 vs 43.? in the paper), and it appears slightly worse on BEIR as well.
```
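The head difference in point 4 can be sketched in PyTorch. This is a minimal illustration, not the repository's actual code: the hidden size, pooling, and module names are assumptions. The point is that without a nonlinearity between the two linear layers, the stack is just an affine map, so the extra layer adds capacity that is never really used.

```python
import torch
import torch.nn as nn

hidden = 1024  # assumed encoder hidden size; the real value depends on the T5 variant

# Head as described here: Linear -> LayerNorm -> Linear with no nonlinearity,
# so the two linear layers collapse to a single affine transformation.
head_as_implemented = nn.Sequential(
    nn.Linear(hidden, hidden),
    nn.LayerNorm(hidden),
    nn.Linear(hidden, 1),
)

# The fix suggested in the README: add a nonlinearity between the layers
# so the extra layer can actually contribute.
head_with_nonlinearity = nn.Sequential(
    nn.Linear(hidden, hidden),
    nn.LayerNorm(hidden),
    nn.ReLU(),
    nn.Linear(hidden, 1),
)

# The original RankT5 paper instead projects with a single dense layer.
head_paper = nn.Linear(hidden, 1)

# e.g. pooled encoder outputs for two query-document pairs
x = torch.randn(2, hidden)
print(head_as_implemented(x).shape)  # torch.Size([2, 1])
```

All three heads map a pooled encoder representation to one relevance logit per query-document pair; they differ only in how much of the added depth is actually usable.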