1-800-BAD-CODE committed on
Commit
cad4273
1 Parent(s): b629ea4

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -198,18 +198,18 @@ In these metrics, keep in mind that
198
  ## Test Data and Example Generation
199
  Each test example was generated using the following procedure:
200
 
201
- 1. Concatenate 10 random sentences
202
  2. Lower-case the concatenated sentence
203
  3. Remove all punctuation
204
 
205
  The data is a held-out portion of News Crawl, which has been deduplicated.
206
- 3,000 lines of data per language was used, generating 3,000 unique examples of 10 sentences each.
207
- The last 4 sentences of each example were randomly sampled from the 3,000 and may be duplicated.
208
-
209
- Examples longer than the model's maximum length were truncated.
210
- The number of affected sentences can be estimated from the "full stop" support: with 3,000
211
- sentences and 10 sentences per example, we expect 30,000 full stop targets total.
212
 
213
  ## Selected Language Evaluation Reports
 
 
214
 
 
215
 
 
198
  ## Test Data and Example Generation
199
  Each test example was generated using the following procedure:
200
 
201
+ 1. Concatenate 11 random sentences (1 + 10 for each sentence in the test set)
202
  2. Lower-case the concatenated sentence
203
  3. Remove all punctuation
204
 
205
  The data is a held-out portion of News Crawl, which has been deduplicated.
206
+ 3,000 lines of data per language were used, generating 3,000 unique examples of 11 sentences each.
207
+ We generate 3,000 examples, where example `i` begins with sentence `i` and is followed by 10 random
208
+ sentences selected from the 3,000 sentence test set.
209
 
210
  ## Selected Language Evaluation Reports
211
+ For now, metrics for a few selected languages are shown below.
212
+ Given the amount of work required to collect pretty metrics in 47 languages, I'll add more eventually.
213
 
214
+ Expand any of the following tabs to see metrics for that language.
215
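
The updated procedure in the `+` lines above can be sketched roughly as follows. This is a minimal illustration, not the author's actual script: the function name `make_examples`, the `num_extra` and `seed` parameters, sampling with replacement, and the use of ASCII-only `string.punctuation` are all assumptions made for the example. Real multilingual data would need a broader punctuation set.

```python
import random
import string


def make_examples(sentences, num_extra=10, seed=0):
    """Build one test example per sentence (hypothetical helper).

    Example i begins with sentence i and is followed by `num_extra`
    sentences drawn at random from the same test set.
    """
    rng = random.Random(seed)
    # Assumption: only ASCII punctuation is stripped in this sketch.
    strip_punct = str.maketrans("", "", string.punctuation)
    examples = []
    for first in sentences:
        # Assumption: the extra sentences are sampled with replacement.
        extra = [rng.choice(sentences) for _ in range(num_extra)]
        # Step 1: concatenate 1 + num_extra sentences.
        text = " ".join([first] + extra)
        # Steps 2 and 3: lower-case, then remove punctuation.
        examples.append(text.lower().translate(strip_punct))
    return examples
```

With a 3,000-sentence test set and the default `num_extra=10`, this yields 3,000 examples of 11 sentences each, or 33,000 sentences in total.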