George Smyrnis
gsmyrnis
gsmyrnis's activity
Update config.json
1
#4 opened 2 months ago by sedrickkeh
TypeError: Couldn't cast array of type
1
#1 opened 2 months ago by shizhediao2
Seems like WARC metadata is missing from this version?
1
#4 opened 3 months ago by yury-zyphra
Missing files
3
#2 opened 5 months ago by pengyuan
Were the documents shuffled before the dataset was split into shards?
3
#5 opened 4 months ago by yury-zyphra
Would you share the 0.28T-token dataset used to achieve the highest scores in the 7B-2x experiment?
2
#6 opened 4 months ago by Mars2050
How many rows are there in the dataset?
1
#4 opened 5 months ago by yury-zyphra
Reproduce the clip score
1
#1 opened about 1 year ago by zhangjc404