Cached Transformers: Improving Transformers with Differentiable Memory Cache
Paper • 2312.12742 • Published • 12
ProTIP: Progressive Tool Retrieval Improves Planning
Paper • 2312.10332 • Published • 7
Paloma: A Benchmark for Evaluating Language Model Fit
Paper • 2312.10523 • Published • 12
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Paper • 2406.17557 • Published • 86
daje kang
daje
Collections 1
Models 17
daje/llama3.1-8B-naver_news-summary-llamafactory • Updated • 4
daje/code-llama-7b-text-to-sql • Updated
daje/chapter5_code-llama3-8B-text-to-sql-ver0.1 • Updated
daje/chapter5_psychological_chatbots • Updated
daje/20240830_model • Updated
daje/meta-llama3.1-8B-qna-koalpaca-v1.1 • Text Generation • Updated • 12
daje/model_output • Updated
daje/chinese_results_20240729_021938 • Updated
daje/code-llama3-8B-text-to-sql-ver0.1 • Text Generation • Updated • 9
daje/code-llama3-8B-text-to-sql • Updated
Datasets 9
daje/Ko-SciecneQA • Viewer • Updated • 12.7k • 10
daje/keyword_summary • Viewer • Updated • 1k
daje/kotext-to-sql-v1 • Viewer • Updated • 262k • 45
daje/mistral_tokenized_en_wiki • Viewer • Updated • 16.1M • 94
daje/mistral_tokenized_ko_wiki • Viewer • Updated • 1.7M • 36
daje/tokenized_enwiki • Viewer • Updated • 16.4M • 103
daje/tokenized_kowiki • Viewer • Updated • 1.71M • 42
daje/en_wiki • Viewer • Updated • 5.09M • 442
daje/ko_wiki • Viewer • Updated • 311k • 150 • 5