Chenyang Song

Raincleared

Raincleared's activity

upvoted a paper 5 days ago

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Paper • 2411.02335 • Published 5 days ago • 9

upvoted a paper 2 months ago

Configurable Foundation Models: Building LLMs from a Modular Perspective

Paper • 2409.02877 • Published Sep 4 • 27

upvoted 3 papers 5 months ago

Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models

Paper • 2406.15718 • Published Jun 22 • 14

Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters

Paper • 2406.05955 • Published Jun 10 • 22

PowerInfer-2: Fast Large Language Model Inference on a Smartphone

Paper • 2406.06282 • Published Jun 10 • 36

upvoted a collection 5 months ago

MiniCPM

The MiniCPM family of LLMs and VLLMs. • 31 items • Updated 19 days ago • 54

upvoted a collection 7 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 681

upvoted 2 papers 9 months ago

In deep reinforcement learning, a pruned network is a good network

Paper • 2402.12479 • Published Feb 19 • 17

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21 • 111

upvoted 4 papers 10 months ago

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26 • 68

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11 • 42

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8 • 157

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

Paper • 2401.00448 • Published Dec 31, 2023 • 28

upvoted a paper 11 months ago

Beyond Surface: Probing LLaMA Across Scales and Layers

Paper • 2312.04333 • Published Dec 7, 2023 • 18