Llama-3.1-Nemotron-70B Collection: SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper • 2408.12528 • Published Aug 22
Jamba-1.5 Collection: The AI21 Jamba family comprises state-of-the-art hybrid SSM-Transformer instruction-following foundation models. • 2 items • Updated Aug 22
Llama-3.1 Quantization Collection: Llama-3.1 models quantized by Neural Magic. • 21 items • Updated Sep 26
Article: ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models. By yuchenlin • Jul 27
Llama 3.1 Collection: This collection hosts the transformers-format and original repos of the Llama 3.1, Llama Guard 3, and Prompt Guard models. • 11 items • Updated Sep 25
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published Jun 24
SSMs Collection: Mamba-2-based research models with 8B parameters, trained on 3.5T tokens, for comparison with Transformers. • 5 items • Updated Oct 1
Nemotron 4 340B Collection: Nemotron-4 open models for Synthetic Data Generation (SDG); includes Base, Instruct, and Reward models. • 4 items
Qwen2 Collection: Qwen2 language models, including pretrained and instruction-tuned variants in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Sep 18
Llama3-ChatQA-1.5 Collection: Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated Oct 1
Arctic Collection: A collection of pre-trained dense-MoE hybrid Transformer models. • 2 items • Updated Apr 24