- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 99
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 38
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 35

Collections including paper arxiv:2311.06158

- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  Paper • 2309.12307 • Published • 87
- NEFTune: Noisy Embeddings Improve Instruction Finetuning
  Paper • 2310.05914 • Published • 14
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 56
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 26

- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 40
- Qwen Technical Report
  Paper • 2309.16609 • Published • 34
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 5
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 45

- Language Models can be Logical Solvers
  Paper • 2311.06158 • Published • 18
- SymbolicAI: A framework for logic-based approaches combining generative models and solvers
  Paper • 2402.00854 • Published • 19
- Grandmaster-Level Chess Without Search
  Paper • 2402.04494 • Published • 67
- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
  Paper • 2402.14083 • Published • 47

- Language Models can be Logical Solvers
  Paper • 2311.06158 • Published • 18
- Fusion-Eval: Integrating Evaluators with LLMs
  Paper • 2311.09204 • Published • 5
- Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation
  Paper • 2311.08877 • Published • 6
- Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"
  Paper • 2311.07587 • Published • 3

- Ziya2: Data-centric Learning is All LLMs Need
  Paper • 2311.03301 • Published • 16
- Co-training and Co-distillation for Quality Improvement and Compression of Language Models
  Paper • 2311.02849 • Published • 3
- MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning
  Paper • 2311.02303 • Published • 4
- ADaPT: As-Needed Decomposition and Planning with Language Models
  Paper • 2311.05772 • Published • 10

- Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
  Paper • 2310.18628 • Published • 7
- TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
  Paper • 2310.19019 • Published • 9
- Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
  Paper • 2311.02262 • Published • 10
- Thread of Thought Unraveling Chaotic Contexts
  Paper • 2311.08734 • Published • 6