Haihao Shen

Haihao

https://github.com/intel/auto-round

AI & ML interests

LLM quantization, sparsity, and acceleration

Articles

Building Cost-Efficient Enterprise RAG applications with Intel Gaudi 2 and Intel Xeon

Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding

Haihao's activity

commented a paper 9 months ago

Efficient Post-training Quantization with FP8 Formats

Paper • 2309.14592 • Published Sep 26, 2023 • 10

commented a paper 12 months ago

Effective Quantization for Diffusion Models on CPUs

Paper • 2311.16133 • Published Nov 2, 2023 • 4

New activity in Intel/neural-chat-7b-v3-1 12 months ago

Prompt Template?

#1 opened 12 months ago by

commented a paper about 1 year ago

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9

New activity in open-llm-leaderboard/open_llm_leaderboard over 1 year ago

Evaluate fine-tuned MPT-7B-Chat model

#98 opened over 1 year ago by