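Dai Jinquan (Jason Dai), Intel Fellow

Autoregressive large language models (based on the Transformer decoder architecture)
▪ Transformer decoder architecture
▪ Inference (next token / decode)

Bottlenecks in LLM inference and training
▪ Memory bandwidth
▪ Compute
▪ GPU memory capacity

Heterogeneous computing and acceleration for large models
▪ XPU heterogeneous computing
• Hardware acceleration on CPU, GPU, and NPU
▪ Low-bit computation
• Model quantization/compression (WxAy)
• Data types (INTx, FPx)
• Low-bit operators
• GPU memory usage (e.g. KV cache)
• Training and fine-tuning (e.g. QLoRA)

Accuracy of low-bit large models: perplexity (Wikitext dataset)

Heterogeneous computing and acceleration for large models
▪ Inference algorithm optimizations
• Self-speculative decoding
• KV cache compression
• Sliding window attention
• Sparse attention
• Flash attention/decoding
• Continuous batching
• Prefill/decoding disaggregation

IPEX-LLM: an open-source XPU acceleration framework for large models
▪ Users/Developers
▪ Python (PyTorch) ecosystem: HuggingFace, LangChain, LlamaIndex, DeepSpeed, TRL, Axolotl, …
▪ llama.cpp ecosystem: llama.cpp, Ollama, LangChain.js, OpenWebUI, …
▪ IPEX-LLM Library

LLM acceleration on Intel XPUs
▪ Intel UHD/Iris iGPU: llama.cpp + IPEX-LLM (Phi-3-mini, Q4_0)
▪ Intel Core Ultra AI PC: Ollama + IPEX-LLM (Mistral-7B, Q4_K_M)
▪ Intel Arc A770 GPU: Text-Generation-WebUI + IPEX-LLM (Llama3-8B, FP8)
▪ 4 x Arc A770 GPU: FastChat + IPEX-LLM (Qwen1.5-72B, FP6)
▪ LoRA/QLoRA on Xeon + multi-Arc: supports PEFT, TRL, Axolotl, ZeRO-2/ZeRO-3

LLM application innovation on Intel XPUs
▪ Office assistant: demo by ExtendOffice
▪ Code generation for industrial robots: demo by Kedong Software (科东软件)
▪ AI cockpit, automotive assistant: demo by Zhipu AI (智谱AI)
▪ AI cockpit, driving companion: demo by Baichuan (百川智能)
▪ Local RAG systems for individuals or enterprises
• Running GraphRAG on Intel XPUs (https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/graphrag_quickstart.md)
• Running RAGFlow on Intel XPUs (https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ragflow_quickstart.md)

Call to Action
• Follow and try IPEX-LLM, and send us your feedback
• https://github.com/intel-analytics/ipex-llm/
• Use IPEX-LLM to develop large models and their applications on Intel XPU platforms
• Client, edge, and server (Intel Core Ultra AI PC, AI cockpit, Xeon + Intel Arc GPUs)
• Innovation in efficient XPU acceleration for large models
• Innovation in LLM application scenarios

Thank you!

Notices & Disclaimers
Performance varies by use, configuration and other factors. Learn more on the Performance Index site. Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure. Your costs and results may vary. Intel technologies may require enabled hardware, software or service activation.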
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
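
Supplementary sketch: the deck names memory bandwidth as a bottleneck and lists low-bit weight quantization (WxAy, INTx) as one remedy. The following is a minimal illustration of symmetric group-wise 4-bit weight quantization in plain Python. It is not IPEX-LLM's or llama.cpp's actual kernel: the function names, the group size of 32, and the 2-byte scale per group are assumptions chosen for illustration only.

```python
import random

def quantize_int4(weights, group_size=32):
    """Symmetric group-wise 4-bit quantization (illustrative, hypothetical helper).

    Each group of weights shares one float scale; values map to integers in [-8, 7].
    """
    qvals, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to 7; avoid a zero scale.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        qvals.extend(max(-8, min(7, round(w / scale))) for w in group)
    return qvals, scales

def dequantize_int4(qvals, scales, group_size=32):
    """Recover approximate float weights from 4-bit integers and group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(qvals)]

def packed_bytes(n_weights, group_size=32, scale_bytes=2):
    """Bytes needed if two 4-bit values share a byte, plus one scale per group."""
    n_groups = -(-n_weights // group_size)  # ceiling division
    return (n_weights + 1) // 2 + n_groups * scale_bytes

random.seed(0)
w = [random.gauss(0.0, 0.02) for _ in range(4096)]  # synthetic weights
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s)

max_err = max(abs(a - b) for a, b in zip(w, w_hat))
fp16_bytes = 2 * len(w)            # 16-bit baseline storage
int4_bytes = packed_bytes(len(w))  # 4-bit values plus per-group scales

print(f"max abs reconstruction error: {max_err:.6f}")
print(f"FP16 bytes: {fp16_bytes}, INT4 bytes: {int4_bytes}")
print(f"compression ratio: {fp16_bytes / int4_bytes:.2f}x")
```

Because decode-time inference reads every weight once per generated token, shrinking the stored weights reduces the bytes moved per token by roughly the same ratio, at the cost of the rounding error measured above.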