# AI Infra Learning Community

## Docs

- [Chapter 1: Introduction and AI System Overview](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch01.md)
- [Chapter 2: AI System Hardware Overview](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch02.md)
- [Chapter 3: OS, Docker, and Kubernetes Tuning for GPU Environments](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch03.md)
- [Chapter 4: Tuning Distributed Network Communication](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch04.md)
- [Chapter 5: GPU-Based Storage I/O Optimization](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch05.md)
- [Chapter 6: GPU Architecture, CUDA Programming, and Maximizing Occupancy](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch06.md)
- [Chapter 7: Profiling and Tuning GPU Memory Access Patterns](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch07.md)
- [Chapter 8: Occupancy Tuning, Warp Efficiency, and Instruction-Level Parallelism](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch08.md)
- [Chapter 9: Increasing CUDA Kernel Efficiency and Arithmetic Intensity](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch09.md)
- [Chapter 10: Intra-Kernel Pipelining, Warp Specialization, and Cooperative Thread Block Clusters](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch10.md)
- [Chapter 11: Inter-Kernel Pipelining, Synchronization, and CUDA Stream-Ordered Memory Allocation](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch11.md)
- [Chapter 12: Dynamic Scheduling, CUDA Graphs, and Device-Initiated Kernel Orchestration](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch12.md)
- [Chapter 13: Profiling, Tuning, and Scaling PyTorch](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch13.md)
- [Chapter 14: PyTorch Compiler, OpenAI Triton, and XLA Backends](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch14.md)
- [Chapter 15: Multi-Node Inference, Parallelism, Decoding, and Routing Optimization](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch15.md)
- [Chapter 16: Profiling, Debugging, and Tuning Inference at Scale](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch16.md)
- [Chapter 17: Scaling Disaggregated Prefill and Decode for Inference](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch17.md)
- [Chapter 18: Advanced Prefill-Decode and KV Cache Tuning](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch18.md)
- [Chapter 19: Dynamic and Adaptive Inference Engine Optimization](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch19.md)
- [Chapter 20: AI-Assisted Performance Optimization and Scaling to Million-GPU Clusters](https://se7en.mintlify.app/books/ai-systems-performance-engineering/chapters/ch20.md)
- [Book Overview](https://se7en.mintlify.app/books/ai-systems-performance-engineering/index.md)
- [Reading Plan](https://se7en.mintlify.app/books/ai-systems-performance-engineering/schedule.md)
- [AI Infra Learning Community](https://se7en.mintlify.app/index.md): Online talks, reading groups, and learning resources for exploring the AI Infra technology stack together
- [Blogs](https://se7en.mintlify.app/resources/blogs.md): Recommended blogs on AI Infra
- [Books](https://se7en.mintlify.app/resources/books.md): Recommended books on AI Infra
- [Collections](https://se7en.mintlify.app/resources/collections.md): Resource collections and Awesome Lists on AI Infra
- [Courses](https://se7en.mintlify.app/resources/courses.md): Systematic courses on AI Infra
- [Learning Resources](https://se7en.mintlify.app/resources/index.md): High-quality learning resources on AI Infra
- [Papers](https://se7en.mintlify.app/resources/papers.md): Core papers on AI Infra
- [Videos](https://se7en.mintlify.app/resources/videos.md): Recommended video resources on AI Infra
- [01 - vLLM Quickstart](https://se7en.mintlify.app/talks/01-vllm-quickstart.md): An overview of the LLM landscape and a vLLM quickstart
- [02 - PagedAttention](https://se7en.mintlify.app/talks/02-pagedattention.md): How PagedAttention, vLLM's core technique, works
- [03 - Prefix Caching](https://se7en.mintlify.app/talks/03-prefix-caching.md): Efficient reuse of the KV Cache across requests
- [04 - Speculative Decoding](https://se7en.mintlify.app/talks/04-speculative-decoding.md): A detailed look at Speculative Decoding
- [05 - Chunked Prefills](https://se7en.mintlify.app/talks/05-chunked-prefills.md): A detailed look at the Chunked Prefills mechanism
- [06 - PD Disaggregation](https://se7en.mintlify.app/talks/06-disaggregating-prefill-and-decoding.md): A detailed look at the disaggregated prefill-decode (PD) inference architecture
- [07 - Inference Platform Landscape](https://se7en.mintlify.app/talks/07-inference-platform.md): An introduction to open-source inference platform projects
- [Online Talks](https://se7en.mintlify.app/talks/index.md)