Chapter 4 · 2026
ARKV: Adaptive and Resource-Efficient KV Cache Management under Limited Memory Budget for Long-Context Inference in LLMs
J Lei, S Ilager
Abstract
This paper presents ARKV, an adaptive and resource-efficient KV cache management framework for LLM inference under limited memory budgets. It aims to reduce memory usage while maintaining high throughput on long context windows.
Topics
KV cache management, resource-efficient, long-context inference
Relevance Scores
Long-Horizon Score: 65
Enterprise Score: 60
Completeness: 75
Paper Info
Year: 2026
Venue
Type
Chapter: Ch. 4
Authors: 2
Zone III Analysis
Related Papers
HuggingGPT: Solving AI Tasks with ChatGPT and its Frien…
2023 · Ch.4
AutoGen: Enabling Next-Gen LLM Applications via Multi-A…
2023 · Ch.4
MetaGPT: Meta Programming for A Multi-Agent Collaborati…
2023 · Ch.4
Communicative Agents for Software Development
2023 · Ch.4