Chapter 4 · 2026
From RAG to Agentic RAG for Faithful Islamic Question Answering
Gagan Bhatia, Hamdy Mubarak, Mustafa Jarrar
Abstract
LLMs are increasingly used for Islamic question answering, where ungrounded responses may carry serious religious consequences. Yet standard MCQ/MRC-style evaluations do not capture key real-world failure modes, notably free-form hallucinations and whether models appropriately abstain when evidence is lacking. To shed light on this gap, we introduce ISLAMICFAITHQA, a 3,810-item bilingual (Arabic/English) generative benchmark with atomic single-gold answers, which enables direct measurement of hallucination and abstention.
Topics
Agentic RAG, Islamic Question Answering, LLMs, Hallucination, Benchmarking
Relevance Scores
Long-Horizon Score: 65
Enterprise Score: 60
Completeness: 75
Paper Info
Year: 2026
Venue
Type
Chapter: Ch. 4
Authors: 3