Passage, Sentence, or Proposition? An Empirical Comparison of Retrieval Granularity Effects on LLM Answer Accuracy in Retrieval-Augmented Generation

Xu Wang; Xuanyi Fu; Danbing Zou

doi:10.66372/JGER.v3i1.6

Authors

Xu Wang, Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
Xuanyi Fu, M.S.E. in Computer Science, Johns Hopkins University, MD, USA
Danbing Zou, Computer Science and Technology, Wuhan University, Wuhan, China

DOI:

https://doi.org/10.66372/JGER.v3i1.6

Keywords:

retrieval-augmented generation, retrieval granularity, open-domain question answering, large language models

Abstract

Retrieval-Augmented Generation (RAG) has become a dominant paradigm for grounding large language model (LLM) outputs in external knowledge. While extensive research has focused on retriever architectures and generation strategies, the choice of retrieval granularity—the textual unit indexed and retrieved—remains insufficiently studied. This paper presents a controlled empirical comparison of four retrieval granularity levels: document, passage (100-word window), sentence, and proposition. Experiments are conducted across three open-domain question answering benchmarks (Natural Questions, TriviaQA, and HotpotQA) using two representative dense retrievers (DPR and Contriever) paired with LLaMA-2-7B-Chat as the reader. Results indicate that finer-grained retrieval units consistently improve retrieval recall, with proposition-level indexing achieving up to 6.8 absolute points higher Recall@20 than passage-level on Natural Questions under DPR. End-to-end answer accuracy follows a similar trend for single-hop factoid questions, where proposition-level retrieval yields the highest Exact Match scores. On multi-hop questions in HotpotQA, this advantage diminishes and passage-level retrieval produces comparable or slightly superior accuracy, suggesting that broader contextual units are beneficial when reasoning across multiple evidence pieces. These findings provide practical guidance for RAG pipeline design: retrieval granularity should be selected in accordance with question complexity, and no single granularity level dominates across all conditions.
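
To make the compared units and metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It shows how a document might be split into 100-word passages and into sentences, and how Recall@k and Exact Match are commonly scored in open-domain QA. All function names are hypothetical. The sentence splitter is a naive regular expression, and the answer-string containment criterion for recall and the normalization steps are assumptions based on common practice, since the abstract does not specify them. Proposition-level splitting typically requires a learned model and is not sketched here.

```python
import re
import string


def split_passages(text, window=100):
    """Split text into non-overlapping windows of `window` words (assumed scheme)."""
    words = text.split()
    return [" ".join(words[i:i + window]) for i in range(0, len(words), window)]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def normalize(s):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def recall_at_k(retrieved, answers, k=20):
    """Return 1.0 if any of the top-k units contains a gold answer string, else 0.0."""
    top = [normalize(unit) for unit in retrieved[:k]]
    golds = [normalize(a) for a in answers]
    return float(any(g in unit for unit in top for g in golds))


def exact_match(prediction, answers):
    """Return 1.0 if the normalized prediction equals any normalized gold answer."""
    return float(any(normalize(prediction) == normalize(a) for a in answers))
```

Averaging `recall_at_k` and `exact_match` over all questions in a benchmark gives the dataset-level scores; a gap such as the reported 6.8-point Recall@20 difference would be the difference between two such averages computed over indexes built at different granularities.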

Author Biography

Danbing Zou, Computer Science and Technology, Wuhan University, Wuhan, China
