We propose a retrieval-augmented generation (RAG) architecture that combines hierarchical semantic chunking with graph-based context exclusion to maximize relevant information while minimizing the total volume of retrieved context. The system recursively splits each document into a tree of chunks and, during search, dynamically selects the most suitably sized chunk from each branch by identifying and excluding redundant ancestors and descendants. This yields a higher ratio of relevant to total information, because the retrieved segments are drawn from diverse parts of the document and never overlap or nest.
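
To make the exclusion step concrete, the following is a minimal sketch of one way it could work, not the implementation described here. It assumes each chunk in the tree already carries a relevance score for the query, and it uses a simple greedy rule (take the best-scoring chunk, then exclude its ancestors and descendants) that the text above does not specify. The names `Chunk`, `select_chunks`, and the example scores are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # identity-based equality avoids recursing through parent/children links
class Chunk:
    chunk_id: str
    text: str = ""
    parent: Optional["Chunk"] = field(default=None, repr=False)
    children: list["Chunk"] = field(default_factory=list, repr=False)

    def add_child(self, child: "Chunk") -> "Chunk":
        """Attach `child` below this chunk and return it."""
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node: Chunk) -> set[str]:
    """IDs of every chunk on the path from `node` up to the root."""
    ids = set()
    while node.parent is not None:
        node = node.parent
        ids.add(node.chunk_id)
    return ids


def descendants(node: Chunk) -> set[str]:
    """IDs of every chunk nested inside `node`."""
    ids = set()
    stack = list(node.children)
    while stack:
        current = stack.pop()
        ids.add(current.chunk_id)
        stack.extend(current.children)
    return ids


def select_chunks(scored: list[tuple[Chunk, float]], k: int) -> list[Chunk]:
    """Greedily pick up to k chunks so that no pick contains another (assumed rule)."""
    excluded: set[str] = set()
    selected: list[Chunk] = []
    for chunk, _score in sorted(scored, key=lambda pair: pair[1], reverse=True):
        if len(selected) == k:
            break
        if chunk.chunk_id in excluded:
            continue  # an ancestor or descendant of this chunk was already chosen
        selected.append(chunk)
        excluded |= ancestors(chunk) | descendants(chunk)
    return selected


# Hypothetical three-level tree: document -> sections -> paragraphs.
doc = Chunk("doc")
s1 = doc.add_child(Chunk("s1"))
s2 = doc.add_child(Chunk("s2"))
p1 = s1.add_child(Chunk("s1.p1"))
p2 = s1.add_child(Chunk("s1.p2"))

# Made-up relevance scores, standing in for query-to-chunk similarity.
scored = [(doc, 0.5), (s1, 0.8), (s2, 0.7), (p1, 0.9), (p2, 0.6)]
print([c.chunk_id for c in select_chunks(scored, k=3)])
```

In this toy example the paragraph `s1.p1` scores highest, so its parent section `s1` and the root `doc` are excluded even though `s1` has the second-highest score. The remaining picks come from other branches, which is the non-overlapping, diverse selection described above.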