Generating personalized reading material that matches both a user's topic interest and their desired complexity level remains a challenge for content recommendation systems. A new research paper from Sooyeon Kim and Piotr S. Maciąg, posted on arXiv, presents a system that combines Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs) to address this problem. The authors report that grounding LLM output in web retrieval measurably improves the relevance and groundedness of the generated reading passages.
System Architecture: Four Modules for Personalized Content
The proposed system is built around four modules: Input, RAG, Generation, and Judging. Users provide a question and specify a target reading complexity level. The RAG module retrieves relevant information from the Internet to enrich and ground the content produced by three LLMs: Meta LLaMA 4 Scout, LLaMA 3.1 8B Instant, and Google Gemma2 9B. The Generation module employs three prompting strategies (Chain-of-Thought, zero-shot, and few-shot) to create reading materials. Finally, an LLM-as-a-Judge module automatically evaluates answer quality and alignment with the desired readability level.
| Module | Function |
|---|---|
| Input | Accepts user query and target complexity |
| RAG | Retrieves relevant information from the Internet |
| Generation | Uses LLMs with prompting strategies to produce content |
| Judging | LLM-as-a-Judge evaluates quality and readability alignment |
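
The flow through the four modules can be sketched in a few lines of Python. This is only an illustration of the data flow described above, not the authors' implementation: every name here (`ReadingRequest`, `run_pipeline`, the stub functions, the score keys, the level labels) is hypothetical, and the retriever, generator, and judge are offline stubs standing in for web search and real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interfaces; the paper does not publish these signatures.
Retriever = Callable[[str], List[str]]          # question -> text snippets
Generator = Callable[[str], str]                # prompt -> reading passage
Judge = Callable[[str, str, str], Dict[str, int]]  # (question, level, passage) -> scores


@dataclass
class ReadingRequest:
    """Input module: the user's question plus a target complexity level."""
    question: str
    level: str  # assumed labels such as "beginner" or "advanced"


def build_prompt(request: ReadingRequest, snippets: List[str]) -> str:
    """Put the retrieved snippets in front of the task (zero-shot style)."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        f"Context:\n{context}\n\n"
        f"Write a reading passage answering: {request.question}\n"
        f"Target reading level: {request.level}"
    )


def run_pipeline(request: ReadingRequest, retrieve: Retriever,
                 generate: Generator, judge: Judge) -> dict:
    snippets = retrieve(request.question)                      # RAG module
    passage = generate(build_prompt(request, snippets))        # Generation module
    scores = judge(request.question, request.level, passage)   # Judging module
    return {"passage": passage, "scores": scores, "sources": snippets}


# Offline stubs so the sketch runs without network access or API keys.
def fake_retrieve(question: str) -> List[str]:
    return [f"Stub fact about {question}."]


def fake_generate(prompt: str) -> str:
    # Echo the first context line to show that the output uses retrieved text.
    return "Stub passage based on: " + prompt.splitlines()[1]


def fake_judge(question: str, level: str, passage: str) -> Dict[str, int]:
    return {"relevance": 1, "groundedness": 1, "level_match": 1}


result = run_pipeline(
    ReadingRequest("photosynthesis", "beginner"),
    fake_retrieve, fake_generate, fake_judge,
)
print(result["passage"])
```

Passing the three modules in as plain callables keeps the sketch independent of any particular search API or model provider, which the article does not name.
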
Experimental Results: RAG Drives Measurable Improvements
According to the arXiv paper, RAG consistently improved system performance across all models and prompting techniques. The gains showed up in relevance and, most of all, in groundedness, which the paper reports rose by 26 to 35 percentage points. A percentage-point gain is an absolute difference between two rates, not a relative change. The authors conclude that the RAG-augmented architecture effectively produces reading content tailored to user queries and desired textual complexity.
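| Model | Prompting Strategies | Reported Groundedness Gain with RAG |
|---|---|---|
| Meta LLaMA 4 Scout | Zero-shot, few-shot, Chain-of-Thought | +26 to 35 pp (overall range) |
| LLaMA 3.1 8B Instant | Zero-shot, few-shot, Chain-of-Thought | +26 to 35 pp (overall range) |
| Google Gemma2 9B | Zero-shot, few-shot, Chain-of-Thought | +26 to 35 pp (overall range) |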
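
To make the unit concrete, the short example below computes a percentage-point gain from judge verdicts. It assumes, purely for illustration, that groundedness is scored as a binary verdict per passage; the paper's actual scoring scheme may differ, and the verdict lists are made-up toy values, not results from the study.

```python
def rate(verdicts):
    """Share of passages marked as grounded, in percent."""
    return 100.0 * sum(verdicts) / len(verdicts)


# Toy verdicts (1 = grounded, 0 = not grounded). Illustrative only,
# NOT data from the paper.
without_rag = [1, 0, 0, 1, 0, 1, 0, 1, 0, 1]  # 5 of 10 grounded
with_rag = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]     # 8 of 10 grounded

gain_pp = rate(with_rag) - rate(without_rag)       # absolute difference
relative = 100.0 * gain_pp / rate(without_rag)     # relative change

print(f"{gain_pp:.0f} pp, {relative:.0f}% relative")  # prints "30 pp, 60% relative"
```

In this toy case a 30 percentage-point gain corresponds to a 60% relative improvement, which is why the two units should not be used interchangeably when comparing systems.
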
Implications for Enterprise Content Systems
For technology leaders evaluating AI-driven personalization, this research demonstrates a practical architecture that combines retrieval and generation. The use of a Judging module to automatically verify content alignment with readability targets offers a path toward automated quality assurance in content generation. While the study focuses on reading recommendations, the same architecture — RAG with LLMs and a quality checker — could be adapted for other domains such as technical documentation, training materials, or compliance communications.
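
One way such a quality checker could act as a gate in a content workflow is sketched below. It is a hypothetical pattern, not something the paper describes: the JSON reply format, the `grounded` and `level_match` keys, the retry limit, and the function names are all assumptions, and the generator and judge are stubs rather than real model calls.

```python
import json

FAILED = {"grounded": False, "level_match": False}


def parse_verdict(raw: str) -> dict:
    """Parse the judge's reply; treat malformed output as a failed check."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return dict(FAILED)
    if not isinstance(data, dict):
        return dict(FAILED)
    return {
        "grounded": bool(data.get("grounded")),
        "level_match": bool(data.get("level_match")),
    }


def generate_with_gate(generate, judge, prompt, max_attempts=3):
    """Regenerate until the judge accepts the passage or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        passage = generate(prompt)
        verdict = parse_verdict(judge(passage))
        if all(verdict.values()):
            return passage, attempt
    return None, max_attempts  # caller decides how to handle rejection


# Stub judge: first reply is malformed, second reply accepts the passage.
replies = iter(["not json", '{"grounded": true, "level_match": true}'])
passage, attempts = generate_with_gate(
    lambda prompt: "draft passage",
    lambda text: next(replies),
    "prompt text",
)
print(passage, attempts)  # prints "draft passage 2"
```

Treating unparseable judge output as a failure is a deliberately conservative choice: content is only released when the checker explicitly approves it.
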
As LLMs like Meta LLaMA and Google Gemma become more accessible, the ability to ground their output in real-time retrieved data becomes critical for enterprise adoption where accuracy and relevance are paramount. The 26–35 percentage point improvement in groundedness reported in the paper underscores the value of integrating retrieval mechanisms before generation.
Technical Stack and Open Questions
The system uses openly available LLMs and standard RAG techniques. The paper does not specify a particular retrieval database or vector store, but it notes that RAG retrieves information from the Internet. The three prompting strategies — Chain-of-Thought, zero-shot, and few-shot — are widely used in the LLM community. The LLM-as-a-Judge module automatically scores outputs, reducing the need for human evaluation in development cycles.
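
For readers less familiar with the three prompting strategies, the sketch below shows how each might wrap the same retrieved context and request. The wording of the templates, the function names, and the example passage are assumptions made for illustration; the prompts used in the paper are not reproduced in this article.

```python
# Shared part of every prompt: retrieved context plus the user's request.
BASE = (
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Target reading level: {level}\n"
)


def zero_shot(question, level, context):
    """Task description only, with no examples."""
    base = BASE.format(context=context, question=question, level=level)
    return base + "Write the reading passage."


def few_shot(question, level, context, examples):
    """Prepend (level, passage) example pairs before the task."""
    shots = "\n\n".join(f"Example ({lvl}):\n{text}" for lvl, text in examples)
    base = BASE.format(context=context, question=question, level=level)
    return shots + "\n\n" + base + "Write the reading passage in the style of the examples."


def chain_of_thought(question, level, context):
    """Ask the model to reason in steps before writing."""
    base = BASE.format(context=context, question=question, level=level)
    return base + (
        "Think step by step: first list the key facts from the context, "
        "then choose vocabulary and sentence length suited to the level, "
        "then write the reading passage."
    )


context = "- Plants convert light into chemical energy."
examples = [("beginner", "Plants make food from sunlight.")]  # hypothetical example

for prompt in (
    zero_shot("How does photosynthesis work?", "beginner", context),
    few_shot("How does photosynthesis work?", "beginner", context, examples),
    chain_of_thought("How does photosynthesis work?", "beginner", context),
):
    print(prompt, end="\n\n---\n\n")
```

All three variants share the same grounded context, so any difference in output quality can be attributed to the prompting strategy rather than to what was retrieved.
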
For enterprise buyers, key considerations include the latency of web retrieval, the cost of running multiple LLMs, and the accuracy of the Judge module. The paper does not provide latency or cost figures. Nonetheless, the architecture offers a template for building content recommendation systems that adapt to both topic and complexity preferences — a capability sought after in e-learning, knowledge management, and customer-facing support portals.