Topic
information retrieval
Fast LLM-Based Semantic Filtering: Unified Framework and Adaptive Two-Phase Method Deliver 1.6–2.0x Speed Gains
A new research paper from Kim, Catheland, and Ailamaki introduces a unified framework and an adaptive two-phase method for LLM-based semantic filtering. The method adaptively composes model-free clustering with online-trained proxies and reuses oracle confidence for multiple purposes. It runs 1.6–2.0x faster than prior cascades while meeting a 90% accuracy target on 95% of queries across three 10K-document corpora.
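For readers unfamiliar with the cascades used as the baseline here, the following is a minimal sketch of a generic proxy-then-oracle cascade. It is not the paper's method: it assumes a cheap proxy that returns a score in [0, 1] and an expensive oracle (standing in for the LLM) that returns a yes/no answer. The names `cascade_filter`, `toy_proxy`, `toy_oracle`, and the thresholds `low` and `high` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple


def cascade_filter(
    docs: List[str],
    proxy_score: Callable[[str], float],
    oracle: Callable[[str], bool],
    low: float = 0.2,
    high: float = 0.8,
) -> Tuple[List[bool], int]:
    """Return keep/drop decisions per document and the number of oracle calls.

    Documents the proxy is confident about are decided cheaply; only the
    uncertain middle band is sent to the expensive oracle.
    """
    decisions: List[bool] = []
    oracle_calls = 0
    for doc in docs:
        score = proxy_score(doc)
        if score >= high:
            decisions.append(True)    # proxy is confident: keep
        elif score <= low:
            decisions.append(False)   # proxy is confident: drop
        else:
            oracle_calls += 1         # uncertain: defer to the oracle
            decisions.append(oracle(doc))
    return decisions, oracle_calls


# Toy stand-ins (assumptions, not real models).
def toy_proxy(doc: str) -> float:
    text = doc.lower()
    return sum(word in text for word in ("retrieval", "ranking")) / 2


def toy_oracle(doc: str) -> bool:
    return "search" in doc.lower()


docs = [
    "Dense retrieval and ranking survey",
    "A recipe for sourdough bread",
    "Neural ranking for web search",
    "Ranking football teams",
]
decisions, oracle_calls = cascade_filter(docs, toy_proxy, toy_oracle)
print(decisions, oracle_calls)
```

In this toy run only the two documents with a mid-range proxy score reach the oracle. The speed of a cascade of this kind depends on how many documents the proxy can settle on its own.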
New Framework Distinguishes Entity Relevance Signals for Improved Document Re-Ranking
A new research paper introduces a framework that distinguishes Conceptual Entity Relevance (CER) from Observable Entity Relevance (OER) and shows that the two agree at near-chance levels. Aligning supervision with OER improves pruning of non-relevant documents by up to 10x and raises open-world Mean Average Precision by 0.051 over BM25, challenging assumptions in entity-aware retrieval.
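As context for the Mean Average Precision figure, here is a minimal sketch of how MAP is commonly computed from ranked relevance judgments. It assumes binary relevance and that every relevant document appears somewhere in each ranked list, so average precision is normalized by the number of relevant documents found. The function names and toy rankings are illustrative and do not come from the paper.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average of precision@k taken at each rank k that holds a relevant document."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of per-query average precision; 0.0 when there are no queries."""
    if not runs:
        return 0.0
    return sum(average_precision(run) for run in runs) / len(runs)


# Two toy queries: relevance flags listed in ranked order.
query_a = [True, False, True]   # AP = (1/1 + 2/3) / 2
query_b = [False, True]         # AP = (1/2) / 1
map_score = mean_average_precision([query_a, query_b])
print(round(map_score, 4))
```

On this scale MAP ranges from 0 to 1, so a reported gain of 0.051 is an absolute difference in that score.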