Topic
large language models
DynaDebate: Dynamic Path Generation Breaks Homogeneity in Multi-Agent AI Debates
A new research paper introduces DynaDebate, a framework that solves the homogeneity problem in multi-agent AI debates by dynamically generating diverse reasoning paths, shifting to step-by-step logic critique, and activating a verification agent to resolve disagreements. Experiments show superior performance across most benchmarks.
Cordyceps: New Data Poisoning Attack Covertly Controls Large Language Models
A new paper on arXiv presents Cordyceps, a data poisoning attack that embeds covert control instructions into large language models through semantic associations. Tested across five LLMs, it achieves an attack success rate of up to 93% after backdoor defenses are applied and 98% after prompt injection defenses, outperforming heuristic methods by 40%.
Haiku to Opus in Just 10 bits: LLMs Unlock Large Compression Gains
A new arXiv paper presents methods for compressing LLM-generated text, achieving over 100x reduction in data transfer compared to prior techniques. Lossless compression via domain-adapted LoRA adapters doubles efficiency, while an interactive Question-Asking protocol recovers up to 72% of the capability gap between small and large models using only 10 binary questions.
How Scale Design Impacts LLM Metacognition and Enterprise AI Reliability
A study on arXiv finds that the confidence scale typically used with LLMs (0-100) leads to heavy discretization, with over 78% of responses falling on just three round numbers. Switching to a 0-20 scale improves metacognitive efficiency. The findings matter for enterprise use of LLMs in supply chain decision-making, where confidence calibration is critical.
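The discretization effect can be illustrated with a short sketch. The confidence values below are invented for illustration and are not data from the study; `top_k_share` is a hypothetical helper that reports what fraction of responses land on the k most common values.

```python
from collections import Counter

def top_k_share(confidences, k=3):
    """Fraction of responses that fall on the k most frequent values."""
    counts = Counter(confidences)
    top = sum(n for _, n in counts.most_common(k))
    return top / len(confidences)

# Hypothetical confidence reports on a 0-100 scale (illustrative only).
reports = [90, 90, 80, 95, 90, 80, 95, 70, 90, 85]
share = top_k_share(reports)  # share of reports on the three most common values
```

A share close to 1.0 means the model is effectively using only a handful of points on the scale, whatever its nominal range.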
CircuitLasso Enables Scalable Interpretability for Large Language Models at Lower Cost
A new approach called CircuitLasso uses sparse linear regression to learn interpretable circuits in large language models. It achieves structural accuracy comparable to intervention-based methods on benchmark data while dramatically reducing computational cost. The method also reveals relationships among sparse autoencoder features, aiding understanding of how semantic features propagate through models.
From Detection to Recovery: Operational Analysis of LLM Pre-training on 504 NVIDIA B200 GPUs
A new paper presents an empirical operational analysis of a 504-GPU NVIDIA B200 cluster used for LLM pre-training. Analyzing 55 days of Prometheus metrics and 73 days of logs across 224 sessions, the study finds that no single metric predicts all GPU failures, that checkpoint I/O saturates NFS bandwidth, that node failures are concentrated on a few systems, and that automated retry chains achieve a 33.3% success rate versus 12.5% for manual retries.
SDS-LoRA: New Low-Rank Adaptation Method Fixes Gradient Distortion in Large Model Fine-Tuning
A new paper on arXiv introduces SDS-LoRA, a low-rank parameterization that overcomes anisotropic gradient scaling in LoRA. By structurally decoupling singular values from the backward pass, SDS-LoRA ensures gradients are only applied through orthonormal bases, improving convergence and reducing the performance gap to full fine-tuning. Experimental results across natural language and vision benchmarks show enhanced adaptation performance.
MA-ProofBench: New Benchmark Tests LLMs on Formal Theorem Proving in Mathematical Analysis
Researchers introduce MA-ProofBench, the first formal theorem-proving benchmark dedicated to mathematical analysis. It contains 200 theorems across six topics at two difficulty levels. Evaluations show that even the best model, GPT-5.5, achieves only 16% Pass@8 on undergraduate-level problems and 5% on Ph.D.-level problems, highlighting significant limitations of current LLMs in formal mathematical reasoning.
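For readers unfamiliar with the metric, Pass@8 is the chance that at least one of 8 sampled proofs is correct. A minimal sketch follows, assuming the widely used unbiased estimator from code-generation evaluation; whether MA-ProofBench computes the metric exactly this way is an assumption, and `pass_at_k` is a name chosen here for illustration.

```python
from math import comb

def pass_at_k(n, c, k):
    """Estimate pass@k from n samples of which c are correct.

    Uses 1 - C(n - c, k) / C(n, k): one minus the probability that
    a random draw of k samples contains no correct one.
    """
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k includes a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With exactly 8 samples per theorem, Pass@8 reduces to whether any of the 8 attempts succeeded; the estimator matters when more samples are drawn than k.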
AuAu Benchmark Audits Authoritarian Alignment in Large Language Models from Four Regions
Researchers introduce AuAu, a benchmark to assess authoritarian alignment in LLMs using psychometric tests, vignettes, and user prompts. Testing 17 models from China, the EU, Russia, and the USA revealed substantial authoritarian response rates and showed that the models are easily manipulated via system prompts.
Graphical-Probabilistic Modeling Brings Rigor to LLM-Native Software Engineering
Current LLM-native software development relies on experimentation and heuristics. A proposed framework called Generation Networks uses graphical probabilistic models to document generative flows and enable design-level reasoning, bringing the rigor of traditional software engineering to LLM systems.
New Unified Definition of AI Hallucination Pins It on Inaccurate World Modeling
A new arXiv paper by Liu et al. proposes a unified definition of hallucination in large language models, defining it as inaccurate internal world modeling observable to the user. The framework subsumes prior definitions and distinguishes true hallucinations from planning or reward errors, and introduces the HalluWorld benchmark for stress-testing models.
LLM Manuscript Scoring System Validated Against Peer-Review Outcomes at Major AI Conference
Researchers validate AIPR, an LLM-based manuscript scoring system, against 300 ICLR submissions. The system achieves an AUROC of 0.82 in separating accepted from rejected papers and shows low score variability, offering a reliable first-pass assessment tool.
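An AUROC of 0.82 can be read as the probability that a randomly chosen accepted paper receives a higher score than a randomly chosen rejected one. The sketch below computes AUROC directly from that pairwise definition; the scores are hypothetical and the function name `auroc` is chosen here, not taken from the AIPR paper.

```python
def auroc(scores_pos, scores_neg):
    """Pairwise AUROC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical manuscript scores (illustrative only).
accepted = [7.5, 6.0, 8.0]
rejected = [5.0, 6.5, 4.0]
value = auroc(accepted, rejected)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.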
New Research Defends LLMs from Extraction Attacks Using 'Knowledge Trap' Honeypot
A research paper by Dai and Dong introduces Knowledge Trap, a defense against large language model extraction attacks. It uses a Honeypot Knowledge Graph to redirect attackers' queries to low-value knowledge, reducing surrogate agreement by 6.2% on average while preserving legitimate user performance.
New Diagnostic for Language-Driven Bandits Determines When Lightweight Models Beat LLMs
A new paper proposes LLMP-UCB, a bandit algorithm that uses repeated LLM inference for uncertainty estimates, but finds that lightweight numerical bandits on text embeddings often match or exceed LLM accuracy at lower cost. The authors also introduce a geometric diagnostic to guide when to use LLMs versus simpler models, offering a cost-performance tradeoff framework for AI decision systems.
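As background, the upper-confidence-bound idea behind this family of algorithms can be sketched with the classic UCB1 rule. This is not LLMP-UCB itself, which derives its uncertainty estimates from repeated LLM inference; the function name and the exploration constant `c` are assumptions made for illustration.

```python
import math

def ucb1_select(counts, means, t, c=2.0):
    """Pick an arm by the classic UCB1 rule.

    counts[i] is how often arm i was pulled, means[i] its average
    reward so far, and t the total number of pulls.
    """
    # Pull any arm that has never been tried first.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Otherwise choose the arm with the highest mean plus exploration bonus.
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

The bonus term shrinks as an arm is pulled more often, so rarely tried arms keep getting explored until their estimates are reliable.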
Self-Consistency Reranking Boosts Accuracy in Narrative Question Answering for Enterprise AI
Researchers propose a self-consistency-based reranking framework for narrative question answering that generates multiple candidates and selects the final answer by semantic agreement. On the NarrativeQA dataset, FLAN-T5-Base improved from 82.32% to 86.66%, and Pegasus-Large jumped from 72.50% to 87.07%. The method requires no architectural changes, making it a drop-in enhancement for enterprise language models.
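The select-by-agreement step can be sketched as follows. Token-overlap (Jaccard) similarity stands in here for the semantic agreement measure, which is an assumption made to keep the example dependency-free; the candidate answers and function names are likewise hypothetical.

```python
def jaccard(a, b):
    """Token-overlap similarity between two answers (stand-in for semantic similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def rerank_by_agreement(candidates):
    """Return the candidate that agrees most with all the other candidates."""
    def agreement(i):
        return sum(
            jaccard(candidates[i], other)
            for j, other in enumerate(candidates)
            if j != i
        )
    best = max(range(len(candidates)), key=agreement)
    return candidates[best]

# Hypothetical sampled answers to one question (illustrative only).
candidates = [
    "the old lighthouse keeper",
    "the lighthouse keeper",
    "a fisherman",
    "the keeper of the lighthouse",
]
final_answer = rerank_by_agreement(candidates)
```

Because the selection only compares generated outputs with one another, it can wrap any model that can sample several answers, which is consistent with the no-architectural-changes claim.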
Metacognitive Myopia in LLMs: New Framework Reveals Hidden Biases with High-Stakes Implications
Researchers propose metacognitive myopia as a cognitive-ecological framework to explain a range of biases in large language models (LLMs), including reinforcement of stereotypes and flawed decision-making. The framework identifies five specific symptoms and suggests technical approximations of metacognitive monitoring and control to mitigate risks. The study raises significant ethical concerns for deploying LLMs in organizational structures and high-stakes domains such as supply chain and trade.
LLM-WikiRace Benchmark Reveals Frontier AI Models Still Struggle with Planning Over Knowledge Graphs
Researchers introduced LLM-WikiRace, a benchmark to evaluate large language models on planning, reasoning, and world knowledge using Wikipedia hyperlinks. Top models like Gemini-3, GPT-5, and Claude Opus 4.5 achieve superhuman performance on easy tasks but drop sharply on hard difficulty, with Gemini-3 succeeding in only 23% of hard games. The study reveals that world knowledge helps only up to a point; beyond that, planning and long-horizon reasoning are the limiting factors.
Deep Residual Injection Method Enables Full-Spectrum Forensic AI Detection in Multimodal Models
Researchers propose Deep Visual Residual MLLM (Deep-VRM), a method that injects low-level artifact signals into multimodal large language models without disrupting pre-trained semantic knowledge. The approach achieves state-of-the-art detection of AI-generated images across multiple benchmarks.
DYNA Framework Uses Temporal Knowledge Graphs to Reduce LLM Forgetting Without Retraining
Researchers propose DYNA, a lightweight framework that connects frozen large language models (LLMs) to a temporal knowledge graph, enabling continuous learning without costly retraining. On three temporal recall tasks, DYNA reduces catastrophic forgetting by ~7% compared to fine-tuning and improves temporal ordering by ~5% over standard retrieval-augmented generation (RAG). The paper also finds that higher graph clustering coefficients correlate with better retrieval, indicating the importance of graph structure.
SPARK Method Activates Latent Security Knowledge in LLMs for Secure Code Generation
SPARK (Security Knowledge Priming and Representation-Guided Knowledge Activation) is a new inference-time method that improves the security of code generated by large language models without requiring retraining. The researchers argue that pretraining data already contains sufficient security material; the bottleneck is activation. Evaluated on 9 open-source and 7 proprietary models, SPARK matches or improves secure code generation baselines while preserving code utility.
New Defense Keeps Attack Success Rate Below 4% for Adaptive Prompt Injection on LLM Agents
Researchers propose RETA, a training-based defense that grounds LLM agent security on user tasks rather than attack patterns. Using chain-of-thought reasoning and red-teaming with diversity reward, RETA keeps average attack success rate below 4% across six adaptive attacks while preserving utility.
Service-Induced Congestion Threatens LLM Serving Throughput, New Model Shows
A new mathematical model from researchers at MIT and elsewhere shows that in large language model serving, persistent GPU memory consumption from key-value caches creates a 'service-induced congestion' effect. Under high concurrency, this can lead to instability and throughput losses as high as 50%. The paper identifies scheduling design principles to avoid these losses.
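To see why key-value caches constrain concurrency, a back-of-the-envelope sketch helps. It assumes a standard transformer that stores one key and one value vector per layer, per KV head, per token; the model dimensions and memory budget below are hypothetical and do not come from the paper.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate KV-cache size for one request.

    The factor 2 accounts for storing both keys and values;
    bytes_per_elem=2 assumes 16-bit precision.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

def max_concurrent_requests(budget_bytes, layers, kv_heads, head_dim, seq_len):
    """How many requests of a given length fit in a fixed memory budget."""
    return budget_bytes // kv_cache_bytes(layers, kv_heads, head_dim, seq_len)

# Hypothetical model: 32 layers, 8 KV heads, head dimension 128.
per_token = kv_cache_bytes(32, 8, 128, seq_len=1)
fits = max_concurrent_requests(40 * 2**30, 32, 8, 128, seq_len=4096)
```

Since each request holds its cache for as long as it is being served, admitting more requests lengthens every request's residency, which is the kind of feedback the summary describes as service-induced congestion.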
New Attack FragFuse Exploits LLM Agent Memory to Bypass Access Controls
Researchers introduce FragFuse, a novel attack that bypasses access control in large language model agents by fragmenting prohibited queries across interactions and storing them in long-term memory, later reconstructing them without triggering defenses. The attack achieves an 86.3% average bypass success rate across multiple agent settings and exposes a critical vulnerability in memory-based AI systems.
‘Pretty Crazy’ Token Usage Tests Enterprise AI Bets as Companies Balance Costs and Gains
Enterprise adoption of generative AI is driving a new focus on 'tokenomics'—managing the surging cost of tokens. While companies like 8x8 report $5M in savings from Claude, others like Cisco and RBC see token usage skyrocketing. Baseball Lifestyle 101 used Claude to land a $1M order via supply chain insights.
Tensor-Coord: Algebraic Decomposition Enables Conflict-Free Multi-Agent LLM Planning
A new research paper introduces Tensor-Coord, a multilinear algebra framework that represents joint plans of multiple LLM agents as a third-order tensor. By decomposing the tensor, it identifies coordination conflicts and enables iterative replanning, achieving 100% conflict-free plans for 2-agent tasks and 80% for 3-agent tasks in simulated delivery scenarios.
UrbanWell Benchmark Puts Multimodal LLMs to Test on Spatio-Temporal Urban Wellbeing Analytics
Researchers introduce UrbanWell, a large-scale benchmark for evaluating multimodal large language models on spatio-temporal urban wellbeing analytics. The benchmark covers 38 cities, multiple years, and diverse indicators including environment, accessibility, urban form, vitality, and subjective perception. Testing 15 state-of-the-art MLLMs in zero-shot settings reveals substantial performance variations across heterogeneous indicators.
MAF Framework Dynamically Optimizes Prompting for Multimodal Sentiment Analysis
A new research paper proposes the Multimodal Adaptive Few-Shot Prompting (MAF) framework, which improves sentiment analysis in multimodal large language models (MLLMs) by dynamically retrieving and integrating query-relevant demonstrations. The method uses a lightweight coefficient network to fuse multimodal similarity scores and enhances prediction stability via majority voting.
SCAN Framework Helps CTOs Decide When to Use Generative AI for Task Allocation
A new academic paper introduces SCAN, a decision-making framework for task allocation with generative AI. Based on Vygotsky's Zone of Proximal Development and Metacognition, SCAN defines four sub-zones—Substitute, Complement, Aid, Non-negotiable—to guide knowledge workers and students in effectively using GenAI. The framework also addresses cognitive load, cognitive offloading, sycophancy, and the future of work.
LLaMA 3.1's Ethical Reasoning Reveals Frame-Conditioned Moral Computation, Researchers Find
A mechanistic interpretability audit of Meta's LLaMA 3.1-8B-Instruct on 54 moral prompts reveals that the model's ethical reasoning is highly sensitive to surface features of the prompt, a phenomenon called Frame-Conditioned Moral Computation. The study, using the Transluce platform, found domain-specific representations dominate activation lists and that RLHF may re-order surface text without removing underlying biases. The authors call for a new research program, Mechanistic Alignment, to supplement behavioral alignment.
ChatPlanner: LLM Framework Personalizes Public Transit Routing with Fine-Tuning and RAG
Researchers present ChatPlanner, a novel framework that leverages fine-tuned Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to capture diverse user preferences for public transit routing. The system extracts routing parameters from natural language queries, integrates preferences into the routing algorithm, and generates feasible, personalized alternatives. Three experiments show that the combined fine-tuning and RAG approach achieves the highest accuracy and uncovers valuable solutions overlooked by existing route planners.
SpecAlign Framework Uses Synthetic Data to Align Large Language Models with Specific Policies
A research paper introduces SpecAlign, a framework that generates synthetic training data from provider-authored model specifications to align large language models with specific policies. The method combines structured rule annotation, controllable instantiation, and multi-agent adversarial data synthesis to create preference pairs for fine-tuning. Experiments show improved rule compliance without sacrificing general capabilities.
New Framework Automates Skill Construction for Agentic Large Language Models
A new framework called Collective Skill Tree Search (CSTS) automatically constructs reusable skills for large language model (LLM) agents. It uses two iterative phases—collective generation and collective assessment—to build a diverse, generalizable tree of skills that enhances agentic capabilities in planning, tool use, and environment interaction.
Latent Thought Flow: Efficient Reasoning in LLMs Cuts Cost and Boosts Accuracy
Researchers propose Latent Thought Flow (LTF), a method that models LLM reasoning as continuous trajectories in latent space, using GFlowNet and entropy-weighted objectives. LTF outperforms explicit Chain-of-Thought and latent reasoning baselines, achieving 9.5% higher accuracy while cutting reasoning length by 27.2%, addressing the linguistic bottleneck that inflates inference costs.
Think-at-Hard: Selective Latent Iterations Boost LLM Reasoning Accuracy by Up to 6.8%
A new research paper proposes Think-at-Hard (TaH), a looped transformer that selectively performs latent iterations only on tokens likely to be incorrect. By skipping iterations on 93% of tokens, TaH outperforms always-iterate models by 3.8-4.4% and single-iteration baselines by up to 6.8%, while requiring negligible extra parameters.
Skill-to-LoRA: Replacing Runtime Skill Text with Trainable Adapters for Token-Efficient LLM Agents
Researchers propose Skill-to-LoRA (S2L), a technique that converts procedural agent skills from runtime text into trainable LoRA adapters. Evaluated on Qwen3.6-27B, S2L improves pass rate by up to 5.2 percentage points and reduces per-step token cost by 6.6% compared to full skill text prompting.
StateGen Platform Generates Synthetic Training Data for Tool-Augmented LLMs with 9.66/10 Hallucination Score
Researchers introduce StateGen, a synthetic data generation platform that produces scored, reasoning-trace-rich training conversations for tool-augmented LLMs. The platform uses a four-role LLM loop and an authoritative state manager to eliminate tool-call hallucinations, achieving a 9.66/10 score across 64,698 evaluated conversations.
Philosophy Paper Argues Large Language Models Lack Agency for Moral Responsibility
A recent academic paper on arXiv argues that attributing agency or moral responsibility to large language models (LLMs) is misguided. The paper maintains that LLMs produce coherent outputs but are fully characterized by probabilistic input-output mappings, lacking intrinsic intentionality and self-attributed action. This challenges claims that LLMs can be moral agents, with direct relevance to how enterprises govern AI in decision-making.
AgentLeak Benchmark Reveals Internal Channel Privacy Leaks in Multi-Agent LLM Systems
A new benchmark called AgentLeak evaluates privacy leakage in multi-agent large language model (LLM) systems, finding that inter-agent messages leak at a rate of 68.8%, compared with 27.2% for final outputs. Across 1,000 scenarios and five models, total system exposure reaches 68.9%, highlighting risks invisible to standard output-only audits.
New ASRD Method Boosts Diffusion LLM Accuracy by 6.4% and Inference Speed by 7.2×
Researchers propose ASRD (Anchor Supervised Revocable Decoding), a training-free framework that improves diffusion LLM accuracy by up to 6.4% and accelerates inference throughput by up to 7.2×. ASRD addresses error propagation and local error reinforcement in revocable decoding by introducing anchor tokens and two complementary mechanisms.
Training-Free Framework Uses XAI and Multimodal LLMs to Generate Grounded Explanations for Speech Deepfake Detection
Researchers propose a training-free explanation framework that integrates XAI evidence with multimodal large language models to generate grounded and specific explanations for speech deepfake detection. Using the PartialSpoof dataset, the method increases inside accuracy by over 45%, verified through human evaluation and faithfulness checks.
Few-Shot Biomedical Relation Extraction with LLMs: A Viable Alternative to Supervised Learning?
A new study on arXiv investigates few-shot biomedical relation extraction using large language models (LLMs). The best model achieved a micro-F1 of 0.44, surpassing prior few-shot results but falling short of the supervised baseline. On macro-F1, however, prompt-based methods outperformed supervised learning, particularly on rare relation types, highlighting LLMs' potential in low-resource settings.
New Definition of Good Explanations Highlights Challenges in Explaining LLM Outputs
A recent arXiv paper by Louis Mahon, Elliot Ford, and Callum Hackett proposes a definition of good explanations inspired by counterfactual explanations but incorporating the interlocutor's prior beliefs. The authors explore the ramifications for AI explainability, particularly why LLM outputs are difficult to explain well.