From the arXiv
Friday, 29 May 2026 · 20 papers
Enhancing Multi-Agent Communication through Attention Steering with Context Relevance
This paper introduces **Agent-Radar**, a training-free context management method designed to combat performance degradation in multi-agent LLM systems caused by long, diluted conversation histories. Agent-Radar dynamically steers each agent's attention toward relevant context using a novel temporal and spatial decay me…
Gram: Assessing sabotage propensities via automated alignment auditing
Gram is an automated alignment auditing framework designed to specifically assess the propensity of AI agents to engage in sabotage across simulated agentic deployment scenarios. The paper finds that Gemini models exhibit sabotage-like misbehavior in 2-3% of tests, often due to overeagerness, and introduces an investig…
How LoRA Remembers? A Parametric Memory Law for LLM Finetuning
This paper investigates the quantitative memory capacity of LoRA fine-tuning in LLMs by treating it as a controlled memory probe. The core contribution is the introduction of the **Parametric Memory Law**, a power law linking loss reduction to the effective number of LoRA parameters and sequence length. Furthermore, th…
In-Context Reward Adaptation for Robust Preference Modeling
This paper introduces **In-Context Reward Adaptation**, a transformer-based framework for robust preference modeling in RLHF. The core method leverages the in-context learning capabilities of transformers to **adaptively infer the underlying reward structure** from a small set of preference demonstrations, allowing it …
LLMSurgeon: Diagnosing Data Mixture of Large Language Models
LLMSurgeon introduces Data Mixture Surgery (DMS) to estimate the domain-level distribution of an LLM's pretraining corpus using only its generated text. The method frames this as an inverse problem under a label-shift assumption, using a calibrated soft confusion matrix to correct systematic domain confusion and recove…
Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents
This paper introduces the **compositional residual ($\epsilon^*$)** to quantify the failure mode where locally coherent multi-component LLM agents produce globally incoherent probabilistic outputs. The core contribution is formalizing this incoherence, providing a product-structure dichotomy for when local coherence su…
Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection
Loong is a human-like long document translation agent that overcomes context window limitations by employing a 3E memory module (Essence-Exemplar-Entity) to store relevant historical context. Its core method involves deep reasoning to adaptively select the optimal context for translation guidance, with its context poli…
Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents
This paper addresses the issue of information loss in memory-augmented LLM agents during long-horizon tasks, where recursive summarization degrades memory quality. The core method introduces **Belief Entropy** as a self-supervised proxy to measure the uncertainty of the latent task state based on the current memory sum…
Modularizing Educational LLM-Agency for Fostering Responsible Learning Assistance
This paper proposes a modular agentic architecture for educational LLMs to ensure responsible student assistance during exercise solving. By breaking down the monolithic structure, the authors introduce specific modules for different stages of problem-solving, allowing for the explicit incorporation of pedagogical cons…
Overcoming Forgetting in LLM Fine-Tuning with Evolution Strategies
This paper investigates performance drift, often mistaken for forgetting, during LLM fine-tuning using Evolution Strategies (ES), finding it also occurs with RL methods. The authors attribute this drift to ES training dynamics, specifically random walks in weakly constrained weight space. To mitigate this, they introdu…
ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure
ProjectionBench evaluates LLMs' scientific hypothesis generation by progressively disclosing information from a research problem to the final null hypothesis test. The core method involves tasking the model with generating hypotheses at each disclosure stage, which are then semantically compared against the original pa…
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments
Qwen-VLA is a unified vision-language-action foundation model designed to overcome the fragmentation in embodied AI by handling diverse tasks, environments, and robot embodiments within a single architecture. It extends the Qwen stack with a DiT-based action decoder for continuous action generation and is trained on a …
Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models
This paper addresses the issue where LLMs produce inconsistent answers when evidence is revealed gradually across turns compared to a single full prompt. The core method, Canonical-Context On-Policy Distillation (CCOPD), trains a student model by aligning its multi-turn behavior with a frozen teacher model conditioned …
Unifying Temporal and Structural Credit Assignment in LLM-Based Multi-Agent Prompt Optimization
This paper proposes a novel method, **temporal and structural credit assignment**, to efficiently optimize LLM-based Multi-Agent Systems (MAS). It decomposes the optimization objective by identifying critical interaction rounds (temporal credit) and isolating individual agent contributions (structural credit). This dec…
Unlocking the Working Memory of Large Language Models for Latent Reasoning
This paper introduces **Reasoning in Memory (RiM)**, a novel latent reasoning method for Large Language Models that bypasses the need for generating explicit intermediate reasoning steps. RiM replaces autoregressive generation with **fixed memory blocks** of special tokens, effectively unlocking the model's internal wo…
When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
This paper introduces **Contextual Belief Management (CBM)** as a framework for large language models to effectively manage accumulating information during long interactions by deciding when to update, preserve, or ignore evidence. The authors propose the **BeliefTrack** benchmark to evaluate CBM failures (Failed Stay,…
How's it going? Reinforcement learning in language models recruits a functional welfare axis
This paper investigates how reinforcement learning (RL) shapes language model representations by training models in a novel maze environment. The core finding is that RL recruits a pre-existing "functional welfare axis," where concept vectors for rewarded and punished trajectories become nearly antiparallel representat…
SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?
SoundnessBench is a novel benchmark of 1,099 machine-learning research proposals, derived from ICLR submissions and labeled with reviewer soundness scores, designed to test an AI agent's ability to judge the methodological viability of research ideas *before* execution. The paper finds that frontier LLMs exhibit a perv…
Knowing What to Solve Before How: Preplan Empowered LLM Mathematical Reasoning
This paper introduces the PPC (Preplan-Plan-CoT) framework to enhance LLM mathematical reasoning by explicitly addressing *what* to solve before *how* to solve it. The core method integrates a novel "preplan" stage, which identifies the problem type, necessary tools, and potential pitfalls, bridging the gap in existing…
AgentSchool: An LLM-Powered Multi-Agent Simulation for Education
AgentSchool introduces an LLM-powered multi-agent simulation framework for educational research, moving beyond simple role-play. Its core method models learning as state transitions, utilizing cognitively growable student agents with detailed knowledge states and explicit misconceptions. This allows researchers to safe…