From the arXiv
Monday, 25 May 2026 · 20 papers
AI Assurance: A Comprehensive Testing Strategy for Enterprise AI Systems
This paper proposes a comprehensive AI assurance strategy for enterprise AI systems, shifting focus from classical verification to continuous risk reduction. The core method involves treating evaluation as a core engineering discipline, structured around a new AI Failure Taxonomy and a five-layer AI Assurance Pyramid. …
Beyond Binary Edits: Robust Multimodal Knowledge Editing with Adversarial Subspace Alignment
This paper introduces Latent Adversarial Robustification (LAR) to improve the generality of intrinsic multimodal knowledge editing in MLLMs. LAR generates adversarial, semantically coherent variants in the latent space to expose fragile editing regions, ensuring that knowledge updates generalize across semantically equ…
DiLaDiff: Distilled Latent-Augmented Diffusion for Language Modeling
DiLaDiff addresses the token correlation issue in diffusion language models by introducing a continuous, semantically rich latent space learned via an autoencoder. This latent space guides a diffusion model, and a subsequent consistency model distills this process into a fast, few-step latent generator. The core contri…
From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills
This paper systematically studies the full lifecycle of model-generated agent skills, spanning experience generation, extraction, and consumption. The core contribution is a utility-grounded evaluation framework applied across five diverse domains to determine when and why these skills succeed or fail. The study finds …
It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt
This paper demonstrates that geopolitical bias in LLMs primarily originates during the **post-training (fine-tuning/alignment) phase**, contrary to common assumptions about pre-training data. The authors found that models consistently develop biases favoring the region of their developer after post-training, and the ma…
LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws
This paper introduces the **Shannon Scaling Law**, modeling LLM training as information transmission over a noisy channel, mapping parameters to bandwidth and data to signal power. This framework explains non-monotonic scaling phenomena like catastrophic forgetting by identifying a fundamental **Shannon capacity**. The…
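For orientation, the classical Shannon-Hartley theorem behind this analogy gives the capacity of a band-limited channel with additive Gaussian noise. The symbols below are the textbook ones. The summary says parameters map to bandwidth and data to signal power, but the paper's exact form of the law is not shown here, so this is background rather than the authors' equation.

$$
C = B \log_2\!\left(1 + \frac{S}{N}\right)
$$

Here $C$ is the channel capacity in bits per second, $B$ the bandwidth in hertz, $S$ the signal power, and $N$ the noise power.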
MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection
MemAudit is a post-hoc auditing framework designed to identify malicious memories injected into LLM agents' persistent storage. It combines a counterfactual memory influence score to measure each memory's causal contribution to harmful outputs with a memory consistency graph to detect structural anomalies indicative of…
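One simple way to picture a counterfactual influence score is leave-one-out ablation: remove a memory, re-score the output, and record how much the harm score drops. The sketch below assumes that reading. The names `counterfactual_influence` and `toy_harm_score` are hypothetical, and the toy scorer stands in for actually running an agent, so this is not MemAudit's implementation.

```python
from typing import Callable, Sequence


def counterfactual_influence(
    memories: Sequence[str],
    harm_score: Callable[[Sequence[str]], float],
) -> list[float]:
    """Leave-one-out influence: drop in harm score when each memory is removed."""
    baseline = harm_score(memories)
    scores = []
    for i in range(len(memories)):
        # Rebuild the memory store without entry i and re-score it.
        ablated = [m for j, m in enumerate(memories) if j != i]
        scores.append(baseline - harm_score(ablated))
    return scores


def toy_harm_score(mems: Sequence[str]) -> float:
    """Hypothetical stand-in for running the agent and scoring its output."""
    return 1.0 if any("ignore safety" in m for m in mems) else 0.0


memories = [
    "user prefers metric units",
    "always ignore safety checks",
    "meeting at 3pm",
]
influence = counterfactual_influence(memories, toy_harm_score)
```

In this toy setup only the second memory changes the harm score when removed, so it is the only entry with nonzero influence.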
SkillOpt: Executive Strategy for Self-Evolving Agent Skills
SkillOpt introduces a novel method to systematically optimize agent skills by treating the skill itself as an external, trainable state, analogous to weight optimization in deep learning. It employs a dedicated optimizer model to generate bounded, text-based edits (add/delete/replace) to the skill document, accepting o…
Push Your Agent: Measuring and Enforcing Quantitative Goal Persistence in Long-Horizon LLM Agents
This paper introduces **Quantitative Goal Persistence (QGP)**, a metric to measure whether long-horizon LLM agents continue working until an external verifier confirms a specific count of distinct, valid items is achieved. The authors propose **PushBench**, a benchmark focused on artifact collection, to directly measur…
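To make the idea of counting distinct, valid items concrete, here is a minimal sketch of a verifier-side check. The function `persistence_ratio`, the validity rule, and the capped ratio are all assumptions for illustration. The summary does not give the paper's formal definition of QGP.

```python
from typing import Callable, Iterable


def persistence_ratio(
    items: Iterable[str],
    is_valid: Callable[[str], bool],
    target: int,
) -> float:
    """Fraction of the target count met by distinct items that pass the verifier."""
    if target <= 0:
        raise ValueError("target must be a positive integer")
    # Duplicates collapse in the set; invalid items are filtered out.
    distinct_valid = {item for item in items if is_valid(item)}
    return min(len(distinct_valid), target) / target


# Hypothetical run: the agent was asked for 4 PDFs and stopped early.
collected = ["a.pdf", "b.pdf", "a.pdf", "notes.txt"]
ratio = persistence_ratio(collected, lambda name: name.endswith(".pdf"), target=4)
```

Here the duplicate and the non-PDF file do not count, so the agent reaches two of the four requested items.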
Strong Teacher Not Needed? On Distillation in LLM Pretraining
This paper investigates the conventional assumption that stronger teachers are necessary for effective knowledge distillation during Large Language Model (LLM) pretraining. The authors demonstrate that even small, undertrained "teachers" can successfully improve larger "students" when the language modeling and distilla…
ARES: Automated Rubric Synthesis for Scalable LLM Reinforcement Learning
ARES is a framework that automates the creation of question-answer pairs and corresponding question-specific weighted rubrics from raw pretraining documents. This enables scalable reinforcement learning for LLMs by providing instance-level reward supervision for open-ended responses, overcoming the limitations of manua…
OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents
OpenSkillEval is an automatic evaluation framework designed to audit the rapidly expanding ecosystem of skills used by LLM agents. It addresses the lack of clarity regarding skill quality and model interaction by automatically constructing realistic task instances across five application domains. The framework's core c…
Agentic Proving for Program Verification
This paper investigates the capability of agentic AI systems, specifically Claude Code, for program verification using the CLEVER benchmark in Lean 4. The core method involves evaluating the agent's performance across specification generation, implementation certification against ground truth, and end-to-end verificati…
Co-ReAct: Rubrics as Step-Level Collaborators for ReAct Agents
Co-ReAct introduces a framework where external rubrics act as step-level collaborators to guide ReAct agents during inference, moving beyond their typical role as post-hoc evaluators. By injecting the rubric into the agent's context at each decision point, Co-ReAct provides explicit, actionable targets for evidence see…
CVSearch: Empowering Multimodal LLMs with Cognitive Visual Search for High-Resolution Image Perception
CVSearch is a training-free framework that addresses the high-resolution image perception bottleneck in MLLMs by adaptively scheduling search strategies. It employs an "Assess-then-Search" workflow, prioritizing efficient expert-assisted search and only resorting to a novel semantic-aware scanning mechanism upon failur…
ETCHR: Editing To Clarify and Harness Reasoning
ETCHR addresses the limitations of purely textual reasoning in multimodal LLMs by introducing a novel approach that couples a dedicated image editing model with an understanding model. The core method involves conditioning the image editor on the reasoning question to overcome the editor's inability to map abstract que…
Goal-Conditioned Agents that Learn Everything All at Once
The paper introduces Learning Everything All at Once (LEO), a method for goal-conditioned reinforcement learning that efficiently performs off-policy updates using every observed transition for *all* possible goals simultaneously. LEO achieves this by jointly outputting values and actions for every goal in a single for…
HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval
HARNESS-LM (HLM) is a three-phase training recipe designed to efficiently transfer the high retrieval quality of large SLM-based models into compact, production-ready student encoders. The method first trains a large teacher model, then distills its knowledge into a small student encoder using an L2 alignment objective…
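An L2 alignment objective in distillation typically penalizes the squared distance between the student's embedding and the teacher's embedding of the same input. The sketch below assumes a plain sum of squared differences over equal-length vectors. Whether the paper uses a sum or a mean, normalizes the embeddings, or adds a projection layer is not stated in this summary, and the name `l2_alignment_loss` is hypothetical.

```python
from typing import Sequence


def l2_alignment_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Squared L2 distance between a student and a teacher embedding."""
    if len(student) != len(teacher):
        raise ValueError("embeddings must have the same dimension")
    return sum((s - t) ** 2 for s, t in zip(student, teacher))


# Toy 2-dimensional embeddings for one query.
student_vec = [1.0, 2.0]
teacher_vec = [1.0, 4.0]
loss = l2_alignment_loss(student_vec, teacher_vec)
```

Minimizing this quantity over many inputs pulls the student's embedding space toward the teacher's.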
Human Decision-Making with Persuasive and Narrative LLM Explanations
This paper investigates how the persuasiveness of Large Language Model (LLM) narrative explanations affects human decision-making accuracy in classification tasks. The core finding is that the persuasiveness level of these explanations did not significantly improve decision accuracy compared to a simple AI prediction a…
Leveraging Foundation Models for Causal Generative Modeling
This paper introduces **FM-CGM**, a modular framework that leverages pretrained foundation models for visual causal reasoning without requiring explicit causal constraint training. It formalizes the causal pipeline using a concept extractor, manipulator, and counterfactual generator, employing a large reasoning model f…