From the arXiv
Thursday, 28 May 2026 · 20 papers
Blind PRNG Hijacking: An Undetectable Integrity-Preserving Attack Against LLM Watermarking
This paper introduces **SeedHijack**, a novel, undetectable attack against LLM watermarking that targets the underlying Pseudo-Random Number Generator (PRNG) in the supply chain. The core method replaces the PRNG to bias green-list selection without altering the output tokens or requiring knowledge of the watermark key…
DREAM-R: Multimodal Speculative Reasoning with RL-Based Refined Drafting, Precise Verification, and Fully Parallel Execution
DREAM-R enhances speculative reasoning in multimodal models using a novel reinforcement learning objective, Speculative Alignment Policy Optimization (SAPO), to train draft models for generating concise and faithful reasoning steps. It incorporates a Threshold-based Verification Mechanism (TBVM) for stable acceptance o…
LiveBrowseComp: Are Search Agents Searching, or Just Verifying What They Already Know?
This paper introduces the **LiveBrowseComp** benchmark to diagnose whether LLM search agents genuinely search or merely verify their intrinsic knowledge. The core method involves analyzing agent behavior on the original BrowseComp dataset, revealing significant **Intrinsic Knowledge Dependence (IKD)** where agents rely…
MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems
MemTrace introduces a novel framework to trace and attribute errors in large language model memory systems by transforming memory pipelines into executable memory evolution graphs. This allows for fine-grained tracking of information flow and systematic analysis of failure modes using the new MemTraceBench benchmark. T…
OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration
This paper introduces OmniVerifier-M1, a multimodal meta-verifier that uses symbolic outputs (like bounding boxes) as effective rationales for training, outperforming textual explanations. The core method involves decoupling the reinforcement learning objectives for binary judgment and meta-verification, which signific…
Position: Retire the "Positive Backdoor" Label -- Secret Alignment Requires Strict and Systematic Evaluation
This paper argues for retiring the term "positive backdoor" and replacing it with "Secret Alignment" to describe trigger-activated hidden behaviors in AI models. The core contribution is establishing that security claims based on Secret Alignment should be considered insecure by default, requiring rigorous, standardize…
Rethinking Memory as Continuously Evolving Connectivity
This paper introduces **FluxMem**, a novel memory framework for LLM agents that models memory as a **continuously evolving, heterogeneous graph**. FluxMem dynamically refines its topology through stages of formation, feedback-driven refinement, and consolidation, allowing it to adapt to dynamic environments by repairin…
Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem
This paper analyzes 3,984 AI agent skills to uncover emerging security threats within the agent skill ecosystem. The core contribution is the identification of 76 confirmed malicious payloads and the development of a real-world threat taxonomy based on observed attack patterns, demonstrating that a significant percenta…
The Importance of Being Statistically Earnest: A Critical Re-evaluation of GSM-Symbolic
This paper critically re-evaluates the GSM-Symbolic benchmark, arguing that its conclusion of widespread LLM reasoning failure is statistically unsound. Using Generalised Linear Mixed Models, the authors find that only half of the tested models show statistically significant performance drops under the original prompting. Furthermo…
TRACER: Turn-level Regret Matching with Inner Reinforcement Credit for Cooperative Multi-LLM Reasoning
TRACER is a novel turn-level reinforcement framework designed to integrate reinforcement learning with multi-LLM cooperation. It uses a controller-regret layer employing regret matching to decide whether agents should speak or skip, and a generation-credit layer that optimizes utterances using role-specific rewards. Th…
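For readers unfamiliar with regret matching, the following is a minimal sketch of the textbook procedure applied to a two-action speak/skip choice. It is not TRACER's actual controller: the function names, the `ACTIONS` tuple, and the per-turn utilities are hypothetical placeholders, and how the paper derives utilities from its role-specific rewards is not covered by the summary above.

```python
def regret_matching_policy(cumulative_regret):
    """Turn cumulative regrets into action probabilities (standard regret matching)."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret yet: fall back to uniform play.
        return [1.0 / len(cumulative_regret)] * len(cumulative_regret)
    return [p / total for p in positive]


def update_regret(cumulative_regret, action_utilities, policy):
    """Add each action's advantage over the policy's expected utility."""
    expected = sum(p * u for p, u in zip(policy, action_utilities))
    return [r + (u - expected) for r, u in zip(cumulative_regret, action_utilities)]


ACTIONS = ("speak", "skip")  # hypothetical action set for one agent
regret = [0.0, 0.0]

# Hypothetical utilities for (speak, skip) on three turns; a real system
# would obtain these from its own reward signal.
turn_utilities = [(1.0, 0.0), (0.2, 0.5), (0.9, 0.1)]

for utilities in turn_utilities:
    policy = regret_matching_policy(regret)
    regret = update_regret(regret, utilities, policy)

final_policy = regret_matching_policy(regret)
print(dict(zip(ACTIONS, final_policy)))
```

Actions whose cumulative regret is negative receive zero probability, so the policy shifts toward whichever choice would have paid off more in hindsight.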
Interpretability-Guided Layer Selection over Subspace Projection: SAEs as Stethoscopes, Not Scalpels, for Raw Task Vector Model Editing
This paper investigates using Sparse Autoencoders (SAEs) to guide model editing by projecting task vectors onto SAE feature subspaces for mathematical reasoning. The core finding is that this projection acts as an information bottleneck, discarding most modification energy and failing to yield significant improvements …
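The bottleneck effect can be illustrated with a small sketch that projects a vector onto the span of a few feature directions and reports how much of its squared norm survives. This is an assumption-laden toy: treating "modification energy" as the squared L2 norm, the helper names, and the three-dimensional example vectors are all illustrative choices, not the paper's setup.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis


def project(vector, basis):
    """Orthogonal projection of `vector` onto the span of an orthonormal `basis`."""
    proj = [0.0] * len(vector)
    for b in basis:
        coeff = dot(vector, b)
        proj = [p + coeff * bi for p, bi in zip(proj, b)]
    return proj


def retained_energy(vector, feature_directions):
    """Fraction of the vector's squared norm kept after projection."""
    basis = orthonormalize(feature_directions)
    proj = project(vector, basis)
    return dot(proj, proj) / dot(vector, vector)


# Hypothetical task vector and two feature directions spanning the x-y plane.
task_vector = [3.0, 4.0, 12.0]
feature_directions = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

fraction = retained_energy(task_vector, feature_directions)
print(f"retained energy: {fraction:.3f}")
```

Here most of the toy vector lies outside the chosen subspace, so the projection keeps only a small share of its norm, which is the kind of loss the summary describes.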
PEFT-Arena: Understanding Parameter-Efficient Finetuning from a Stability-Plasticity Perspective
This paper introduces **PEFT-Arena**, a benchmark that evaluates Parameter-Efficient Finetuning (PEFT) methods based on the **stability-plasticity dilemma**: balancing adaptation to a new task against retaining original capabilities. The core contribution is demonstrating that different PEFT methods exhibit distinct st…
Understanding Generalization and Forgetting in In-Context Continual Learning
This paper introduces the first theoretical framework to analyze in-context continual learning (ICL) in Large Language Models processing sequential, heterogeneous tasks within a single prompt. By modeling shared attention mechanisms, particularly linear and masked linear attention, the authors derive error expressions …
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning
This paper introduces AXPO (Agent eXplorative Policy Optimization) to address the "Thinking-Acting Gap" in agentic reasoning, where tool use is infrequent and often leads to failed learning signals. AXPO's core method involves fixing the successful thinking prefix of failed tool-using trajectories and then resampling t…
Mobile-Aptus: Confidence-Driven Proactive and Robust Interaction in MLLM-based Mobile-Using Agents
This paper introduces **Mobile-Aptus**, a confidence-driven framework to mitigate both over-execution and over-soliciting in MLLM-based mobile agents. The core method integrates a **universal confidence framework** across two stages: interaction capability empowerment and confidence bias correction. This allows agents …
Self-Improving Language Models with Bidirectional Evolutionary Search
This paper introduces Bidirectional Evolutionary Search (BES), a novel self-improvement framework for language models that overcomes the limitations of sparse feedback and restricted exploration in traditional search methods. BES couples a **forward search** using evolutionary operators to recombine trajectories, with …
Adaptive Multimodal Agents-Based Framework for Automatic Workflow Execution
This paper introduces an adaptive multimodal multi-agent framework for autonomous workflow execution that overcomes the limitations of fragmented, linear task processing. The core method involves an offline phase to construct a topological knowledge base from execution logs, which agents then leverage during inference.…
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation
AutoScientists is a decentralized system of self-organizing AI agents designed for long-running scientific experimentation. Agents collaboratively interpret shared state, form teams around promising hypotheses, critique proposals, and share results to avoid redundant work. This approach significantly improves performan…
Calibrating Conservatism for Scalable Oversight
The paper introduces **Calibrated Collective Oversight (CCO)**, a method for scalable oversight of advanced AI agents. CCO aggregates diverse auxiliary scores into a penalty that measures deviation from a conservative baseline, allowing high-utility actions to proceed unless overseer concern accumulates. This conservat…
Do Agents Need Semantic Metadata? A Comparative Study in Agentic Data Retrieval
This paper compares the effectiveness of two agentic data retrieval methods: one in which LLMs search the open web, and another in which an LLM agent draws on structured **schema.org semantic metadata**. The core contribution is an **LLM-as-a-judge evaluation** framework, aligned with FAIR principles, to as…