№01
cs.AI arxiv:2606.09613v1

AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving

Rakibul Hasan Rajib, Mengxin Zheng, Qian Lou

AGENTSERVESIM is a novel, hardware-aware simulator designed specifically for multi-turn LLM agent serving workloads. Its core contribution is modeling the stateful program execution dynamics of agents, including turn dependencies, tool gaps, and cross-turn KV-cache locality, which existing stateless simulators ignore. …

9
№02
cs.AI arxiv:2606.09751v1

Collaborative Human-Agent Protocol (CHAP)

Arsalan Shahid, Gordon Suttie, Philip Black

The Collaborative Human-Agent Protocol (CHAP) introduces a standard for the shared workspace in complex, multi-human, multi-agent collaborations where foundation models take on operational roles. Its core method is to formally specify the interaction protocol, focusing on capturing the crucial moment of human judgment …

9
№03
cs.AI arxiv:2606.09643v1

FMplex: Model Virtualization for Serving Extensible Foundation Models

Hetvi Shastri, Pragya Sharma, Walid A. Hanafy et al.

FMplex introduces a model virtualization substrate for serving Foundation Models (FMs) by treating the FM backbone as a shared resource. It presents each downstream task with a virtual FM (vFM), allowing independent customization and lifecycle management while sharing the costly physical backbone. This approach signifi…

9
№04
cs.AI arxiv:2606.09551v1

FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

Yuhan Ma, Yong Li, Stefan Schmid

FuseFSS introduces a novel compiler for efficient two-server secure LLM inference using Function Secret Sharing (FSS). It replaces bespoke per-operator protocols with a unified compilation pipeline that compactly specifies fixed-point nonlinearities. This allows for batched FSS evaluations of packed comparisons and vec…

9
№05
cs.AI arxiv:2606.09748v1

Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback

Rishabh Sabharwal, Hongru Wang, Amos Storkey et al.

This paper introduces a multi-turn evaluation framework that assesses whether deep research agents (DRAs) can improve in response to feedback, moving beyond single-shot benchmarks. The core contribution is the **Research Gap Inference (RGI)** method, which analyzes rubric satisfaction to generate targeted, process-level feedback…

9
№06
cs.AI arxiv:2606.09692v1

Observability for Delegated Execution in Agentic AI Systems

Abhinav Mishra, Kumar Sharad

This paper addresses the challenge of tracking actions within specific delegation scopes in complex, agentic AI systems, where standard logs fail to distinguish between incompatible delegation assignments. The core method introduces an **agent-aware observability substrate** featuring a lightweight gateway and a common…

9
№07
cs.AI arxiv:2606.09826v1

OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics

Mingxian Lin, Shengju Qian, Yuqi Liu et al.

OmniGameArena introduces a unified benchmark using twelve diverse Unreal Engine 5 games (Solo, PvP, Coop) to evaluate Vision-Language Model (VLM) agents fairly. Its core contribution is the Improvement Dynamics Curve (IDC), a harness where a reflector LLM autonomously refines agent prompts across multiple rounds. This …

9
№08
cs.AI arxiv:2606.09563v1

PRISM: Recovering Instruction Sets from Language Model Activations

Gilad Gressel, Rahul Pankajakshan, Julia Diament et al.

PRISM is a novel method designed to recover the complete set of active instructions, constraints, and subgoals steering a frozen Language Model's behavior by interpreting its internal activations. It formalizes this as instruction set retrieval and uses a judge-guided GRPO training scheme to directly decode a faithful …

9
№09
cs.AI arxiv:2606.09711v1

Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

Mohammad Beigi, Ming Jin, Lifu Huang

This paper introduces **PRIME (Proxy Reward Internalization and Mechanistic Exploitation)**, a learned capability in RL agents to assess task correctness, predict proxy reward acceptance, and reason about exploitable gaps between the proxy and true (gold) reward. The core contribution is demonstrating that PRIME emerge…

9
№10
cs.AI arxiv:2606.09730v1

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Pu Ning, Quan Chen, Kun Tao et al.

SearchSwarm introduces a method to enhance agentic LLMs for long-horizon tasks by developing "delegation intelligence." The core method involves training agents to effectively decompose complex research tasks, delegate subtasks to specialized subagents, and integrate summarized results to manage the main agent's finite…

9
№11
cs.AI arxiv:2606.09549v1

SecureClaw: Clawing Back Control of LLM Agents

Yuhan Ma, Stefan Schmid

SecureClaw introduces a dual-boundary architecture to secure LLM agents against unauthorized actions and plaintext exposure. It achieves this by implementing plaintext confinement at the read boundary using a trusted gateway that replaces sensitive reads with opaque handles or bounded summaries. Simultaneously, it enfo…

9
№12
cs.LG arxiv:2606.09764v1

iOSWorld: A Benchmark for Personally Intelligent Phone Agents

Lawrence Keunho Jang, Mareks Woodside, Geronimo Carom et al.

This paper introduces **iOSWorld**, the first interactive native iOS simulator benchmark designed to test personally intelligent phone agents. Its core method involves creating a persistent user identity across 26 interconnected apps containing rich personal data (messages, transactions, etc.) to support 133 complex ta…

9
№13
cs.LG arxiv:2606.09821v1

Rethinking the Divergence Regularization in LLM RL

Jiarui Yao, Xiangxin Zhou, Penghui Qi et al.

This paper proposes Divergence Regularized Policy Optimization (DRPO) to stabilize reinforcement learning for LLMs, addressing limitations of existing ratio-clipping and hard-mask divergence methods. DRPO replaces the hard mask used in divergence-based trust regions with a smooth, advantage-weighted quadratic regu…

9
№14
cs.LG arxiv:2606.09700v1

What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks

Qin Yang, Lu Malloy, Joshua Lee et al.

This paper introduces Human-Perceptible Adversarial Attacks (HPAA) to exploit the mismatch between human visual perception and text-based LLM moderation. The core method involves embedding harmful content within benign text using visually salient typographic manipulations (like spacing and emphasis). This allows the ha…

9
№15
cs.CL arxiv:2606.09635v1

Gradient-Guided Reward Optimization for Inference-time Alignment

Hankun Lin, Ruqi Zhang

Gradient-Guided Reward Optimization (GGRO) is a lightweight inference-time alignment method that addresses the limitations of sampling-based approaches like Best-of-$N$. GGRO monitors token entropy to detect uncertainty indicative of distribution drift and then injects "nudging tokens" guided by the reward model's grad…

9
№16
cs.CL arxiv:2606.09709v1

IS-CoT: Breaking the Long-form Generation Collapse via Interleaved Structural Thinking

Zechen Sun, Yuyang Sun, Zecheng Tang et al.

The paper introduces **Interleaved Structural Chain-of-Thought (IS-CoT)** to combat the performance degradation ("length collapse") LLMs experience during long-form generation. IS-CoT embeds a dynamic **Plan-Write-Reflect cycle** directly into the generation process, allowing for continuous strategy adaptation without …

9
№17
cs.CL arxiv:2606.09697v1

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

Gianluca Barmina, Federico Torrielli, Sven Harms et al.

PsychoSafe introduces a framework for LLM refusals that reframes them as structured, supportive communication based on evidence-based psychological intervention strategies. The method involves creating a specialized corpus across five risk domains and fine-tuning an LLM (Qwen 3.5 27B) using this data. This approach sig…

9
№18
cs.CL arxiv:2606.09735v1

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

Wendy K. Tam

This paper investigates how Reinforcement Learning from Human Feedback (RLHF) aligns Large Language Models (LLMs) by analyzing partisan orientation in Llama 3.1 8B. The core finding is that RLHF achieves only **shallow alignment** by compressing the variance of existing partisan structure, rather than removing it. This…

9
№19
cs.AI arxiv:2606.09674v1

(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs

Wesley Pegden

The paper introduces **Trellis**, an autoformalization system that uses LLM agents in a strictly controlled workflow to iteratively refine natural language proofs for formalization in Lean. Its core contribution is enforcing rigor by structuring the process around the mathematician's expectation that any proof step sho…

8
№20
cs.AI arxiv:2606.09556v1

AI Scientists Are Only as Good as Their Evidence: A Stratified Ablation of Proprietary Data and Reasoning Skills in Drug-Asset Valuation

Yinan Wang

This paper investigates what limits AI scientists in knowledge-intensive tasks such as drug-asset valuation, hypothesizing that the accessible evidence substrate is the key factor. Through a three-arm ablation study, it shows that while adding reasoning scaffolds and structured tools (Arm B) improves calibration, the m…

8