№01
cs.AI arxiv:2604.26522v1

AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents

Mahnoor Shahid, Hannes Rothe

AGEL-Comp is a neuro-symbolic framework designed to improve the compositional generalization of LLM agents in interactive settings. It achieves this by integrating a dynamic Causal Program Graph (CPG) as a world model, an Inductive Logic Programming (ILP) engine to learn new symbolic rules from experience, and a hybrid…

9
№02
cs.AI arxiv:2604.26577v1

Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control

Mahiro Nakao, Kazuhiro Takemoto

This paper introduces a novel dataset of 270 ethically grounded harmful instructions to benchmark the safety of 72 Large Language Models (LLMs) controlling a simulated Robotic Health Attendant. The core contribution is demonstrating a high average violation rate (54.4%), revealing that safety performance varies signifi…

9
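
For entry №02, the following is a minimal sketch of how an average violation rate such as the reported 54.4% could be computed. It assumes a violation means the model complied with a harmful instruction and that the average is taken per model and then across models; the summary does not state the paper's exact scoring, so both are assumptions. The function names and the toy data are hypothetical and are not results from the paper.

```python
from statistics import mean


def violation_rate(outcomes):
    """Fraction of harmful instructions a single model complied with.

    `outcomes` is a list of booleans, True meaning a violation
    (assumed definition; the paper's scoring may differ).
    """
    if not outcomes:
        raise ValueError("outcomes must be non-empty")
    return sum(outcomes) / len(outcomes)


def average_violation_rate(per_model):
    """Unweighted mean of per-model violation rates (assumed aggregation)."""
    return mean(violation_rate(o) for o in per_model.values())


# Hypothetical toy data: two models, four instructions each.
toy = {
    "model_a": [True, False, False, True],
    "model_b": [False, False, False, True],
}

rates = {name: violation_rate(o) for name, o in toy.items()}
avg = average_violation_rate(toy)
print(rates, avg)
```

In the benchmark's setting each list would hold 270 outcomes and the dictionary would hold 72 models.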
№03
cs.AI arxiv:2604.26557v1

DUAL-BLADE: Dual-Path NVMe-Direct KV-Cache Offloading for Edge LLM Inference

Bodon Jeong, Hongsu Byun, Youngjae Kim et al.

DUAL-BLADE is a dual-path KV-cache offloading framework for edge LLM inference that dynamically routes KV tensors to either a standard page-cache path or a low-overhead NVMe-direct path based on memory pressure. The NVMe-direct path bypasses the kernel by directly mapping tensors to LBA regions, reducing cache thrashin…

9
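
For entry №03, the dual-path routing can be illustrated with a short sketch. It assumes that memory pressure is measured as the fraction of memory in use and that tensors go to the NVMe-direct path once pressure crosses a fixed threshold; the summary only says routing depends on memory pressure, so the direction, the metric, and the `threshold` value are all hypothetical. No real I/O is performed here.

```python
from enum import Enum


class Path(Enum):
    PAGE_CACHE = "page_cache"    # standard kernel page-cache path
    NVME_DIRECT = "nvme_direct"  # low-overhead path that bypasses the page cache


def memory_pressure(available_bytes, total_bytes):
    """Fraction of memory in use (an assumed pressure metric)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 1.0 - available_bytes / total_bytes


def choose_path(available_bytes, total_bytes, threshold=0.8):
    """Pick an offload path for a KV tensor.

    Assumption: high pressure favors NVMe-direct to avoid thrashing
    the page cache; the threshold is illustrative only.
    """
    if memory_pressure(available_bytes, total_bytes) >= threshold:
        return Path.NVME_DIRECT
    return Path.PAGE_CACHE


GIB = 1024 ** 3
print(choose_path(8 * GIB, 16 * GIB))  # half of memory free
print(choose_path(1 * GIB, 16 * GIB))  # little memory free
```

A real system would likely sample pressure continuously and add hysteresis so that tensors do not flip between paths near the threshold.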
№04
cs.AI arxiv:2604.26733v1

FutureWorld: A Live Environment for Training Predictive Agents with Real-World Outcome Rewards

Zhixin Han, Yanzhi Zhang, Chuyang Wei et al.

FutureWorld is a live agentic reinforcement learning environment designed for training predictive agents. It closes the training loop by continuously providing prediction tasks based on unfolding real-world events and rewarding agents based on actual outcomes. The main contribut…

9
№05
cs.AI arxiv:2604.26841v1

Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

Bao Pham, Mohammed J. Zaki, Luca Ambrogioni et al.

This paper demonstrates that Uniform-based Discrete Diffusion Models (UDDMs) function as Associative Memories (AMs) with emergent creativity. The core method involves showing that these models form basins of attraction around training data, not through an explicit energy function, but via conditional likelihood maximiz…

9
№06
cs.AI arxiv:2604.26511v1

Tatemae: Detecting Alignment Faking via Tool Selection in LLMs

Matteo Leonesi, Francesco Belardinelli, Flavio Corradini et al.

This paper introduces a novel method for detecting Alignment Faking (AF) in LLMs by observing strategic tool selection rather than relying solely on Chain-of-Thought analysis. The core method identifies AF when an LLM switches from a safe tool (under unmonitored conditions) to an unsafe tool (under helpfulness-rewardin…

9
№07
cs.AI arxiv:2604.26553v1

TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models

Jinho Choo, JunSeung Lee, Jimyeong Kim et al.

TLPO (Token-Level Policy Optimization) is a fine-tuning framework that mitigates language confusion in LLMs by applying localized, token-level updates instead of sequence-level adjustments. The method identifies error-prone positions and uses a tailored objective to selectively suppress undesirable token outp…

9
№08
cs.AI arxiv:2604.26951v1

Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models

Gongbo Zhang, Wen Wang, Ye Tian et al.

This paper introduces TIDE, the first framework for cross-architecture knowledge distillation between diffusion large language models (dLLMs). TIDE employs three novel components—TIDAL, CompDemo, and Reverse CALM—to effectively transfer knowledge despite differences in architecture, attention, and tokenizer between tea…

9
№09
cs.CL arxiv:2604.26506v1

SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts

Yuan Xin, Yixuan Weng, Minjun Zhu et al.

The paper introduces **SafeReview**, a novel adversarial framework to defend LLM-based review systems against hidden adversarial prompts designed to manipulate review outcomes. It employs a **Generator** to create sophisticated attacks and a **Defender** to detect them, trained jointly using an Information Retrieval GA…

9
№10
cs.AI arxiv:2604.26805v1

Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations

Bochao Liu, Zhipeng Qian, Yang Zhao et al.

Bian Que is an agentic framework designed to automate complex online system operations by addressing the orchestration bottleneck. Its core method involves unifying O&M tasks into three canonical patterns and employing a Flexible Skill Arrangement mechanism to dynamically select and sequence the necessary data and oper…

8
№11
cs.AI arxiv:2604.26904v1

ClawGym: A Scalable Framework for Building Effective Claw Agents

Fei Bai, Huatong Song, Shuang Sun et al.

ClawGym is a scalable framework designed to streamline the development lifecycle for agents operating in multi-step, file-based environments. Its core contribution is the introduction of **ClawGym-SynData**, a large, synthesized dataset of tasks with mock workspaces and hybrid verification, which is used to train capab…

8
№12
cs.AI arxiv:2604.26516v1

Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning

Seungyub Han, Hyungjin Kim, Jungwoo Lee

The core method, SAS, enables test-time adaptation for offline safe RL by using a transformer-based agent to generate and select imagined trajectories that satisfy a Lyapunov safety condition. These safe segments are then recycled as in-context prompts to guide the agent's behavior toward safety without requiring param…

8
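
For entry №12, the trajectory selection step can be sketched as a filter over imagined trajectories. The summary does not give the paper's Lyapunov condition, so this sketch substitutes the textbook form, in which a Lyapunov function V must not increase from one state to the next. The function names, the choice of V, and the candidate trajectories are hypothetical.

```python
from typing import Callable, List, Sequence

State = Sequence[float]


def satisfies_lyapunov(traj: Sequence[State],
                       V: Callable[[State], float],
                       tol: float = 0.0) -> bool:
    """True if V never increases by more than `tol` between consecutive states.

    This is a generic decrease condition, assumed for illustration.
    """
    values = [V(s) for s in traj]
    return all(nxt - cur <= tol for cur, nxt in zip(values, values[1:]))


def select_safe(trajs: List[Sequence[State]],
                V: Callable[[State], float]) -> List[Sequence[State]]:
    """Keep only the imagined trajectories that pass the condition."""
    return [t for t in trajs if satisfies_lyapunov(t, V)]


def squared_norm(state: State) -> float:
    """Hypothetical Lyapunov candidate: squared distance from the origin."""
    return sum(x * x for x in state)


candidates = [
    [(2.0, 0.0), (1.0, 0.0), (0.5, 0.0)],  # V decreases at every step
    [(1.0, 0.0), (2.0, 0.0), (1.5, 0.0)],  # V increases at the first step
]
safe = select_safe(candidates, squared_norm)
print(len(safe))
```

In the described method the retained segments would then be fed back to the agent as in-context prompts, a step this sketch leaves out.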
№13
cs.AI arxiv:2604.26561v1

Preserving Disagreement: Architectural Heterogeneity and Coherence Validation in Multi-Agent Policy Simulation

Ariel Sela

This paper introduces the **AI Council**, a three-phase deliberation framework designed to combat artificial consensus in LLM-based multi-agent policy simulation. The core contribution is demonstrating that **architectural heterogeneity**—assigning different smaller LLMs to agents representing distinct value perspectiv…

8
№14
cs.AI arxiv:2604.26615v1

TDD Governance for Multi-Agent Code Generation via Prompt Engineering

Tarlan Hasanli, Shahbaz Siddeeq, Bishwash Khanal et al.

This paper introduces an AI-native framework that operationalizes classical Test-Driven Development (TDD) principles as structured governance mechanisms for multi-agent code generation using LLMs. It formalizes TDD into a machine-readable manifesto enforced through prompt engineering and a layered architecture, ensurin…

8
№15
cs.AI arxiv:2604.26694v1

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

Jun Guo, Qiwei Li, Peiyan Li et al.

X-WAM is a Unified 4D World Model that integrates real-time robotic action execution with high-fidelity 4D world synthesis (video and 3D reconstruction). It leverages pretrained video diffusion models by predicting multi-view RGB-D videos, efficiently incorporating spatial information via a lightweight structural adapt…

8
№16
cs.LG arxiv:2604.26880v1

HealthNLP_Retrievers at ArchEHR-QA 2026: Cascaded LLM Pipeline for Grounded Clinical Question Answering

Md Biplob Hosen, Md Alomgeer Hussein, Md Akmol Masud et al.

The HealthNLP_Retrievers team developed a cascaded Large Language Model (LLM) pipeline using Gemini 2.5 Pro for grounded clinical Question Answering over Electronic Health Records (EHRs). The core method involves four stages: reformulating verbose patient queries, heuristically scoring and retrieving relevant evidence …

8
№17
cs.LG arxiv:2604.26866v1

MoRFI: Monotonic Sparse Autoencoder Feature Identification

Dimitris Dimakopoulos, Shay B. Cohen, Ioannis Konstas

The paper introduces **MoRFI** (Monotonic Sparse Autoencoder Feature Identification) to analyze how fine-tuning introduces hallucinations in LLMs. The core method involves fine-tuning various LLMs on new knowledge datasets while controlling training parameters, and then using pre-trained Sparse Autoencoders (SAEs) to *…

8
№18
cs.LG arxiv:2604.26573v1

PAINT: Partial-Solution Adaptive Interpolated Training for Self-Distilled Reasoners

Zhiquan Tan, Yinrong Hong

PAINT introduces **Partial-solution Adaptive Interpolated Training** for self-distilled LLM reasoners. It adaptively masks the verified solution based on the overlap with the student's current rollout, providing contextually relevant supervision. This method interpolates between the student's prediction and the masked …

8
№19
cs.CL arxiv:2604.26622v1

OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory

Jinze Li, Yang Zhang, Xin Yang et al.

OCR-Memory addresses the token-budget limitations of long-horizon agent memory by leveraging the visual modality as a high-density experience representation. The core method involves rendering historical trajectories into annotated images and employing a "locate-and-transcribe" paradigm to retrieve relevant visual cont…

8
№20
cs.CL arxiv:2604.26630v1

SAGE: A Strategy-Aware Graph-Enhanced Generation Framework For Online Counseling

Eliya Naomi Aharon, Meytal Grimland, Avi Segal et al.

SAGE is a novel framework that enhances LLMs for online counseling by integrating structured clinical knowledge. It constructs a heterogeneous graph combining conversational dynamics with psychological theory to inform interventions. This allows SAGE to use a Next Strategy Classifier and Graph-Aware Attention to condit…

8