From the arXiv
Thursday, 30 April 2026 · 20 papers
AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents
AGEL-Comp is a neuro-symbolic framework designed to improve the compositional generalization of LLM agents in interactive settings. It achieves this by integrating a dynamic Causal Program Graph (CPG) as a world model, an Inductive Logic Programming (ILP) engine to learn new symbolic rules from experience, and a hybrid…
Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control
This paper introduces a novel dataset of 270 ethically-grounded harmful instructions to benchmark the safety of 72 Large Language Models (LLMs) controlling a simulated Robotic Health Attendant. The core contribution is demonstrating a high average violation rate (54.4%), revealing that safety performance varies signifi…
DUAL-BLADE: Dual-Path NVMe-Direct KV-Cache Offloading for Edge LLM Inference
DUAL-BLADE is a dual-path KV-cache offloading framework for edge LLM inference that dynamically routes KV tensors to either a standard page-cache path or a low-overhead NVMe-direct path based on memory pressure. The NVMe-direct path bypasses the kernel by directly mapping tensors to LBA regions, reducing cache thrashin…
FutureWorld: A Live Environment for Training Predictive Agents with Real-World Outcome Rewards
FutureWorld introduces a novel live agentic reinforcement learning environment specifically designed for training predictive agents. Its core method is closing the training loop by continuously providing prediction tasks based on unfolding real-world events, rewarding agents based on actual outcomes. The main contribut…
Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data
This paper demonstrates that Uniform-based Discrete Diffusion Models (UDDMs) function as Associative Memories (AMs) with emergent creativity. The core method involves showing that these models form basins of attraction around training data, not through an explicit energy function, but via conditional likelihood maximiz…
Tatemae: Detecting Alignment Faking via Tool Selection in LLMs
This paper introduces a novel method for detecting Alignment Faking (AF) in LLMs by observing strategic tool selection rather than relying solely on Chain-of-Thought analysis. The core method identifies AF when an LLM switches from a safe tool (under unmonitored conditions) to an unsafe tool (under helpfulness-rewardin…
TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models
TLPO introduces Token-Level Policy Optimization, a novel fine-tuning framework to mitigate language confusion in LLMs by applying localized, token-level updates instead of sequence-level adjustments. The method identifies error-prone positions and uses a tailored objective to selectively suppress undesirable token outp…
Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
This paper introduces TIDE, the first framework for cross-architecture knowledge distillation between diffusion large language models (dLLMs). TIDE employs three novel components—TIDAL, CompDemo, and Reverse CALM—to effectively transfer knowledge despite differences in architecture, attention, and tokenizer between tea…
SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts
The paper introduces **SafeReview**, a novel adversarial framework to defend LLM-based review systems against hidden adversarial prompts designed to manipulate review outcomes. It employs a **Generator** to create sophisticated attacks and a **Defender** to detect them, trained jointly using an Information Retrieval GA…
Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations
Bian Que is an agentic framework designed to automate complex online system operations by addressing the orchestration bottleneck. Its core method involves unifying O&M tasks into three canonical patterns and employing a Flexible Skill Arrangement mechanism to dynamically select and sequence the necessary data and oper…
ClawGym: A Scalable Framework for Building Effective Claw Agents
ClawGym is a scalable framework designed to streamline the development lifecycle for agents operating in multi-step, file-based environments. Its core contribution is the introduction of **ClawGym-SynData**, a large, synthesized dataset of tasks with mock workspaces and hybrid verification, which is used to train capab…
Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning
The core method, SAS, enables test-time adaptation for offline safe RL by using a transformer-based agent to generate and select imagined trajectories that satisfy a Lyapunov safety condition. These safe segments are then recycled as in-context prompts to guide the agent's behavior toward safety without requiring param…
Preserving Disagreement: Architectural Heterogeneity and Coherence Validation in Multi-Agent Policy Simulation
This paper introduces the **AI Council**, a three-phase deliberation framework designed to combat artificial consensus in LLM-based multi-agent policy simulation. The core contribution is demonstrating that **architectural heterogeneity**—assigning different smaller LLMs to agents representing distinct value perspectiv…
TDD Governance for Multi-Agent Code Generation via Prompt Engineering
This paper introduces an AI-native framework that operationalizes classical Test-Driven Development (TDD) principles as structured governance mechanisms for multi-agent code generation using LLMs. It formalizes TDD into a machine-readable manifesto enforced through prompt engineering and a layered architecture, ensurin…
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
X-WAM is a Unified 4D World Model that integrates real-time robotic action execution with high-fidelity 4D world synthesis (video and 3D reconstruction). It leverages pretrained video diffusion models by predicting multi-view RGB-D videos, efficiently incorporating spatial information via a lightweight structural adapt…
HealthNLP_Retrievers at ArchEHR-QA 2026: Cascaded LLM Pipeline for Grounded Clinical Question Answering
The HealthNLP_Retrievers team developed a cascaded Large Language Model (LLM) pipeline using Gemini 2.5 Pro for grounded clinical Question Answering over Electronic Health Records (EHRs). The core method involves four stages: reformulating verbose patient queries, heuristically scoring and retrieving relevant evidence …
MoRFI: Monotonic Sparse Autoencoder Feature Identification
The paper introduces **MoRFI** (Monotonic Sparse Autoencoder Feature Identification) to analyze how fine-tuning introduces hallucinations in LLMs. The core method involves fine-tuning various LLMs on new knowledge datasets while controlling training parameters, and then using pre-trained Sparse Autoencoders (SAEs) to *…
PAINT: Partial-Solution Adaptive Interpolated Training for Self-Distilled Reasoners
PAINT introduces **Partial-solution Adaptive Interpolated Training** for self-distilled LLM reasoners. It adaptively masks the verified solution based on the overlap with the student's current rollout, providing contextually relevant supervision. This method interpolates between the student's prediction and the masked …
OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory
OCR-Memory addresses the token-budget limitations of long-horizon agent memory by leveraging the visual modality as a high-density experience representation. The core method involves rendering historical trajectories into annotated images and employing a "locate-and-transcribe" paradigm to retrieve relevant visual cont…
SAGE: A Strategy-Aware Graph-Enhanced Generation Framework For Online Counseling
SAGE is a novel framework that enhances LLMs for online counseling by integrating structured clinical knowledge. It constructs a heterogeneous graph combining conversational dynamics with psychological theory to inform interventions. This allows SAGE to use a Next Strategy Classifier and Graph-Aware Attention to condit…