№01
cs.AI arxiv:2604.21725v1

AEL: Agent Evolving Learning for Open-Ended Environments

Wujiang Xu, Jiaojiao Han, Minghao Guo et al.

The paper introduces Agent Evolving Learning (AEL), a two-timescale framework that lets LLM agents make effective use of past experience in open-ended environments. On the fast timescale, AEL uses Thompson Sampling to select a memory retrieval policy for each episode, while a slow-timescale LLM reflectio…

9
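
A minimal sketch of the fast-timescale idea in the AEL summary above: Thompson Sampling over a small set of memory retrieval policies, with a binary per-episode outcome. The policy names, the Beta-Bernoulli reward model, and the simulated success rates are assumptions made for illustration. The paper's actual reward signal and policy set are not given in the summary, and the slow-timescale reflection step is not modeled here.

```python
import random


class ThompsonPolicySelector:
    """Beta-Bernoulli Thompson Sampling over a fixed set of retrieval policies."""

    def __init__(self, policies, seed=0):
        self.policies = list(policies)
        # One Beta(1, 1) prior per policy, stored as [alpha, beta].
        self.params = {p: [1.0, 1.0] for p in self.policies}
        self.rng = random.Random(seed)

    def select(self):
        # Draw one sample from each posterior and pick the largest draw.
        draws = {p: self.rng.betavariate(a, b) for p, (a, b) in self.params.items()}
        return max(draws, key=draws.get)

    def update(self, policy, success):
        # A successful episode increments alpha, a failed one increments beta.
        self.params[policy][0 if success else 1] += 1.0


def simulate(true_rates, episodes=2000, seed=0):
    """Run the selector against hypothetical per-policy success rates."""
    selector = ThompsonPolicySelector(true_rates, seed=seed)
    env = random.Random(seed + 1)
    counts = {p: 0 for p in true_rates}
    for _ in range(episodes):
        policy = selector.select()
        counts[policy] += 1
        selector.update(policy, env.random() < true_rates[policy])
    return counts


if __name__ == "__main__":
    # Hypothetical policies and success rates, chosen only for the demo.
    print(simulate({"no_memory": 0.2, "recent": 0.3, "similar": 0.7}))
```

Over many episodes the selector concentrates its choices on the policy with the highest simulated success rate, while still occasionally sampling the others.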
№02
cs.AI arxiv:2604.21827v1

Alignment has a Fantasia Problem

Nathanael Jo, Zoe De Simone, Mitchell Gordon et al.

The paper identifies "Fantasia interactions" as a core alignment problem: AI systems treat incomplete user prompts as final intent, which leads to misaligned assistance because users often lack fully formed goals. It argues that alignment research must shift from treating users as rational oracles to actively providi…

9
№03
cs.AI arxiv:2604.21910v1

From Research Question to Scientific Workflow: Leveraging Agentic AI for Science Automation

Bartosz Balis, Michal Orzechowski, Piotr Kica et al.

This paper introduces an agentic AI architecture to automate the translation of natural language research questions into executable scientific workflows. It achieves this by separating the process into three layers: an LLM for intent extraction, deterministic generators for creating workflow DAGs, and expert-authored "…

9
№04
cs.AI arxiv:2604.21794v1

Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems

Ye Yu, Heming Liu, Haibo Jin et al.

This paper introduces **DiffMAS**, a novel training framework that enables the **end-to-end, joint optimization of latent inter-agent communication** alongside multi-agent reasoning. It treats the internal, non-textual communication (like key-value caches) as a learnable component, optimizing how information is encoded…

9
№05
cs.AI arxiv:2604.21896v1

Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models

Chee Wei Tan, Yuchen Wang, Shangxin Guo

This paper introduces **Nemobot Games**, an interactive engineering environment that operationalizes Shannon's game taxonomy using Large Language Models (LLMs) to create strategic AI agents. The core method involves leveraging the LLM's reasoning and synthesis capabilities to generate optimal or heuristic strategies ta…

9
№06
cs.AI arxiv:2604.21611v1

Process Supervision via Verbal Critique Improves Reasoning in Large Language Models

Hao-Yuan Chen

This paper introduces Verbal Process Supervision (VPS), a training-free method that uses structured natural-language critique from a stronger model to iteratively guide an LLM's reasoning process. VPS establishes a new axis for inference-time scaling by focusing on the granularity of external verbal supervision. This a…

9
№07
cs.AI arxiv:2604.21700v1

Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers

Jiali Wei, Ming Fan, Guoheng Sun et al.

This paper introduces **BadStyle**, a novel backdoor attack framework against LLMs that utilizes **natural style-level triggers** instead of explicit patterns. The core method involves using an LLM to generate stealthy poisoned samples with these style triggers while maintaining semantic fluency. BadStyle's contributio…

9
№08
cs.AI arxiv:2604.21748v1

StructMem: Structured Memory for Long-Horizon Behavior in LLMs

Buqiang Xu, Yijun Chen, Jizhan Fang et al.

StructMem introduces a structure-enriched hierarchical memory framework for LLMs designed to capture event relationships essential for long-horizon reasoning. It achieves this by temporally anchoring dual perspectives and performing semantic consolidation, which preserves event bindings and induces cross-event connecti…

9
№09
cs.AI arxiv:2604.21816v1

Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows

Anuj Sadani, Deepak Kumar

This paper introduces **Tool Attention**, a middleware mechanism that replaces the costly, eager schema injection of the Model Context Protocol (MCP) with a dynamic, gated attention system over available tools. It uses an Intent Schema Overlap (ISO) score and state-aware gating to select only necessary tool schemas, si…

9
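
A rough sketch of the gating and lazy-loading pattern described in the Tool Attention summary above. The summary does not define the Intent Schema Overlap (ISO) score, so the Jaccard token overlap used here is a stand-in, not the paper's formula. The class name, threshold, `top_k`, and the schema loader are likewise hypothetical, and state-aware gating is left out.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(intent, description):
    """Jaccard token overlap (an assumed stand-in for the paper's ISO score)."""
    a, b = tokens(intent), tokens(description)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class LazyToolRegistry:
    """Keeps short tool summaries resident and loads full schemas on demand."""

    def __init__(self, summaries, schema_loader):
        self.summaries = summaries          # tool name -> short description
        self.schema_loader = schema_loader  # tool name -> full schema (costly)
        self.loaded = {}                    # cache of schemas loaded so far

    def gate(self, intent, threshold=0.1, top_k=2):
        # Rank tools by overlap with the intent, keep the best few above threshold.
        scored = sorted(
            ((overlap_score(intent, desc), name) for name, desc in self.summaries.items()),
            reverse=True,
        )
        selected = [name for score, name in scored[:top_k] if score >= threshold]
        for name in selected:
            if name not in self.loaded:
                # The full schema is fetched only for tools that pass the gate.
                self.loaded[name] = self.schema_loader(name)
        return {name: self.loaded[name] for name in selected}
```

Only the schemas returned by `gate` would be placed in the model's context for that turn, and tools that never pass the gate never have their schemas loaded at all.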
№10
cs.AI arxiv:2604.21860v1

Transient Turn Injection: Exposing Stateless Multi-Turn Vulnerabilities in Large Language Models

Naheed Rayhan, Sohely Jahan

The paper introduces **Transient Turn Injection (TTI)**, a novel multi-turn attack that exploits LLM vulnerabilities by distributing adversarial intent across isolated interactions, bypassing stateless moderation. TTI utilizes automated LLM agents to iteratively probe and evade policy enforcement, unlike traditional co…

9
№11
cs.LG arxiv:2604.21905v1

Low-Rank Adaptation Redux for Large Models

Bingcong Li, Yilang Zhang, Georgios B. Giannakis

This paper re-examines Low-Rank Adaptation (LoRA) by framing it through the lens of signal processing (SP) and classical low-rank modeling. The core contribution is providing a principled, theoretical understanding of the mechanisms behind LoRA and its variants, rather than just empirical comparison. This SP perspectiv…

9
№12
cs.CL arxiv:2604.21590v1

AgenticQwen: Training Small Agentic Language Models with Dual Data Flywheels for Industrial-Scale Tool Use

Yuanjie Lyu, Chengyu Wang, Haonan Zheng et al.

This paper introduces **AgenticQwen**, a family of small language models optimized for industrial-scale tool use and multi-step reasoning. The core method involves training these models using a novel framework combining reasoning and agentic Reinforcement Learning (RL) powered by **dual data flywheels**. These flywheel…

9
№13
cs.CL arxiv:2604.21564v1

Measuring Opinion Bias and Sycophancy via LLM-based Coercion

Rodrigo Nogueira, Giovana Kerche Bonás, Thales Sales Almeida et al.

This paper introduces **llm-bias-bench**, an open-source method to uncover the true opinions of Large Language Models (LLMs) on contested topics, overcoming their evasive disclaimers. The method uses two complementary, multi-turn, free-form probing strategies: **Direct Probing** (escalating pressure) and **Indirect Pro…

9
№14
cs.CL arxiv:2604.21882v1

Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms

Yuto Nishida, Naoki Shikoda, Yosuke Kishinami et al.

This paper introduces **RedirectQA**, a novel dataset that uses Wikipedia redirects to associate factual triples with multiple, categorized surface forms (aliases, variants, errors) for each entity. The core method analyzes how LLMs' factual recall changes when only the entity's surface form is altered, revealing that …

9
№15
cs.AI arxiv:2604.21579v1

A Metamorphic Testing Approach to Diagnosing Memorization in LLM-Based Program Repair

Milan De Koning, Ali Asgari, Pouria Derakhshanfar et al.

This paper introduces a metamorphic testing (MT) approach combined with negative log-likelihood (NLL) to diagnose data leakage (memorization) in LLM-based program repair. By applying semantics-preserving transformations to create variant benchmarks, the authors reveal substantial drops in repair success rates across se…

8
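
To make the metamorphic-testing idea in the summary above concrete, here is a minimal sketch of one semantics-preserving transformation, identifier renaming, applied to a Python function. The choice of transformation, the helper names, and the toy function are assumptions for illustration, since the summary does not list the paper's transformations or its target language. The negative log-likelihood (NLL) part of the method is not shown. `ast.unparse` requires Python 3.9 or later.

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Renames variables and function parameters according to a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return self.generic_visit(node)


def rename_variant(source, mapping):
    """Return a variant of `source` with identifiers renamed."""
    tree = RenameIdentifiers(mapping).visit(ast.parse(source))
    return ast.unparse(tree)


def behaves_the_same(original, variant, func_name, inputs):
    """Check that both versions of `func_name` agree on the given inputs."""
    ns_a, ns_b = {}, {}
    exec(original, ns_a)
    exec(variant, ns_b)
    return all(ns_a[func_name](*args) == ns_b[func_name](*args) for args in inputs)


if __name__ == "__main__":
    src = (
        "def total(values):\n"
        "    acc = 0\n"
        "    for v in values:\n"
        "        acc += v\n"
        "    return acc\n"
    )
    variant = rename_variant(src, {"values": "xs", "acc": "s", "v": "item"})
    print(variant)
    print(behaves_the_same(src, variant, "total", [([1, 2, 3],), ([],)]))
```

A repair model that solves the original benchmark item but fails on such a variant gives a hint that its success may rest on memorized surface form. This sketch does not handle keyword arguments at call sites or name collisions, both of which a real transformation would need to check.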
№16
cs.AI arxiv:2604.21584v1

CoFEE: Reasoning Control for LLM-Based Feature Discovery

Maximilian Westermann, Ben Griffin, Aaron Ontoyin Yin et al.

CoFEE is a reasoning control framework designed to improve feature discovery from unstructured data using Large Language Models (LLMs). It enforces specific "cognitive behaviors" during the LLM's reasoning process, which act as structured inductive biases. This method aims to generate higher-quality, predictive feature…

8
№17
cs.AI arxiv:2604.21598v1

DryRUN: On the Role of Public Tests in LLM-Driven Code Generation

Kaushitha Silva, Srinath Perera

DryRUN addresses the bottleneck of relying on human-provided public tests in LLM-driven code generation by proposing a method that operates without them. The core contribution is demonstrating that LLM agents can effectively debug and refine code using only *internal* execution feedback, mitigating the "overconfidence …

8
№18
cs.AI arxiv:2604.21536v1

Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation

Nikita Severin, Danil Kartushov, Vladislav Urzhumov et al.

This paper introduces a novel knowledge distillation method to integrate rich user semantics from pre-trained LLMs into sequential recommenders. The core method distills LLM-generated textual user profiles into the recommender model, enabling it to capture deeper user understanding. The key contribution is achieving th…

8
№19
cs.CL arxiv:2604.21716v1

From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation

Minh Duc Bui, Xenia Heilmann, Mattia Cerrato et al.

This paper shifts bias evaluation in code generation from simple if-statements to the more realistic task of generating machine learning pipelines. The core contribution is demonstrating that this pipeline-based approach reveals significantly higher and more subtle bias, finding sensitive attributes in 87.7% of generat…

8
№20
cs.CL arxiv:2604.21871v1

Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions

Jiseon Kim, Jea Kwon, Luiz Felipe Vecchietti et al.

This paper investigates how LLMs handle relational nuances in moral dilemmas, specifically the Whistleblower's Dilemma, by varying crime severity and relational closeness. The core finding is a divergence: models judge moral rightness on the basis of fairness, but predict that human behavior shifts toward loyalty with increased cl…

8