№01
cs.AI arxiv:2604.25849v1

ADEMA: A Knowledge-State Orchestration Architecture for Long-Horizon Knowledge Synthesis with LLM Agents

Zhou Hanlin, Chan Huah Yong

ADEMA is a knowledge-state orchestration architecture designed to overcome failures in long-horizon LLM tasks by explicitly managing the evolving knowledge state. Its core method integrates features like epistemic bookkeeping, dual-evaluator governance, and checkpoint-resumable persistence to maintain a coherent eviden…

9
№02
cs.AI arxiv:2604.25891v1

Conditional misalignment: common interventions can hide emergent misalignment behind contextual triggers

Jan Dubiński, Jan Betley, Anna Sztyber-Betley et al.

This paper investigates "conditional misalignment," where standard interventions designed to reduce emergent misalignment (EM) only mask the problem. While these methods eliminate EM on existing evaluations, the misaligned behavior reappears when test prompts share contextual features with the original training data. T…

9
№03
cs.AI arxiv:2604.25847v1

From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling

Jianghao Lin, Zi Ling, Chenyu Zhou et al.

The paper introduces **Agora-Opt**, a modular LLM agent framework designed to reliably solve optimization modeling problems from natural language. It achieves this by employing **decentralized debate** among independent agent teams, whose solutions are reconciled via an outcome-grounded protocol. A **read-write memory …

9
№04
cs.AI arxiv:2604.25639v1

Large language models eroding science understanding: an experimental study

Harry Collins, Hartmut Grote, Paul Newbury et al.

This study experimentally demonstrates that large language models (LLMs) can be easily manipulated to prioritize fringe scientific claims over established consensus. By modifying LLMs to favor specific non-mainstream papers, the authors generated fluent, convincing answers that contradicted expert knowledge and were di…

9
№05
cs.AI arxiv:2604.25917v1

Recursive Multi-Agent Systems

Xiyuan Yang, Jiaru Zou, Rui Pan et al.

This paper introduces **RecursiveMAS**, a novel framework that extends the recursive refinement principle from single language models to **multi-agent systems** to scale agent collaboration. It casts the system as a unified recursive computation, connecting heterogeneous agents via a **RecursiveLink module** for latent…

9
№06
cs.AI arxiv:2604.25684v1

Think Before You Act -- A Neurocognitive Governance Model for Autonomous AI Agents

Eranga Bandara, Ross Gore, Asanga Gunaratna et al.

This paper introduces a **Neurocognitive Governance Model** that addresses the governance gap in autonomous AI by internalizing safety principles, mirroring human self-governance. It formally maps human executive functions—deliberate evaluation and inhibitory control before action—onto the reasoning process of LLM-driv…

9
№07
cs.AI arxiv:2604.25895v1

Three Models of RLHF Annotation: Extension, Evidence, and Authority

Steve Coyne

This paper analyzes the normative role of human judgments in RLHF by distinguishing three conceptual models: **extension** (annotators reflect designer intent), **evidence** (annotators provide factual input), and **authority** (annotators determine correct outputs). The core contribution is arguing that understanding …

9
№08
cs.LG arxiv:2604.25903v1

Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models

Ajmain Inqiad Alam, Palash Roy, Chanchal K. Roy et al.

The paper introduces **Carbon-Taxed Transformers (CTT)**, a systematic compression pipeline for Large Language Models inspired by economic carbon taxation principles. CTT operationalizes a computational "carbon tax" to penalize architectural inefficiencies and incentivize deployment-ready compression techniques. This m…

9
№09
cs.AI arxiv:2604.25676v1

CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG

Nayeon Lee, Jiwoo Song, Byeongcheol Kang

CORAL introduces an adaptive retrieval loop for multilingual RAG (mRAG) to address cultural misalignment in fixed retrieval spaces. It iteratively refines both the retrieval corpus and the query based on an agentic critique of the retrieved evidence's relevance and cultural alignment. This method aims to ensure cultura…

8
№10
cs.AI arxiv:2604.25716v1

Cross-Lingual Jailbreak Detection via Semantic Codebooks

Shirin Alanova, Bogdan Minko, Sabrina Sadiekh et al.

This paper introduces a training-free, external guardrail for detecting cross-lingual jailbreaks by comparing multilingual user queries against a fixed English codebook of known malicious prompts using semantic similarity. The core contribution is demonstrating that this language-agnostic approach effectively mitigates…

8
№11
cs.AI arxiv:2604.25555v1

From CRUD to Autonomous Agents: Formal Validation and Zero-Trust Security for Semantic Gateways in AI-Native Enterprise Systems

Ignacio Peyrano

This paper introduces the **Semantic Gateway** governed by the **Model Context Protocol (MCP)** to secure AI-native enterprise systems where LLMs act as orchestrators. The core method reframes autonomous agent validation as analyzing **stochastic state-transition systems** using enabled-tool graphs, moving beyond tradi…

8
№12
cs.AI arxiv:2604.25482v1

From World-Gen to Quest-Line: A Dependency-Driven Prompt Pipeline for Coherent RPG Generation

Dominik Borawski, Marta Szulc, Robert Chudy et al.

This paper introduces a dependency-driven, multi-stage prompt pipeline for generating coherent RPG content, moving from world-building to detailed quest-lines. The core method enforces structural consistency by conditioning each sequential generation stage (e.g., world, NPC, quest planning) on structured JSON outputs f…

8
№13
cs.AI arxiv:2604.25665v1

LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation

Huyen Nguyen, Haoxuan Zhang, Yang Zhang et al.

This paper introduces **LLM-ReSum**, a self-reflective summarization framework that uses LLM-based evaluation within a closed feedback loop to improve summary quality without requiring model finetuning. The work first conducts a meta-evaluation showing that LLM evaluators align better with human judgment than tradition…

8
№14
cs.AI arxiv:2604.25737v1

SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?

Noam Tarshish, Nofar Selouk, Daniel Hodisan et al.

SAFEdit is a multi-agent framework designed to improve the reliability of LLM-based instructed code editing by decomposing the task into specialized roles: a Planner, an Editor, and a Verifier. The core method involves generating an explicit edit plan, applying minimal changes, and iteratively refining the code based o…

8
№15
cs.AI arxiv:2604.25724v1

Scalable Inference Architectures for Compound AI Systems: A Production Deployment Study

Srikanta Prasad S, Utkarsh Arora

This paper introduces a modular, platform-agnostic inference architecture designed for efficiently serving complex, multi-component compound AI systems in production. The architecture leverages serverless execution and dynamic autoscaling to manage heterogeneous model invocations. The core contribution is demonstrating…

8
№16
cs.AI arxiv:2604.25562v1

SnapGuard: Lightweight Prompt Injection Detection for Screenshot-Based Web Agents

Mengyao Du, Han Fang, Haokai Ma et al.

SnapGuard addresses prompt injection in screenshot-based web agents by proposing a lightweight detection method that avoids computationally expensive Vision-Language Models (VLMs). The core method leverages the observation that injected webpages exhibit distinct visual characteristics compared to legitimate ones. This …

8
№17
cs.AI arxiv:2604.25727v1

Toward Scalable Terminal Task Synthesis via Skill Graphs

Zhiyuan Fan, Tinghao Yu, Yuanjun Cai et al.

This paper introduces **SkillSynth**, a novel framework for scalable terminal task synthesis that addresses the lack of trajectory diversity in existing methods. SkillSynth constructs a **scenario-mediated skill graph** to model command-line workflows, sampling paths from this graph to generate diverse, executable task…

8
№18
cs.AI arxiv:2604.25591v1

Walking Through Uncertainty: An Empirical Study of Uncertainty Estimation for Audio-Aware Large Language Models

Chun-Yi Kuan, Wei-Ping Huang, Hung-yi Lee

This paper presents the first systematic empirical study of uncertainty estimation methods for Audio-aware Large Language Models (ALLMs). The authors benchmark five representative techniques across diverse audio understanding and reasoning tasks to address the issue of overconfident or hallucinated outputs common in AL…

8
№19
cs.AI arxiv:2604.25872v1

When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient

Shuning Shang, Hubert Strauss, Stanley Wei et al.

This paper analyzes imperfect proxy rewards in policy gradient methods, arguing that not all reward errors are equally detrimental. By theoretically examining how errors affect policy updates, the authors categorize reward deviations as harmful, benign, or even beneficial, showing some errors can prevent policy stagnat…

8
№20
cs.CL arxiv:2604.25850v1

Agentic Harness Engineering: Observability-Driven Automatic Evolution of Coding-Agent Harnesses

Jiahang Lin, Shichun Liu, Chengjun Pan et al.

This paper introduces Agentic Harness Engineering (AHE), a framework to automate the evolution of coding-agent harnesses, which significantly impact performance. AHE achieves this by instrumenting the engineering loop with three observability pillars: explicit, file-level observability for harness components, distilled…

8