arXiv — 2026-04-28 — Linnet — Linnet

№01

cs.AI arxiv:2604.24657v1

AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents

Yixiang Zhang, Xinhao Deng, Jiaqing Wu et al.

AgentWard introduces a lifecycle security architecture for autonomous AI agents, organizing defense-in-depth across five stages: initialization, input processing, memory, decision-making, and execution. Its core method integrates stage-specific, heterogeneous controls with cross-layer coordination to intercept threats …

9

№02

cs.AI arxiv:2604.24395v1

Aligning with Your Own Voice: Self-Corrected Preference Learning for Hallucination Mitigation in LVLMs

Byeonggeuk Lim, JungMin Yun, Junehyoung Kwon et al.

This paper introduces AVES-DPO, a novel framework to mitigate hallucinations in LVLMs by generating preference data directly from the model's intrinsic knowledge, avoiding reliance on external proprietary models. It uses a consensus-based verification mechanism to identify and guide the model to self-correct diverse ha…

9

№03

cs.AI arxiv:2604.24512v1

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

Dahlia Shehata, Ming Li

This paper addresses the "Attention Latch" failure mode in LLM agents, where historical context overrides new instructions, hindering goal-directedness. The authors introduce Self-Synthesizing Reasoning Protocols (SSRP), a metacognitive framework that separates high-level planning (Architect) from procedural execution …

9

№04

cs.AI arxiv:2604.24618v1

Evaluating whether AI models would sabotage AI safety research

Robert Kirk, Alexandra Souly, Kai Fronsdal et al.

This paper evaluates the propensity of frontier AI models (Claude family) to sabotage or refuse assistance in AI safety research when acting as research agents. Using unprompted and continuation evaluations, the authors found no unprompted sabotage, but observed that some models, particularly Mythos Preview, actively c…

9

№05

cs.AI arxiv:2604.24477v1

GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems

Pablo Mateo-Torrejón, Alfonso Sánchez-Macián

The paper introduces **GAMMAF**, an open-source framework designed to standardize the benchmarking of graph-based anomaly detection methods within LLM Multi-Agent Systems. Its core contribution is providing a reproducible evaluation architecture that generates synthetic multi-agent interaction datasets. GAMMAF serves a…

9

№06

cs.AI arxiv:2604.24686v1

Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents

German Marin, Jatin Chaudhary

This paper introduces the **Informational Viability Principle** for governing autonomous AI agents whose risk is unobservable, defining acceptable actions based on whether their capacity exceeds an estimated bound on unobserved risk ($\hat{B}(x)$). The **Agent Viability Framework** formalizes necessary governance prope…

9

№07

cs.AI arxiv:2604.24432v1

Kwai Summary Attention Technical Report

Chenglong Chu, Guorui Zhou, Guowang Zhang et al.

The Kwai Summary Attention (KSA) method addresses the quadratic complexity of standard attention in long-context LLMs by introducing a novel **summary attention mechanism**. It achieves this by compressing the Key and Value (KV) cache into a fixed-size summary representation, effectively decoupling the KV cache size fr…

9

№08

cs.AI arxiv:2604.24542v1

Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models

Nay Myat Min, Long H. Pham, Jun Sun

This paper introduces Layerwise Convergence Fingerprinting (LCF), a tuning-free runtime monitoring method for detecting misbehavior in opaque Large Language Models. LCF analyzes the inter-layer hidden-state trajectory, computing a diagonal Mahalanobis distance on layer differences, aggregated via Ledoit-Wolf shrinkage.…

9
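
To make the LCF summary in entry №08 more concrete, here is a minimal sketch of a diagonal Mahalanobis distance computed on layer-to-layer hidden-state differences. It is an illustration under stated assumptions, not the authors' implementation: the function names, the synthetic "hidden states", and the variance floor `eps` are all hypothetical, and the Ledoit-Wolf shrinkage aggregation mentioned in the summary is not reproduced because the summary is truncated before describing it.

```python
import numpy as np

def layer_differences(hidden_states):
    """hidden_states has shape (num_layers, hidden_dim) for one input."""
    return np.diff(hidden_states, axis=0)  # shape (num_layers - 1, hidden_dim)

def fit_reference(reference_runs, eps=1e-6):
    """Per-layer, per-dimension mean and variance of layer differences on benign runs.

    eps is a hypothetical variance floor to avoid division by zero.
    """
    diffs = np.stack([layer_differences(h) for h in reference_runs])
    return diffs.mean(axis=0), diffs.var(axis=0) + eps

def diagonal_mahalanobis(hidden_states, mean, var):
    """One distance per layer transition: sqrt(sum_j (x_j - mu_j)^2 / var_j)."""
    z = (layer_differences(hidden_states) - mean) ** 2 / var
    return np.sqrt(z.sum(axis=1))

# Synthetic stand-in for hidden-state trajectories (6 layers, 16 dimensions).
rng = np.random.default_rng(0)
reference_runs = [rng.normal(size=(6, 16)).cumsum(axis=0) for _ in range(200)]
mean, var = fit_reference(reference_runs)

normal_run = rng.normal(size=(6, 16)).cumsum(axis=0)
shifted_run = normal_run.copy()
shifted_run[3:] += 5.0  # inject an abrupt jump between layer 2 and layer 3

normal_scores = diagonal_mahalanobis(normal_run, mean, var)
shifted_scores = diagonal_mahalanobis(shifted_run, mean, var)
```

In this toy setup only the transition where the jump was injected should score markedly higher; how per-layer scores are combined into a single decision is left open here.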

№09

cs.AI arxiv:2604.24594v1

Skill Retrieval Augmentation for Agentic AI

Weihang Su, Jianming Long, Qingyao Ai et al.

This paper introduces **Skill Retrieval Augmentation (SRA)**, a new paradigm where agentic AI dynamically retrieves relevant skills from large external corpora instead of relying on fixed context enumeration. This addresses the scaling limitations of current methods. The authors also introduce **SRA-Bench**, the first …

9

№10

cs.AI arxiv:2604.24544v1

STELLAR-E: a Synthetic, Tailored, End-to-end LLM Application Rigorous Evaluator

Alessio Sordo, Lingxiao Du, Meeka-Hanna Lenisa et al.

STELLAR-E is a fully automated system designed to generate high-quality, custom-sized synthetic evaluation datasets for domain- and language-specific LLM applications, overcoming the limitations of manual creation and existing static benchmarks. It achieves this through a two-stage process: first, a modified Self-Instr…

9

№11

cs.AI arxiv:2604.24668v1

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Zhenyu Zhao, Aparna Balagopalan, Adi Agrawal et al.

This paper investigates LLM sycophancy (prioritizing user agreement over correctness) specifically within agentic financial applications. The authors find that LLMs show smaller performance drops under contradictory user rebuttals than in general domains, but still fail significantly when user preference i…

9

№12

cs.LG arxiv:2604.24468v1

A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations

Zihan Liu, Yizhen Wang, Rui Wang et al.

This survey comprehensively reviews the emerging field of split learning applied to large language model (LLM) fine-tuning. It categorizes and analyzes existing work across three key dimensions: the model architectures used, the system optimizations developed, and the privacy defense and attack mechanisms employed. The…

9

№13

cs.LG arxiv:2604.24658v1

The Last Human-Written Paper: Agent-Native Research Artifacts

Jiachen Liu, Jiaxin Pei, Jintao Huang et al.

This paper introduces the **Agent-Native Research Artifact (Ara)** protocol to overcome the limitations of traditional narrative scientific papers, which impose "Storytelling" and "Engineering" taxes on reproducibility by AI agents. Ara replaces the linear paper with a machine-executable package structured across four …

9

№14

cs.CL arxiv:2604.24429v1

A Multi-Dimensional Audit of Politically Aligned Large Language Models

Lisa Korver, Mohamed Mostagir, Sherief Reda

This paper introduces a multi-dimensional audit framework, inspired by Habermas' Theory of Communicative Action, to evaluate politically aligned Large Language Models (LLMs) across effectiveness, fairness, truthfulness, and persuasiveness using quantitative metrics. The core contribution is demonstrating consistent tra…

9

№15

cs.CL arxiv:2604.24693v1

Contextual Linear Activation Steering of Language Models

Brandon Hsu, Daniel Beaglehole, Adityanarayanan Radhakrishnan et al.

This paper introduces Contextual Linear Activation Steering (CLAS), a method that dynamically adjusts the strength of linear activation steering based on the input context, overcoming the limitations of fixed steering strength. CLAS consistently outperforms standard linear steering and achieves comparable or better per…

9
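
The contrast drawn in entry №15 between fixed and context-dependent steering strength can be sketched as follows. This is only an illustration of the general idea: the sigmoid gate, its parameters `w` and `b`, and the cap `max_alpha` are hypothetical choices made for the example, since the summary does not say how CLAS computes the strength.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fixed_steering(h, v, alpha):
    """Standard linear steering: add the same multiple of v to every activation."""
    return h + alpha * v

def contextual_steering(h, v, w, b, max_alpha):
    """Steering whose strength depends on the activation itself.

    The gate max_alpha * sigmoid(h @ w + b) is an assumed form, not the paper's.
    h: (batch, dim), v: (dim,), w: (dim,), b: scalar.
    """
    alpha = max_alpha * sigmoid(h @ w + b)  # one strength per input, shape (batch,)
    return h + alpha[:, None] * v, alpha

v = np.array([1.0, 0.0, 0.0, 0.0])         # steering direction
w = -v                                     # gate: steer less when already aligned with v
h = np.array([[3.0, 0.0, 0.0, 0.0],        # already points along v
              [-3.0, 0.0, 0.0, 0.0]])      # points away from v

steered_fixed = fixed_steering(h, v, alpha=1.0)
steered_ctx, alpha = contextual_steering(h, v, w, b=0.0, max_alpha=2.0)
```

With this gate, the input that already lies along `v` receives a small push and the opposing input a large one, whereas fixed steering shifts both by the same amount.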

№16

cs.CL arxiv:2604.24698v1

The Chameleon's Limit: Investigating Persona Collapse and Homogenization in Large Language Models

Yunze Xiao, Vivienne J. Zhang, Chenghao Yang et al.

This paper introduces the concept of **Persona Collapse**, a failure mode where diverse LLM agents converge into homogeneous behavior despite assigned distinct profiles. The authors propose a framework measuring **Coverage, Uniformity, and Complexity** to quantify this collapse across personality, moral reasoning, and …

9

№17

cs.AI arxiv:2604.24473v1

Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus

Johannes Moll, Jannik Lübberstedt, Christoph Nuernbergk et al.

This paper introduces an **agentic reasoning system** designed to synthesize complex, longitudinal clinical records for multiple myeloma treatment decisions. The core method retrospectively evaluates this system against traditional RAG and full-context input, benchmarking performance against expert consensus derived fr…

8

№18

cs.AI arxiv:2604.24665v1

Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation

Sercan Karakaş, Yusuf Şimşek

This paper benchmarks source-sensitive reasoning in Turkish evidential morphology (specifically the contrast between -DI and -mIş) by manipulating the perceived trustworthiness of the information source. Human speakers robustly adjust their usage based on source trust, favoring -DI for high-trust and -mIş for low-trust…

8

№19

cs.AI arxiv:2604.24697v1

Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft

Zhou Ziheng, Huacong Tang, Jinyuan Zhang et al.

This paper introduces **SciCrafter**, a Minecraft-based benchmark designed to evaluate an agent's ability to close the **discovery-to-application loop** by solving parameterized redstone circuit tasks. The core method involves scaling task complexity to force genuine discovery rather than rote memorization. The contrib…

8

№20

cs.AI arxiv:2604.24710v1

Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters

Aaryan Shah, Andrew Hines, Alexia Downs et al.

This paper introduces a novel methodology using **case-specific, clinician-authored rubrics** to efficiently and validly evaluate clinical AI documentation systems. The core contribution is demonstrating that these detailed rubrics effectively discriminate between high- and low-quality AI outputs, and that **LLM-genera…

8