The Morning
From the arXiv
AEL: Agent Evolving Learning for Open-Ended Environments
The paper introduces Agent Evolving Learning (AEL), a two-timescale framework designed to enable LLM agents to effectively utilize past experience in open-ended environments. AEL employs fast-timescale Thompson Sampling to select the best memory retrieval policy for each episode, while a slow-timescale LLM reflection process diagnoses failures and injects causal insights into the agent's prompt. This structured way of interpreting and applying prior knowledge yields significant gains on sequential tasks.
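The fast-timescale component can be pictured as a bandit over retrieval policies. Here is a minimal sketch assuming Beta-Bernoulli Thompson Sampling and hypothetical policy names; the paper's exact bandit formulation and reward signal may differ.

```python
import random

class PolicySelector:
    """Fast-timescale selector: Thompson Sampling over candidate
    memory-retrieval policies (illustrative sketch, not the paper's code)."""

    def __init__(self, policies):
        # One (successes, failures) Beta(1, 1) prior per retrieval policy.
        self.stats = {p: [1, 1] for p in policies}

    def choose(self):
        # Sample a success probability from each policy's posterior
        # and pick the policy with the highest draw.
        draws = {p: random.betavariate(a, b)
                 for p, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, policy, success):
        # After the episode, credit (or debit) the chosen policy.
        self.stats[policy][0 if success else 1] += 1

# Hypothetical policy names, for illustration only.
selector = PolicySelector(["recency", "semantic", "hybrid"])
policy = selector.choose()            # select a policy for this episode
selector.update(policy, success=True) # feed back the episode outcome
```

Policies that keep succeeding accumulate posterior mass and are sampled more often, while exploration of the others never fully stops.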

Alignment has a Fantasia Problem
The paper identifies "Fantasia interactions" as a core problem in which an AI treats incomplete user prompts as final intent, producing misaligned assistance because users often lack fully formed goals. Its central contribution is the argument that alignment research must shift…
From Research Question to Scientific Workflow: Leveraging Agentic AI for Science Automation
This paper introduces an agentic AI architecture to automate the translation of natural language research questions into executable scientific workflows. It achieves this by separating the process into three layers: an LLM for intent extraction, deterministic …

Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems
This paper introduces **DiffMAS**, a novel training framework that enables the **end-to-end, joint optimization of latent inter-agent communication** alongside multi-agent reasoning. It treats the internal, non-textual communication (like key-value caches) as …
Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models
This paper introduces **Nemobot Games**, an interactive engineering environment that operationalizes Shannon's game taxonomy using Large Language Models (LLMs) to create strategic AI agents. The core method involves leveraging the LLM's reasoning and synthesis…

Process Supervision via Verbal Critique Improves Reasoning in Large Language Models
This paper introduces Verbal Process Supervision (VPS), a training-free method that uses structured natural-language critique from a stronger model to iteratively guide an LLM's re…
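The training-free loop implied by VPS can be sketched as critique-then-revise, repeated until the critic is satisfied. In this sketch, `solver` and `critic` are hypothetical callables standing in for LLM calls; the paper's prompts and stopping criteria may differ.

```python
def verbal_process_supervision(task, solver, critic, max_rounds=3):
    """Iterative refinement: a stronger model's verbal critique guides
    the solver's revisions. No gradient updates are involved."""
    answer = solver(task, feedback=None)
    for _ in range(max_rounds):
        critique = critic(task, answer)           # structured verbal critique
        if critique is None:                      # critic finds no flaw: stop
            break
        answer = solver(task, feedback=critique)  # revise using the critique
    return answer

# Toy stand-ins: the "solver" fixes its arithmetic once it sees feedback.
def toy_solver(task, feedback):
    return "4" if feedback else "5"

def toy_critic(task, answer):
    return "Recheck the addition." if answer != "4" else None

result = verbal_process_supervision("2 + 2 = ?", toy_solver, toy_critic)
```

The loop terminates early as soon as the critique comes back empty, so `max_rounds` only bounds the worst case.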
Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers
This paper introduces **BadStyle**, a novel backdoor attack framework against LLMs that utilizes **natural style-level triggers** instead of explicit patterns. The core method invo…
StructMem: Structured Memory for Long-Horizon Behavior in LLMs
StructMem introduces a structure-enriched hierarchical memory framework for LLMs designed to capture event relationships essential for long-horizon reasoning. It achieves this by t…
Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows
This paper introduces **Tool Attention**, a middleware mechanism that replaces the costly, eager schema injection of the Model Context Protocol (MCP) with a dynamic, gated attentio…
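The gating idea can be sketched as middleware that ranks tools per query and fetches schemas only for the survivors, instead of eagerly injecting every schema into context. The class name, the keyword-overlap scorer (a stand-in for the paper's learned attention), and the example tools below are all assumptions for illustration.

```python
class ToolGate:
    """Middleware sketch: dynamic tool gating plus lazy schema loading."""

    def __init__(self, tools):
        # tools: name -> (description, schema_loader callable)
        self.tools = tools
        self._schema_cache = {}

    def _relevance(self, query, description):
        # Stand-in scorer: keyword overlap instead of learned attention.
        q, d = set(query.lower().split()), set(description.lower().split())
        return len(q & d)

    def select(self, query, top_k=2):
        # Gate: rank all tools by relevance, keep only the top_k.
        ranked = sorted(self.tools,
                        key=lambda n: self._relevance(query, self.tools[n][0]),
                        reverse=True)
        return ranked[:top_k]

    def schema(self, name):
        # Lazy loading: fetch and cache a schema only when the tool
        # survives gating, avoiding eager schema injection for all tools.
        if name not in self._schema_cache:
            self._schema_cache[name] = self.tools[name][1]()
        return self._schema_cache[name]

gate = ToolGate({
    "search": ("search the web for a query", lambda: {"name": "search"}),
    "calc": ("evaluate a math expression", lambda: {"name": "calc"}),
    "email": ("send an email message", lambda: {"name": "email"}),
})
chosen = gate.select("evaluate this math expression", top_k=1)
prompt_schemas = [gate.schema(n) for n in chosen]
```

Only the schemas in `prompt_schemas` would reach the model's context, which is the "MCP/tools tax" the paper targets: context cost scales with relevant tools, not registered ones.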
Transient Turn Injection: Exposing Stateless Multi-Turn Vulnerabilities in Large Language Models
The paper introduces **Transient Turn Injection (TTI)**, a novel multi-turn attack that exploits LLM vulnerabilities by distributing adversarial intent across isolated interactions…
The Town Square
DeepSeek has released documentation for its new DeepSeek v4 model, indicating an update to their large language model series.
Workshops
Hugging Face's `ml-intern` is an open-source ML engineer designed to autonomously read research papers, train models, and deploy complete machine learning solutions.
OSV-Scanner is a Go-based vulnerability scanner that leverages the comprehensive vulnerability data from OSV.dev to identify security issues in software projects.