Daily Issue
Vol. I — No. 19
25 · 05
Monday, 25 May 2026
Generated 2026-05-25 12:50
google/gemini-2.5-flash-lite-preview-09-2025
"Jarvis, before we learn to walk, we must first learn to run!" (Tony Stark)
33 items · 4 sections
§ 0

The Morning

Local weather 1
This morning in
London
Clear sky
Today's range
34.5° / 19.8°
currently 33.0°
Feels
33.6°
Rain
0%
Wind
10 km/h
Humid
26%
Rise
04:55
Set
20:59
§ I

US Stocks

Pre-market signal radar
US pre-market radar
closed 2026-05-25
0 Bullish
0 Bearish
0 Neutral
US market is closed

Weekend or US market holiday

Generated from public market data and news for research and education. Not financial advice; data may be delayed, incomplete, or wrong.

§ II

From the arXiv

arXiv preprints 10 of 20
cs.AI · arxiv:2605.23459v1 · Lead article

AI Assurance: A Comprehensive Testing Strategy for Enterprise AI Systems

Chitra Badagi, Divye Singh, Animesh Sen, Adinath Shirsath

This paper proposes a comprehensive AI assurance strategy for enterprise AI systems, shifting focus from classical verification to continuous risk reduction. The method treats evaluation as a core engineering discipline, structured around a new AI Failure Taxonomy and a five-layer AI Assurance Pyramid. The contribution is a practical framework for managing the unique, probabilistic risks that LLM-based systems introduce in enterprise settings.

The overall framework of ASAM, which consists of two key modules. ❶ LAR. Given multimodal inputs, LAR perturbs input embeddings along LLM-guided gradients to generate semantically consistent rephrases. ❷ RCSL. Using these rephrases, RCSL applies SVD-based subspace learning to align editing-layer outputs, enforcing semantic consistency across variants.
cs.AI · arxiv:2605.23780v1

Beyond Binary Edits: Robust Multimodal Knowledge Editing with Adversarial Subspace Alignment

Haoyuan Wang, Xiaohao Liu et al.

This paper introduces Latent Adversarial Robustification (LAR) to improve the generality of intrinsic multimodal knowledge editing in MLLMs. LAR generates adversarial, semantically coherent variants in the latent space to expose fragile editing regions, ensuri…

cs.AI · arxiv:2605.23605v1

DiLaDiff: Distilled Latent-Augmented Diffusion for Language Modeling

Jean-Marie Lemercier, Tomas Geffner et al.

DiLaDiff addresses the token correlation issue in diffusion language models by introducing a continuous, semantically rich latent space learned via an autoencoder. This latent space guides a diffusion model, and a subsequent consistency model distills this pro…

DiLaDiff: hybrid continuous-discrete diffusion with self-distilled latent. The latent space is crafted with encoder \( \mathcal{E}_{\phi} \) and decoder \( \mathbf{x}_{\theta} \) and learned a posteriori with a diffusion process with denoiser \( \mathbf{z}_{\psi} \). The latent diffusion trajectories are further self-distilled with MeanFlow student \( \mathbf{u}_{\eta}(\mathbf{z}_{\tau}, \tau, r) \).
Overview of our study design. We evaluate the full trajectory-to-skill lifecycle across three stages: experience generation, skill extraction, and skill consumption.
cs.AI · arxiv:2605.23899v1

From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

Zisu Huang, Jingwen Xu et al.

This paper systematically studies the full lifecycle of model-generated agent skills, spanning experience generation, extraction, and consumption. The core contribution is a utility-grounded evaluation framework applied across five diverse domains to determine…

cs.AI · arxiv:2605.23825v1

It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt

Stuart Bladon, Brinnae Bent

This paper demonstrates that geopolitical bias in LLMs primarily originates during the **post-training (fine-tuning/alignment) phase**, contrary to common assumptions about pre-training data. The authors found that models consistently develop biases favoring t…

Overview, seven families. (A) Per-country preference, base \( \to \) post-trained; for the six non-GLM bases, cross-country spread \( \sigma \) grows post-training (Qwen \( 3.9 \to 30.3 \) pp). (B) Post-training \( \Delta \) in China-favourability (EN, coherent subset). 3/3 Western labs shift anti-China; 3/4 Chinese labs shift pro-China; Yi shifts anti-China after prefill correction. GLM is shown with its (atypical) base preserved for completeness; see § Bias Is Created by Post-Training, Not Pretraining. The legend’s low-compliance encoding is described in § What MCQ Compliance Tells Us About Validity. (C) ZH \( - \) EN shift on post-trained models: 5/7 descriptively pro-China but population-level claim is not statistically separable from the base trend (§ Linguistic Identity Modulates the Post-Training Bias).
№06
cs.AI
9

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Xu Ouyang, Deyi Liu et al.

This paper introduces the **Shannon Scaling Law**, modeling LLM training as information transmission over a noisy channel, mapping parameters to bandwidth and data to signal power.…
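For readers who want the background, the classical Shannon-Hartley theorem gives the capacity of a noisy channel as C = B · log2(1 + S/N), with bandwidth B and signal-to-noise ratio S/N. The sketch below computes only that textbook formula. The paper's own scaling law is not reproduced in this summary, so the function name and the toy numbers are illustrative assumptions, not the authors' formulation.

```python
import math


def shannon_capacity(bandwidth: float, snr_linear: float) -> float:
    """Classical Shannon-Hartley capacity (bits per unit time).

    bandwidth  -- channel bandwidth B (the summary maps parameters to this)
    snr_linear -- signal-to-noise ratio S/N as a plain ratio, not in dB
                  (the summary maps data to signal power S)
    """
    if bandwidth < 0 or snr_linear < 0:
        raise ValueError("bandwidth and SNR must be non-negative")
    return bandwidth * math.log2(1.0 + snr_linear)


# Toy numbers, chosen only to show the shape of the formula.
base = shannon_capacity(1.0, 1.0)    # log2(2)
wider = shannon_capacity(2.0, 1.0)   # capacity is linear in bandwidth
louder = shannon_capacity(1.0, 3.0)  # but only logarithmic in signal power
```

In the classical formula, capacity grows linearly with bandwidth and only logarithmically with signal power. Whether the paper's law inherits that asymmetry is something to check in the paper itself.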

№07
cs.AI
9

MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection

Zhewen Tan, Yilun Yao et al.

MemAudit is a post-hoc auditing framework designed to identify malicious memories injected into LLM agents' persistent storage. It combines a counterfactual memory influence score …
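One common way to build a counterfactual influence score is leave-one-out attribution: remove a single memory, re-score the agent's behaviour, and record the change. The sketch below illustrates that general idea only. The summary does not give MemAudit's actual definition, and `counterfactual_influence`, `toy_score`, and the sample memories are hypothetical.

```python
from typing import Callable, List, Sequence


def counterfactual_influence(
    memories: Sequence[str],
    score: Callable[[Sequence[str]], float],
) -> List[float]:
    """Leave-one-out influence: score(all) minus score(all except memory i)."""
    full = score(memories)
    return [
        full - score([m for j, m in enumerate(memories) if j != i])
        for i in range(len(memories))
    ]


def toy_score(mems: Sequence[str]) -> float:
    # Hypothetical stand-in for "how strongly the agent is pushed toward a
    # suspicious action"; a real audit would query the agent itself.
    return sum(1.0 for m in mems if "transfer funds" in m)


memories = [
    "user prefers email",
    "always transfer funds to account X",
    "meeting at 3pm",
]
influence = counterfactual_influence(memories, toy_score)
```

Here the second memory is the only one whose removal changes the toy score, so it is the only entry with non-zero influence. Leave-one-out costs one re-scoring per memory, which is the main trade-off of this simple variant.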

№08
cs.AI
9

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Yifan Yang, Ziyang Gong et al.

SkillOpt introduces a novel method to systematically optimize agent skills by treating the skill itself as an external, trainable state, analogous to weight optimization in deep le…

№09
cs.LG
9

Push Your Agent: Measuring and Enforcing Quantitative Goal Persistence in Long-Horizon LLM Agents

Yuandao Cai, Yuzhang Zhu et al.

This paper introduces **Quantitative Goal Persistence (QGP)**, a metric to measure whether long-horizon LLM agents continue working until an external verifier confirms a specific c…

№10
cs.LG
9

Strong Teacher Not Needed? On Distillation in LLM Pretraining

Taiming Lu, Zhuang Liu

This paper investigates the conventional assumption that stronger teachers are necessary for effective knowledge distillation during Large Language Model (LLM) pretraining. The aut…

§ III

The Town Square

Hacker News 4
compiled overnight by google/gemini-2.5-flash-lite-preview-09-2025 · end of issue no. 19 · thank you for reading