Here's your daily roundup of the most relevant AI and ML news for February 27, 2026. Today's digest includes 1 security-focused story. We're also covering 7 research developments. Click through to read the full articles from our curated sources.
Security & Safety
1. Claude Code Flaws Allow Remote Code Execution and API Key Exfiltration
Cybersecurity researchers have disclosed multiple security vulnerabilities in Anthropic's Claude Code, an artificial intelligence (AI)-powered coding assistant, that could result in remote code execution and theft of API credentials. "The vulnerabilities exploit various configuration mechanisms, ...
Source: The Hacker News (Security) | 1 day ago
Research & Papers
2. Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace
arXiv:2602.22450v1 Announce Type: cross Abstract: Agentic large language model systems increasingly automate tasks by retrieving URLs and calling external tools. We show that this workflow gives rise to implicit prompt injection: adversarial instructions embedded in automatically generated URL p...
Source: arXiv - AI | 9 hours ago
3. Analysis of LLMs Against Prompt Injection and Jailbreak Attacks
arXiv:2602.22242v1 Announce Type: cross Abstract: Large Language Models (LLMs) are widely deployed in real-world systems. Given their broader applicability, prompt engineering has become an efficient tool for resource-scarce organizations to adopt LLMs for their own purposes. At the same time, L...
Source: arXiv - AI | 9 hours ago
4. AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification
arXiv:2602.22724v1 Announce Type: cross Abstract: Large language model (LLM) agents increasingly rely on external tools and retrieval systems to autonomously complete complex tasks. However, this design exposes agents to indirect prompt injection (IPI), where attacker-controlled context embedded...
Source: arXiv - AI | 9 hours ago
5. BioBlue: Systematic runaway-optimiser-like LLM failure modes on biologically and economically aligned AI safety benchmarks for LLMs with simplified observation format
arXiv:2509.02655v2 Announce Type: replace-cross Abstract: Many AI alignment discussions of "runaway optimisation" focus on RL agents: unbounded utility maximisers that over-optimise a proxy objective (e.g., "paperclip maximiser", specification gaming) at the expense of everything else. LLM-based...
Source: arXiv - AI | 9 hours ago
6. To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning
arXiv:2602.22227v1 Announce Type: new Abstract: Despite their impressive capabilities, Multimodal Large Language Models (MLLMs) exhibit perceptual fragility when confronted with visually complex scenes. This weakness stems from a reliance on finite training datasets, which are prohibitively expe...
Source: arXiv - Machine Learning | 9 hours ago
7. Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search
arXiv:2602.22983v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly used, their security risks have drawn increasing attention. Existing research reveals that LLMs are highly susceptible to jailbreak attacks, with effectiveness varying across language contexts. This ...
Source: arXiv - AI | 9 hours ago
8. Devling into Adversarial Transferability on Image Classification: Review, Benchmark, and Evaluation
arXiv:2602.23117v1 Announce Type: cross Abstract: Adversarial transferability refers to the capacity of adversarial examples generated on the surrogate model to deceive alternate, unexposed victim models. This property eliminates the need for direct access to the victim model during an attack, t...
Source: arXiv - AI | 9 hours ago
About This Digest
This digest is automatically curated from leading AI and tech news sources, filtered for relevance to AI security and the ML ecosystem. Stories are scored and ranked based on their relevance to model security, supply chain safety, and the broader AI landscape.
Want to see how your favorite models score on security? Check our model dashboard for trust scores on the top 500 HuggingFace models.