Here's your daily roundup of the most relevant AI and ML news for May 12, 2026. Today's digest includes 1 security-focused story. We're also covering 7 research developments. Click through to read the full articles from our curated sources.
Security & Safety
1. Mini Shai-Hulud Worm Compromises TanStack, Mistral AI, Guardrails AI & More Packages
TeamPCP, the threat actor behind the recent supply chain attack spree, has been linked to the compromise of the npm and PyPI packages from TanStack, UiPath, Mistral AI, OpenSearch, and Guardrails AI as part of a fresh Mini Shai-Hulud campaign. The affected npm packages have been modified to ...
Source: The Hacker News (Security) | 2 hours ago
Research & Papers
2. The Art of the Jailbreak: Formulating Jailbreak Attacks for LLM Security Beyond Binary Scoring
arXiv:2605.09225v1 Announce Type: cross Abstract: Jailbreak attacks -- adversarial prompts that bypass LLM alignment through purely linguistic manipulation -- pose a growing operational security threat, yet the field lacks large-scale, reproducible infrastructure for generating, categorizing, an...
Source: arXiv - Machine Learning | 10 hours ago
3. LLM Wardens: Mitigating Adversarial Persuasion with Third-Party Conversational Oversight
arXiv:2605.08321v1 Announce Type: new Abstract: LLMs are increasingly capable of persuasion, which raises the question of how to protect users against manipulation. In a preregistered user study (N=120) across four decision-making scenarios, we find that an adversarial LLM with a hidden goal suc...
Source: arXiv - Machine Learning | 10 hours ago
4. ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection
arXiv:2604.11790v2 Announce Type: replace-cross Abstract: Tool-augmented Large Language Model (LLM) agents have demonstrated impressive capabilities in automating complex, multi-step real-world tasks, yet remain vulnerable to indirect prompt injection. Adversaries exploit this weakness by embedd...
Source: arXiv - AI | 10 hours ago
5. GAMBIT: A Three-Mode Benchmark for Adversarial Robustness in Multi-Agent LLM Collectives
arXiv:2605.09027v1 Announce Type: cross Abstract: In multi-agent systems (MAS), a single deceptive agent can nullify all gains of an agentic AI collective and evade deployed defenses. However, existing adversarial studies on MAS target only shallow tasks and do not consider adaptive adversaries,...
Source: arXiv - Machine Learning | 10 hours ago
6. Enhancing Adversarial Robustness in Network Intrusion Detection: A Layer-wise Adaptive Regularization Approach
arXiv:2605.08910v1 Announce Type: cross Abstract: The new wave of adversarial attacks that utilize gradient-related vulnerabilities in neural network-based classifiers makes Network Intrusion Detection Systems more open to such threats. Although state-of-the-art adversarial training methods have...
Source: arXiv - Machine Learning | 10 hours ago
7. Preventing Prompt Injection with Type-Directed Privilege Separation
arXiv:2509.25926v2 Announce Type: replace-cross Abstract: Modern language models have enabled the development of agentic systems that achieve strong performance on reasoning-intensive tasks. Unfortunately, this has come with a security cost; these systems are vulnerable to prompt injection, a sp...
Source: arXiv - Machine Learning | 10 hours ago
8. BoostLLM: Boosting-inspired LLM Fine-tuning for Few-shot Tabular Classification
arXiv:2605.06117v2 Announce Type: replace Abstract: Large language models (LLMs) have recently been adapted to tabular prediction by serializing structured features into natural language, but their performance in low-data regimes remains limited compared to gradient-boosted decision trees (GBDTs...
Source: arXiv - Machine Learning | 10 hours ago
About This Digest
This digest is automatically curated from leading AI and tech news sources, filtered for relevance to AI security and the ML ecosystem. Stories are scored and ranked based on their relevance to model security, supply chain safety, and the broader AI landscape.
Want to see how your favorite models score on security? Check our model dashboard for trust scores on the top 500 HuggingFace models.