
AI News Digest: May 15, 2026

Daily roundup of AI and ML news - 8 curated stories on security, research, and industry developments.

Here's your daily roundup of the most relevant AI and ML news for May 15, 2026. Today's digest includes one security-focused story and seven research developments. Click through to read the full articles from our curated sources.

Security & Safety

1. TanStack Supply Chain Attack Hits Two OpenAI Employee Devices, Forces macOS Updates

OpenAI has disclosed that two employee devices in its corporate environment were affected by the Mini Shai-Hulud supply chain attack on TanStack, but noted that no user data, production systems, or intellectual property were compromised or modified in an unauthorized manner. "Upon identif...

Source: The Hacker News (Security) | 3 hours ago

Research & Papers

2. REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

From the abstract (arXiv:2605.12813v1): Large language models (LLMs) achieve strong performance across many tasks but remain vulnerable to hallucinations, motivating the need for realistic adversarial prompts that elicit such failures. We formulate hallucination elicitation as a constr...

Source: arXiv - AI | 10 hours ago

3. Fast Adversarial Attacks with Gradient Prediction

From the abstract (arXiv:2605.14868v1): Generating adversarial examples at scale is a core primitive for robustness evaluation, adversarial training, and red-teaming, yet even "fast" attacks such as FGSM remain throughput-limited by the cost of a backward pass. We introduce a family of a...

Source: arXiv - Machine Learning | 10 hours ago
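
The throughput claim is easier to see against the baseline. Here is a minimal FGSM sketch in PyTorch (model and loss are placeholders, not the paper's setup): each adversarial batch costs one forward and one backward pass, and it is that backward pass the gradient-prediction approach tries to avoid.

```python
import torch

def fgsm_attack(model, loss_fn, x, y, epsilon=8 / 255):
    """Fast Gradient Sign Method: one forward plus one backward pass per batch."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()  # the per-batch backward pass the paper calls the bottleneck
    # Step in the direction of the loss gradient's sign, then clamp to valid pixels.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0, 1).detach()
```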

4. RTLC -- Research, Teach-to-Learn, Critique: A three-stage prompting paradigm inspired by the Feynman Learning Technique that lifts LLM-as-judge accuracy on JudgeBench with no fine-tuning

From the abstract (arXiv:2605.13695v1): LLM-as-a-judge is now the default measurement instrument for open-ended generation, but on the public JudgeBench benchmark even strong instruction-tuned judges barely scrape past random on objective-correctness pairwise items. We introduce RTLC, ...

Source: arXiv - AI | 10 hours ago
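
The snippet cuts off before the method details, so the following is only a guess at the wiring: a three-stage prompt chain matching the stage names in the title, with `complete` standing in for whatever chat-completion client you use.

```python
def rtlc_judge(complete, question, answer_a, answer_b):
    """Hypothetical RTLC-style judge: research the topic, teach it back,
    then critique the two candidate answers. `complete` is any str -> str
    LLM call; the actual prompts in the paper will differ."""
    research = complete(
        f"Research: list the key facts needed to answer:\n{question}"
    )
    lesson = complete(
        f"Teach-to-Learn: using these notes, explain the topic as if to a student:\n{research}"
    )
    return complete(
        "Critique: using the lesson below, decide which answer is objectively correct.\n"
        f"Lesson:\n{lesson}\nQuestion: {question}\n"
        f"Answer A: {answer_a}\nAnswer B: {answer_b}\nReply with 'A' or 'B'."
    )
```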

5. Filter-then-Weight: Online Data Selection and Reweighting for LLM Fine-Tuning

From the abstract (arXiv:2604.00001v2): Gradient-based data selection offers a principled framework for estimating sample utility in large language model (LLM) fine-tuning, but existing methods are mostly designed for offline settings. They are therefore less suited to online f...

Source: arXiv - AI | 10 hours ago
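
The abstract is truncated before the algorithm, but gradient-based selection generally scores each candidate by how well its gradient aligns with a reference (e.g., validation) gradient. A hedged sketch of a generic filter-then-weight step, not the paper's method:

```python
import torch

def grad_vector(model, loss_fn, batch):
    """Flattened gradient of the loss on a single example or micro-batch."""
    model.zero_grad()
    x, y = batch
    loss_fn(model(x), y).backward()
    return torch.cat([p.grad.flatten() for p in model.parameters() if p.grad is not None])

def filter_then_weight(model, loss_fn, candidates, ref_batch, keep=0.5):
    """Keep candidates whose gradients align best with a reference gradient,
    then reweight the survivors by alignment (illustrative, not the paper's rule)."""
    ref = grad_vector(model, loss_fn, ref_batch)
    scores = torch.stack([
        torch.cosine_similarity(grad_vector(model, loss_fn, c), ref, dim=0)
        for c in candidates
    ])
    top = scores.topk(max(1, int(keep * len(candidates)))).indices
    weights = torch.softmax(scores[top], dim=0)  # per-sample loss weights
    return [candidates[i] for i in top], weights
```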

6. Quantifying LLM Safety Degradation Under Repeated Attacks Using Survival Analysis

From the abstract (arXiv:2605.12869v1): Large language models (LLMs) are increasingly deployed in a wide range of applications, yet remain vulnerable to adversarial jailbreak attacks that circumvent their safety guardrails. Existing evaluation frameworks typically report binary success...

Source: arXiv - AI | 10 hours ago
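
Framing jailbreaks as time-to-event data rather than a binary success rate maps directly onto standard survival tooling. A minimal illustration with the `lifelines` library, on made-up numbers (the paper's estimator and data may differ):

```python
from lifelines import KaplanMeierFitter

# Hypothetical data: attack attempts each session withstood before a jailbreak.
# event = 0 marks sessions that never broke (right-censored), information a
# simple binary success rate would discard.
attempts = [3, 7, 2, 15, 15, 4, 9, 15, 6, 11]
jailbroken = [1, 1, 1, 0, 0, 1, 1, 0, 1, 1]

kmf = KaplanMeierFitter()
kmf.fit(durations=attempts, event_observed=jailbroken, label="guardrail survival")
print(kmf.survival_function_)     # P(still safe) after k attempts
print(kmf.median_survival_time_)  # attempts until half the sessions have failed
```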

7. Physics-Grounded Adversarial Stain Augmentation with Calibrated Coverage Guarantees

From the abstract (arXiv:2605.13889v1): Stain variation across hospitals degrades histopathology models at deployment. Existing augmentation methods perturb color spaces with arbitrary hyperparameters, lacking both a principled budget and coverage guarantees for unseen centers. We prop...

Source: arXiv - Machine Learning | 10 hours ago
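
For readers outside pathology: stain augmentation usually perturbs images in a stain-specific color space rather than RGB, and the hyperparameter the paper wants to replace is the size of that perturbation. A generic, non-physics-grounded HED-space sketch using scikit-image:

```python
import numpy as np
from skimage.color import rgb2hed, hed2rgb

def stain_augment(rgb, sigma=0.03, rng=None):
    """Jitter haematoxylin/eosin/DAB channels independently. `sigma` plays the
    role of the perturbation budget the paper derives in a principled way."""
    rng = rng or np.random.default_rng()
    hed = rgb2hed(rgb)                      # RGB -> stain concentration space
    scale = rng.normal(1.0, sigma, size=3)  # one multiplicative jitter per stain
    shift = rng.normal(0.0, sigma, size=3)
    return np.clip(hed2rgb(hed * scale + shift), 0, 1)
```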

8. The Compliance Trap: How Structural Constraints Degrade Frontier AI Metacognition Under Adversarial Pressure

From the abstract (arXiv:2605.02398v2): As frontier AI models are deployed in high-stakes decision pipelines, their ability to maintain metacognitive stability (knowing what they do not know, detecting errors, seeking clarification) under adversarial pressure is a critical safe...

Source: arXiv - Machine Learning | 10 hours ago


About This Digest

This digest is automatically curated from leading AI and tech news sources, filtered for relevance to AI security and the ML ecosystem. Stories are scored and ranked based on their relevance to model security, supply chain safety, and the broader AI landscape.

Want to see how your favorite models score on security? Check our model dashboard for trust scores on the top 500 HuggingFace models.