Here's your daily roundup of the most relevant AI and ML news for May 14, 2026. This edition covers 8 research developments. Click through to read the full articles from our curated sources.
Research & Papers
1. REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations
arXiv:2605.12813v1 Announce Type: cross Abstract: Large language models (LLMs) achieve strong performance across many tasks but remain vulnerable to hallucinations, motivating the need for realistic adversarial prompts that elicit such failures. We formulate hallucination elicitation as a constr...
Source: arXiv - AI | 10 hours ago
2. GAMBIT: A Three-Mode Benchmark for Adversarial Robustness in Multi-Agent LLM Collectives
arXiv:2605.09027v2 Announce Type: replace-cross Abstract: In multi-agent systems (MAS), a single deceptive agent can nullify all gains of an agentic AI collective and evade deployed defenses. However, existing adversarial studies on MAS target only shallow tasks and do not consider adaptive adve...
Source: arXiv - Machine Learning | 10 hours ago
3. RTLC -- Research, Teach-to-Learn, Critique: A three-stage prompting paradigm inspired by the Feynman Learning Technique that lifts LLM-as-judge accuracy on JudgeBench with no fine-tuning
arXiv:2605.13695v1 Announce Type: cross Abstract: LLM-as-a-judge is now the default measurement instrument for open-ended generation, but on the public JudgeBench benchmark even strong instruction-tuned judges barely scrape past random on objective-correctness pairwise items. We introduce RTLC, ...
Source: arXiv - AI | 10 hours ago
4. Filter-then-Weight: Online Data Selection and Reweighting for LLM Fine-Tuning
arXiv:2604.00001v2 Announce Type: replace-cross Abstract: Gradient-based data selection offers a principled framework for estimating sample utility in large language model (LLM) fine-tuning, but existing methods are mostly designed for offline settings. They are therefore less suited to online f...
Source: arXiv - AI | 10 hours ago
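The paper's algorithm isn't spelled out in the snippet above, but the general idea behind gradient-based data selection can be sketched generically: score each candidate sample by how well its gradient aligns with a reference gradient from a target set, filter to the top-k, then reweight the survivors. Everything below (function names, the cosine-similarity scoring, the toy gradients) is an illustrative assumption, not the authors' code.

```python
import numpy as np

# Generic filter-then-weight sketch (illustrative, not the paper's method):
# score each sample by cosine similarity between its gradient and a
# reference gradient, keep the top-k, and normalize scores into weights.

def select_and_weight(sample_grads, target_grad, k):
    target = target_grad / np.linalg.norm(target_grad)
    norms = np.linalg.norm(sample_grads, axis=1, keepdims=True)
    scores = (sample_grads / norms) @ target      # cosine similarity per sample
    keep = np.argsort(scores)[-k:]                # filter: top-k by score
    weights = np.clip(scores[keep], 0, None)      # drop negative alignment
    weights = weights / weights.sum()             # reweight kept samples
    return keep, weights

# Hypothetical per-sample gradients in a 3-d parameter space.
grads = np.array([[1.0, 0, 0], [0, 1, 0], [1, 1, 0], [-1, 0, 0]])
keep, w = select_and_weight(grads, np.array([1.0, 0, 0]), k=2)
```

In an online setting, the reference gradient would be refreshed as training progresses, which is what distinguishes this family of methods from offline selection.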
5. Quantifying LLM Safety Degradation Under Repeated Attacks Using Survival Analysis
arXiv:2605.12869v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in a wide range of applications, yet remain vulnerable to adversarial jailbreak attacks that circumvent their safety guardrails. Existing evaluation frameworks typically report binary success...
Source: arXiv - AI | 10 hours ago
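To make the survival-analysis framing concrete: instead of a binary success rate, each conversation is treated as a "subject" and the attack attempt at which the guardrail first fails as the "event", so a survival curve S(t) gives the probability a model is still safe after t attacks. A minimal Kaplan-Meier estimator illustrates the idea; the function and the sample data are generic assumptions, not the paper's implementation.

```python
# Minimal Kaplan-Meier estimator (generic sketch, not the paper's code).
# durations[i]: number of attack attempts observed for conversation i.
# events[i]:   True if the guardrail failed at that attempt,
#              False if no failure was observed (censored).

def kaplan_meier(durations, events):
    survival = 1.0
    curve = {}
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        failed = sum(1 for d, e in zip(durations, events) if d == t and e)
        if at_risk > 0:
            survival *= 1 - failed / at_risk
        curve[t] = survival
    return curve

# Hypothetical data: 6 conversations; failures at attempts 2, 3, 3,
# the remaining three censored at the attack budget of 5 attempts.
curve = kaplan_meier([2, 3, 3, 5, 5, 5], [True, True, True, False, False, False])
```

Censoring is what binary success metrics throw away: a conversation that survives the whole attack budget still contributes information about how long the guardrail held.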
6. AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions
arXiv:2408.12935v4 Announce Type: replace Abstract: AI Safety is an emerging area of critical importance to the safe adoption and deployment of AI systems. With the rapid proliferation of AI and especially with the recent advancement of Generative AI (or GAI), the technology ecosystem behind the...
Source: arXiv - AI | 10 hours ago
7. Metis: Learning to Jailbreak LLMs via Self-Evolving Metacognitive Policy Optimization
arXiv:2605.10067v2 Announce Type: replace-cross Abstract: Red teaming is critical for uncovering vulnerabilities in Large Language Models (LLMs). While automated methods have improved scalability, existing approaches often rely on static heuristics or stochastic search, rendering them brittle ag...
Source: arXiv - AI | 10 hours ago
8. Taming the Long Tail: Rebalancing Adversarial Training via Adaptive Perturbation
arXiv:2605.13395v1 Announce Type: new Abstract: Deep neural networks are highly vulnerable to adversarial examples, i.e., small perturbations that can significantly degrade model performance. While adversarial training has become the primary defense strategy, most studies focus on balanced datase...
Source: arXiv - Machine Learning | 10 hours ago
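One simple way the "adaptive perturbation" idea can play out under class imbalance: scale the adversarial budget per class by inverse class frequency, so tail classes receive larger perturbations during training. The sketch below is a generic illustration of that rebalancing heuristic plus a standard FGSM step; it is an assumption about the general technique, not the authors' method.

```python
import numpy as np

# Generic sketch (not the paper's method): give rare classes a larger
# adversarial budget by scaling a base epsilon with inverse class frequency.

def adaptive_epsilons(labels, base_eps=0.1):
    """Per-class perturbation budgets proportional to inverse frequency."""
    classes, counts = np.unique(labels, return_counts=True)
    inv = counts.sum() / counts           # inverse-frequency weights
    inv = inv / inv.mean()                # normalize so the average budget is base_eps
    return dict(zip(classes.tolist(), (base_eps * inv).tolist()))

def fgsm_perturb(x, grad, eps):
    """Standard Fast Gradient Sign Method step with budget eps."""
    return x + eps * np.sign(grad)

# Hypothetical imbalanced labels: class 0 is the head, class 1 the tail.
labels = np.array([0] * 90 + [1] * 10)
eps = adaptive_epsilons(labels)
```

With this split, the tail class ends up with a budget nine times larger than the head class, which is the rebalancing effect the title alludes to.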
About This Digest
This digest is automatically curated from leading AI and tech news sources, filtered for relevance to AI security and the ML ecosystem. Stories are scored and ranked based on their relevance to model security, supply chain safety, and the broader AI landscape.
Want to see how your favorite models score on security? Check our model dashboard for trust scores on the top 500 HuggingFace models.