Here's your daily roundup of the most relevant AI and ML news for March 04, 2026. Today's digest includes one security-focused story and seven research developments. Click through to read the full articles from our curated sources.
Security & Safety
1. Show HN: Telos – eBPF/LSM Runtime Security for Autonomous AI Agents
We give autonomous AI agents shell access and API keys, relying on system prompts or Docker for security. This is fundamentally broken. When an agent is hit with an indirect prompt injection, it doesn't download a rootkit. It uses standard, signed binaries like curl or base64 to exfiltrate data. ...
Source: Hacker News - ML Security | just now
Research & Papers
2. NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion
arXiv:2505.20934v2 Announce Type: replace Abstract: Adversarial samples exploit irregularities in the manifold 'learned' by deep learning models to cause misclassifications. The study of these adversarial samples provides insight into the features a model uses to classify inputs, which can be le...
Source: arXiv - Machine Learning | 9 hours ago
3. AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework
arXiv:2603.03233v1 Announce Type: new Abstract: Large Language Models (LLMs) demonstrate potential for automating scientific code generation but face challenges in reliability, error propagation in multi-agent workflows, and evaluation in domains with ill-defined success metrics. We present a B...
Source: arXiv - AI | 9 hours ago
4. Understanding and Mitigating Dataset Corruption in LLM Steering
arXiv:2603.03206v1 Announce Type: new Abstract: Contrastive steering has been shown to be a simple and effective method to adjust the generative behavior of LLMs at inference time. It uses examples of prompt responses with and without a trait to identify a direction in an intermediate activation la...
Source: arXiv - Machine Learning | 9 hours ago
5. Generative adversarial imitation learning for robot swarms: Learning from human demonstrations and trained policies
arXiv:2603.02783v1 Announce Type: cross Abstract: In imitation learning, robots are supposed to learn from demonstrations of the desired behavior. Most of the work in imitation learning for swarm robotics provides the demonstrations as rollouts of an existing policy. In this work, we provide a f...
Source: arXiv - Machine Learning | 9 hours ago
6. You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Models
arXiv:2506.11103v2 Announce Type: replace-cross Abstract: Large language models (LLMs) possess a remarkable ability to perform in-context learning (ICL), which enables them to handle multiple downstream tasks simultaneously without requiring task-specific fine-tuning. Recent studies have shown t...
Source: arXiv - AI | 9 hours ago
7. Toward a Dynamic Stackelberg Game-Theoretic Framework for Agentic AI Defense Against LLM Jailbreaking
arXiv:2507.08207v2 Announce Type: replace Abstract: This paper proposes a game-theoretic framework that models the interaction between prompt engineers and large language models (LLMs) as a two-player extensive-form game coupled with a Rapidly-exploring Random Trees (RRT) search over prompt spac...
Source: arXiv - AI | 9 hours ago
8. Towards a more realistic evaluation of machine learning models for bearing fault diagnosis
arXiv:2509.22267v3 Announce Type: replace Abstract: Reliable detection of bearing faults is essential for maintaining the safety and operational efficiency of rotating machinery. While recent advances in machine learning (ML), particularly deep learning, have shown strong performance in controll...
Source: arXiv - Machine Learning | 9 hours ago
About This Digest
This digest is automatically curated from leading AI and tech news sources, filtered for relevance to AI security and the ML ecosystem. Stories are scored and ranked based on their relevance to model security, supply chain safety, and the broader AI landscape.
Want to see how your favorite models score on security? Check our model dashboard for trust scores on the top 500 Hugging Face models.