Here's your daily roundup of the most relevant AI and ML news for March 03, 2026. This edition covers 8 research developments. Click through to read the full papers from our curated sources.
Research & Papers
1. Adversarial Déjà Vu: Jailbreak Dictionary Learning for Stronger Generalization to Unseen Attacks
arXiv:2510.21910v3 Abstract: Large language models remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Defending against novel jailbreaks represents a critical challenge in AI safety. Adversarial training -- designed to make mode...
Source: arXiv - Machine Learning | 9 hours ago
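The abstract cuts off before the method, but the core idea the title points to -- learning a dictionary of reusable attack components so that unseen jailbreaks decompose into known pieces -- can be sketched with off-the-shelf sparse dictionary learning over prompt embeddings. Everything below (the embedding dimensionality, the number of atoms, the random stand-in data) is an illustrative assumption, not the paper's actual pipeline.

```python
# Sketch: decompose jailbreak-prompt embeddings into a sparse dictionary of
# reusable "attack atoms". The embeddings are random stand-ins; a real run
# would embed actual jailbreak prompts with a sentence encoder.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 384))         # 200 attack prompts, 384-dim embeddings (stand-in)

dl = DictionaryLearning(
    n_components=32,                    # hypothetical number of attack "atoms"
    alpha=1.0,                          # sparsity penalty on the codes
    transform_algorithm="lasso_lars",
    random_state=0,
)
codes = dl.fit_transform(X)             # sparse mixing weights per prompt
atoms = dl.components_                  # learned dictionary of attack directions

# An unseen attack is then approximated as a sparse combination of known
# atoms; large reconstruction error would flag a genuinely novel attack.
x_new = rng.normal(size=(1, 384))
code_new = dl.transform(x_new)
recon_err = np.linalg.norm(x_new - code_new @ atoms)
print(f"mean atoms per prompt: {(codes != 0).sum(axis=1).mean():.1f}, "
      f"new-attack reconstruction error: {recon_err:.2f}")
```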
2. Analyzing Physical Adversarial Example Threats to Machine Learning in Election Systems
arXiv:2603.00481v1 Abstract: Developments in the machine learning voting domain have shown both promising results and risks. Trained models perform well on ballot classification tasks (> 99% accuracy) but are at risk from adversarial example attacks that cause misclassificatio...
Source: arXiv - Machine Learning | 9 hours ago
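For readers unfamiliar with the threat model, the standard digital version of such an attack is easy to demonstrate with FGSM. The toy CNN and input below are placeholders, and a physical attack would additionally need the perturbation to survive printing and rescanning.

```python
# Sketch: a fast-gradient-sign (FGSM) adversarial example against a toy
# ballot classifier. Model and input are illustrative stand-ins.
import torch
import torch.nn as nn

model = nn.Sequential(                  # stand-in for a trained ballot classifier
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 2),                    # e.g. "marked" vs "unmarked" bubble
)
model.eval()

x = torch.rand(1, 1, 64, 64, requires_grad=True)   # scanned ballot region (placeholder)
y = torch.tensor([1])                               # true label

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

eps = 0.03                              # L-inf perturbation budget
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()

print("clean prediction:", model(x).argmax(1).item())
print("adversarial prediction:", model(x_adv).argmax(1).item())
```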
3. Untargeted Jailbreak Attack
arXiv:2510.02999v4 Abstract: Existing gradient-based jailbreak attacks on Large Language Models (LLMs) typically optimize adversarial suffixes to align the LLM output with predefined target responses. However, restricting the objective to inducing fixed targets inher...
Source: arXiv - AI | 9 hours ago
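As context for what "untargeted" changes, here is a minimal sketch of the suffix-optimization loop the abstract describes, using a continuous relaxation over suffix tokens and a toy scoring function in place of a real LLM and the paper's objective. Real attacks like GCG work with discrete token swaps guided by one-hot gradients; this shows only the loop's shape.

```python
# Sketch of gradient-based suffix optimization. Targeted attacks minimize
# loss toward a fixed target string; an untargeted variant instead maximizes
# a scalar objective (a toy scorer stands in here -- the paper's differs).
import torch

vocab, dim, suffix_len = 1000, 64, 10
emb = torch.randn(vocab, dim)               # toy token embeddings
scorer = torch.nn.Linear(dim, 1)            # toy stand-in objective

# Relaxed (soft) suffix: a distribution over tokens at each position.
logits = torch.zeros(suffix_len, vocab, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)

for step in range(100):
    probs = logits.softmax(-1)              # (suffix_len, vocab)
    soft_emb = probs @ emb                  # expected embedding per position
    objective = scorer(soft_emb).mean()     # maximize, rather than matching a fixed target
    opt.zero_grad()
    (-objective).backward()
    opt.step()

suffix_tokens = logits.argmax(-1)           # discretize the relaxed suffix
print("optimized suffix token ids:", suffix_tokens.tolist())
```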
4. JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models
arXiv:2505.17568v3 Abstract: Large Audio Language Models (LALMs) have made significant progress. While increasingly deployed in real-world applications, LALMs face growing safety risks from jailbreak attacks that bypass safety alignment. However, there remains a lack...
Source: arXiv - AI | 9 hours ago
5. FT-Dojo: Towards Autonomous LLM Fine-Tuning with Language Agents
arXiv:2603.01712v1 Abstract: Fine-tuning large language models for vertical domains remains a labor-intensive and expensive process, requiring domain experts to curate data, configure training, and iteratively diagnose model behavior. Despite growing interest in autonomous mac...
Source: arXiv - AI | 9 hours ago
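The workflow being automated -- curate data, configure a run, train, diagnose, revise -- reduces to a simple loop. The skeleton below uses hypothetical stubs (curate_data, train_and_eval, diagnose_and_revise) purely to show that structure; none of these names come from the paper.

```python
# Skeleton of the propose -> train -> diagnose -> revise loop an autonomous
# fine-tuning agent automates. All helpers are hypothetical stubs standing
# in for real components (data curation, a training run, LLM-based diagnosis).
import random

def curate_data(domain: str) -> list[str]:
    return [f"{domain} example {i}" for i in range(100)]    # stub dataset

def train_and_eval(data: list[str], config: dict) -> float:
    # Stub: a real run would fine-tune an LLM and return a validation score.
    random.seed(config["lr"])
    return random.random()

def diagnose_and_revise(config: dict, score: float) -> dict:
    # Stub for the agent step: inspect failures, propose a revised config.
    return {**config, "lr": config["lr"] * (0.5 if score < 0.6 else 1.2)}

config = {"lr": 2e-5, "epochs": 3}
data = curate_data("legal")
best = (None, -1.0)
for round_ in range(5):                 # the loop a human expert would drive by hand
    score = train_and_eval(data, config)
    if score > best[1]:
        best = (dict(config), score)
    config = diagnose_and_revise(config, score)
print("best config:", best)
```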
6. TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?
arXiv:2603.00285v1 Abstract: Evaluating AI agents in finance faces two key challenges: static benchmarks require costly expert annotation yet miss the dynamic decision-making central to real-world trading, while LLM-based judges introduce uncontrolled variance on domain-specif...
Source: arXiv - AI | 9 hours ago
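One way around both problems the abstract names is to score agents by realized outcomes in a simulated market, which is dynamic and needs neither annotators nor an LLM judge. The toy random-walk market and momentum policy below are stand-ins, not TraderBench's actual environment.

```python
# Sketch: scoring an agent by realized P&L in a simulated market --
# deterministic and judge-free. Price path and policy are illustrative.
import numpy as np

rng = np.random.default_rng(42)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 250)))   # toy price path

def agent_decide(window: np.ndarray) -> int:
    """Stand-in policy: +1 long / -1 short on short-term momentum."""
    return 1 if window[-1] > window.mean() else -1

position, cash = 0, 0.0
for t in range(20, len(prices) - 1):
    target = agent_decide(prices[t - 20 : t])
    cash -= (target - position) * prices[t]     # trade to the target position
    position = target
cash += position * prices[-1]                   # mark-to-market at the end

print(f"realized P&L: {cash:.2f}")              # objective score, no judge variance
```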
7. GPU-Fuzz: Finding Memory Errors in Deep Learning Frameworks
arXiv:2602.10478v3 Abstract: GPU memory errors are a critical threat to deep learning (DL) frameworks, leading to crashes or even security issues. We introduce GPU-Fuzz, a fuzzer locating these issues efficiently by modeling operator parameters as formal constraints....
Source: arXiv - Machine Learning | 9 hours ago
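The general recipe -- sample operator parameters under a model of their validity constraints and watch for runtime errors -- can be sketched against a single PyTorch operator. The real system derives constraints formally; the hand-written constraints and random sampling below are simplifications, and this runs on CPU even though the paper targets GPU memory errors.

```python
# Sketch: fuzz one operator's parameter space, oversampling boundary values
# (kernel ~ input size, large channel counts) where errors tend to cluster.
import random
import torch

random.seed(0)

def sample_conv_params():
    # Hand-written parameter model for torch.nn.functional.conv2d.
    k = random.choice([1, 3, 7])
    c_in = random.choice([1, 3, 2**10])
    h = random.choice([k - 1, k, k + 1, 64])    # k - 1 violates kernel <= input
    return dict(
        x=torch.randn(1, c_in, max(h, 0), max(h, 0)),
        w=torch.randn(random.choice([1, 8]), c_in, k, k),
        stride=random.choice([1, 2]),
        padding=random.choice([0, 1]),
    )

for i in range(50):
    p = sample_conv_params()
    try:
        torch.nn.functional.conv2d(p["x"], p["w"], stride=p["stride"], padding=p["padding"])
    except RuntimeError as e:                   # each error here is a fuzzing "finding"
        print(f"case {i}: {e}")
```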
8. Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling
arXiv:2509.23325v2 Abstract: Fine-tuning pretrained models is a standard and effective workflow in modern machine learning. However, robust fine-tuning (RFT), which aims to simultaneously achieve adaptation to a downstream task and robustness to adversarial examples,...
Source: arXiv - AI | 9 hours ago
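The title's idea -- ramping the perturbation budget epsilon during robust fine-tuning rather than fixing it -- is easy to sketch with single-step PGD on a toy model. The linear schedule, architecture, and random data below are assumptions; the paper's actual schedule and setup may differ.

```python
# Sketch: adversarial fine-tuning with a ramped epsilon, so early training
# sees weak attacks and later training sees the full budget.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))  # "pretrained" stand-in
opt = torch.optim.SGD(model.parameters(), lr=0.05)
x_data, y_data = torch.randn(256, 20), torch.randint(0, 2, (256,))

eps_max, steps = 0.3, 100
for step in range(steps):
    eps = eps_max * (step + 1) / steps      # linear epsilon schedule: weak -> strong
    # Inner maximization: one-step PGD (i.e. FGSM) at the current budget.
    x = x_data.clone().requires_grad_(True)
    nn.functional.cross_entropy(model(x), y_data).backward()
    x_adv = (x + eps * x.grad.sign()).detach()
    # Outer minimization on the adversarial batch.
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x_adv), y_data)
    loss.backward()
    opt.step()
print(f"final adversarial loss at eps={eps_max}: {loss.item():.3f}")
```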
About This Digest
This digest is automatically curated from leading AI and tech news sources and filtered for relevance to AI security and the ML ecosystem. Stories are scored and ranked by how strongly they bear on model security, supply chain safety, and the broader AI landscape.
Want to see how your favorite models score on security? Check our model dashboard for trust scores on the top 500 HuggingFace models.