Here's your daily roundup of the most relevant AI and ML news for April 28, 2026, covering 8 research developments. Click through to read the full articles from our curated sources.
Research & Papers
1. Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression
arXiv:2505.13527v4. Abstract: Despite substantial advancements in aligning large language models (LLMs) with human values, current safety mechanisms remain susceptible to jailbreak attacks. We hypothesize that this vulnerability stems from distributional discrepancies...
Source: arXiv - AI | 10 hours ago
2. The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology
arXiv:2505.20435v3. Abstract: Existing interpretability methods for Large Language Models (LLMs) predominantly capture linear directions or isolated features. This overlooks the high-dimensional, relational, and nonlinear geometry of model representations. We apply pe...
Source: arXiv - AI | 10 hours ago
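For readers new to the tool this paper applies, here is a minimal pure-Python sketch of 0-dimensional persistent homology (connected-component persistence computed Kruskal-style with union-find). It illustrates the general technique only; the function name and the toy point cloud are invented for illustration and are not the paper's method or data.

```python
from itertools import combinations

def h0_persistence(points):
    """0-dimensional persistent homology of a point cloud.

    Every point is born at scale 0; components die (merge) as the
    Vietoris-Rips scale grows past pairwise distances.  Returns the
    death scales of the n-1 bars that eventually merge.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (Kruskal-style).
    edges = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5, i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj        # two components merge: one bar dies
            deaths.append(round(d, 6))
    return deaths

# Two well-separated clusters: short deaths inside each cluster, one
# long-lived bar that only dies when the clusters finally connect.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(h0_persistence(pts))  # -> [1.0, 1.0, 10.0]
```

The long bar (death at 10.0) is the kind of persistent, scale-robust feature that topological methods surface and that linear probes miss.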
3. Toward Principled LLM Safety Testing: Solving the Jailbreak Oracle Problem
arXiv:2506.17299v2. Abstract: As large language models (LLMs) become increasingly deployed in safety-critical applications, the lack of systematic methods to assess their vulnerability to jailbreak attacks presents a critical security gap. We introduce the jailbreak o...
Source: arXiv - AI | 10 hours ago
4. Learning Without Adversarial Training: A Physics-Informed Neural Network for Secure Power System State Estimation under False Data Injection Attacks
arXiv:2604.22784v1. Abstract: State estimation is a cornerstone of power system control-center operations, and its robust operation is increasingly a cyber-physical security concern as modern grids become more digitalized and communication-intensive. Neural network-based approa...
Source: arXiv - Machine Learning | 10 hours ago
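As background on the "physics-informed" idea in the title, a hedged scalar toy (the helper `pinn_estimate` and all numbers are invented for illustration, not taken from the paper) shows how adding a physics-residual penalty to a least-squares loss blunts a false data injection:

```python
def pinn_estimate(z, lam, g, p):
    """Closed-form minimiser of a physics-informed least-squares loss
    for a scalar state x:

        L(x) = sum_i (z_i - x)^2 + lam * (p - g*x)^2

    The first term fits the (possibly attacked) measurements z_i; the
    second penalises violation of a known physical relation p = g*x.
    Setting dL/dx = 0 gives the estimate below.
    """
    return (sum(z) + lam * g * p) / (len(z) + lam * g * g)

true_x = 1.0
z = [1.0, 1.0 + 5.0]            # second sensor carries a +5 injection
plain = pinn_estimate(z, lam=0.0, g=1.0, p=1.0)   # ordinary least squares
phys = pinn_estimate(z, lam=10.0, g=1.0, p=1.0)   # physics-informed
print(plain, phys)  # plain is dragged far from 1.0; phys stays close
```

The design point: the physics term acts as a regulariser that the attacker cannot falsify, which is why such losses need no adversarial training to resist injection.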
5. Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training
arXiv:2604.24350v1. Abstract: Fast Adversarial Training (FAT) has attracted significant attention due to its efficiency in enhancing neural network robustness against adversarial attacks. However, FAT is prone to catastrophic overfitting (CO), wherein models overfit to the spec...
Source: arXiv - Machine Learning | 10 hours ago
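For context, FAT typically trains against the single-step FGSM attack. A toy linear-model sketch (the helper `fgsm_step` and all numbers are illustrative, not from the paper) shows the one-step perturbation whose specific structure models can catastrophically overfit to:

```python
def fgsm_step(x, w, y, eps):
    """Single-step FGSM: perturb x by eps in the sign direction of the
    input gradient of the loss L = -y * (w . x) for a linear model."""
    grad = [-y * wi for wi in w]               # dL/dx
    return [xi + eps * (1 if g > 0 else -1 if g < 0 else 0)
            for xi, g in zip(x, grad)]

w = [2.0, -1.0]
x = [1.0, 1.0]                  # score w.x = +1.0, label y = +1: correct
x_adv = fgsm_step(x, w, y=1, eps=0.6)
score = sum(wi * xi for wi, xi in zip(w, x_adv))
print(x_adv, score)             # the one-step perturbation flips the sign
```

Because every adversarial example here has the same rigid `eps * sign(...)` shape, a network can learn to be robust to exactly this pattern while staying vulnerable to multi-step attacks, which is the CO failure mode the paper dissects.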
6. UniAda: Universal Adaptive Multi-objective Adversarial Attack for End-to-End Autonomous Driving Systems
arXiv:2604.23362v1. Abstract: Adversarial attacks play a pivotal role in testing and improving the reliability of deep learning (DL) systems. Existing literature has demonstrated that subtle perturbations to the input can elicit erroneous outcomes, thereby substantially compr...
Source: arXiv - Machine Learning | 10 hours ago
7. SolarTformer: A Transformer Based Deep Learning Approach for Short Term Solar Power Forecasting
arXiv:2604.24306v1. Abstract: Accurate forecasting of solar power output is essential for efficient integration of renewable energy into the grid. In this study, an attention-based deep learning model, inspired by transformer architecture, is used for short-term solar power for...
Source: arXiv - Machine Learning | 10 hours ago
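As background, the attention mechanism such a forecaster builds on can be sketched in a few lines of pure Python (the toy query/key/value numbers are illustrative, not from the paper):

```python
import math

def attention(q, keys, values):
    """Scaled dot-product attention for one query over a history of
    time steps -- the core mechanism an attention-based forecaster
    uses to weight past observations when predicting the next value."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
              for k in keys]
    m = max(scores)                        # stabilised softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    out = [sum(w * v[i] for w, v in zip(weights, values))
           for i in range(len(values[0]))]
    return weights, out

# Toy: the query resembles the second key (e.g. similar weather
# conditions), so most attention weight lands on that time step.
weights, out = attention(q=[1.0, 0.0],
                         keys=[[0.0, 1.0], [1.0, 0.0]],
                         values=[[10.0], [20.0]])
print(weights, out)
```

The output is a convex combination of past values, pulled toward the historical step most similar to the current conditions; stacking this with feed-forward layers gives the transformer-style forecaster the abstract describes.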
8. A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations
arXiv:2604.24468v1. Abstract: Fine-tuning unlocks large language models (LLMs) for specialized applications, but its high computational cost often puts it out of reach for resource-constrained organizations. While cloud platforms could provide the needed resources, data priva...
Source: arXiv - Machine Learning | 10 hours ago
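For readers unfamiliar with split learning, a minimal toy (the two-part split, helper names, and all weights are invented for illustration) shows the core idea the survey covers: the client sends only an intermediate activation across the network, never its raw data:

```python
def client_forward(x, w_client):
    """Client-side half of a split model: raw data stays on-device;
    only the intermediate 'smashed' activation crosses the network."""
    h = sum(wi * xi for wi, xi in zip(w_client, x))
    return max(0.0, h)          # ReLU activation sent to the server

def server_forward(h, w_server):
    """Server-side half: finishes the forward pass from the activation
    alone, without ever seeing the client's raw input."""
    return w_server * h

x = [1.0, 2.0]                  # private client data
smashed = client_forward(x, w_client=[0.5, 0.25])
y = server_forward(smashed, w_server=3.0)
print(smashed, y)               # server sees only the scalar activation
```

Gradients flow back across the same cut during training; the privacy questions the survey examines concern what the server can still infer from those smashed activations and gradients.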
About This Digest
This digest is automatically curated from leading AI and tech news sources, filtered for relevance to AI security and the ML ecosystem. Stories are scored and ranked based on their relevance to model security, supply chain safety, and the broader AI landscape.
Want to see how your favorite models score on security? Check our model dashboard for trust scores on the top 500 HuggingFace models.