Publications [Google Scholar] [dblp] (*: equal contribution)
- LLM Agents Should Employ Security Principles
Kaiyuan Zhang, Zian Su, Pin-Yu Chen, Elisa Bertino, Xiangyu Zhang, Ninghui Li
Preprint 2025
[paper]
- SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks
Kaiyuan Zhang, Siyuan Cheng, Hanxi Guo, Yuetian Chen, Zian Su, Shengwei An, Yuntao Du, Charles Fleming, Ashish Kundu, Xiangyu Zhang, Ninghui Li
Proceedings of the 34th USENIX Security Symposium (Security 2025)
[paper (coming soon)] [code (coming soon)]
- μKE: Matryoshka Unstructured Knowledge Editing of Large Language Models
Zian Su*, Ziyang Huang*, Kaiyuan Zhang, Xiangyu Zhang
Preprint 2025
[paper] [code (coming soon)]
- ProSec: Fortifying Code LLMs with Proactive Security Alignment
Xiangzhe Xu*, Zian Su*, Jinyao Guo, Kaiyuan Zhang, Zhenting Wang, Xiangyu Zhang
Proceedings of the 42nd International Conference on Machine Learning (ICML 2025)
[paper] [code]
- System Prompt Hijacking via Permutation Triggers in LLM Supply Chains
Lu Yan, Siyuan Cheng, Xuan Chen, Kaiyuan Zhang, Guangyu Shen, Xiangyu Zhang
Findings of the Association for Computational Linguistics (ACL Findings 2025)
[paper (coming soon)]
- BAIT: Large Language Model Backdoor Scanning by Inverting Attack Target
Guangyu Shen*, Siyuan Cheng*, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Hanxi Guo, Lu Yan, Xiaolong Jin, Shengwei An, Shiqing Ma, Xiangyu Zhang
Proceedings of the 46th IEEE Symposium on Security and Privacy (Oakland 2025)
[paper] [code]
- CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling
Kaiyuan Zhang, Siyuan Cheng, Guangyu Shen, Bruno Ribeiro, Shengwei An, Pin-Yu Chen, Xiangyu Zhang, Ninghui Li
Proceedings of the 32nd Network and Distributed System Security Symposium (NDSS 2025)
[paper] [code] [slides] [twitter] [website]
- Source Code Foundation Models are Transferable Binary Analysis Knowledge Bases
Zian Su, Xiangzhe Xu, Ziyang Huang, Kaiyuan Zhang, Xiangyu Zhang
Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
[paper] [code]
- BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
Hanxi Guo, Siyuan Cheng, Xiaolong Jin, Zhuo Zhang, Kaiyuan Zhang, Guanhong Tao, Guangyu Shen, Xiangyu Zhang
Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
[paper] [code]
- UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
Siyuan Cheng*, Guangyu Shen*, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Hanxi Guo, Shiqing Ma, Xiangyu Zhang
The 18th European Conference on Computer Vision (ECCV 2024)
[paper] [code]
- Rethinking the Invisible Protection against Unauthorized Image Usage in Stable Diffusion
Shengwei An*, Lu Yan*, Siyuan Cheng, Guangyu Shen, Kaiyuan Zhang, Qiuling Xu, Guanhong Tao, Xiangyu Zhang
Proceedings of the 33rd USENIX Security Symposium (Security 2024)
[paper] [code]
- LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning
Siyuan Cheng, Guanhong Tao, Yingqi Liu, Guangyu Shen, Shengwei An, Shiwei Feng, Xiangzhe Xu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)
[paper] [code]
- Exploring the Orthogonality and Linearity of Backdoor Attacks
Kaiyuan Zhang*, Siyuan Cheng*, Guangyu Shen, Guanhong Tao, Shengwei An, Anuran Makur, Shiqing Ma, Xiangyu Zhang
Proceedings of the 45th IEEE Symposium on Security and Privacy (Oakland 2024)
[paper] [code] [slides] [video] [poster] [website]
- ODSCAN: Backdoor Scanning for Object Detection Models
Siyuan Cheng*, Guangyu Shen*, Guanhong Tao, Kaiyuan Zhang, Zhuo Zhang, Shengwei An, Xiangzhe Xu, Yingqi Liu, Shiqing Ma, Xiangyu Zhang
Proceedings of the 45th IEEE Symposium on Security and Privacy (Oakland 2024)
[paper] [code]
- Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang, Qiuling Xu, Guanhong Tao, Guangyu Shen, Siyuan Cheng, Shiqing Ma, Pin-Yu Chen, Tsung-Yi Ho, Xiangyu Zhang
Proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)
[paper] [code]
- ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP
Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang
Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
[paper]
- Django: Detecting Trojans in Object Detection Models via Gaussian Focus Calibration
Guangyu Shen*, Siyuan Cheng*, Guanhong Tao, Kaiyuan Zhang, Yingqi Liu, Shengwei An, Shiqing Ma, Xiangyu Zhang
Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
[paper] [code]
- Your Exploit is Mine: Instantly Synthesizing Counterattack Smart Contract
Zhuo Zhang, Zhiqiang Lin, Marcelo Morales, Xiangyu Zhang, Kaiyuan Zhang
Proceedings of the 32nd USENIX Security Symposium (Security 2023)
[paper]
- ImU: Physical Impersonating Attack for Face Recognition System with Natural Style Changes
Shengwei An, Yuan Yao, Qiuling Xu, Shiqing Ma, Guanhong Tao, Siyuan Cheng, Kaiyuan Zhang, Yingqi Liu, Guangyu Shen, Ian Kelk, Xiangyu Zhang
Proceedings of the 44th IEEE Symposium on Security and Privacy (Oakland 2023)
[paper] [code]
- Detecting Backdoors in Pre-trained Encoders
Shiwei Feng, Guanhong Tao, Siyuan Cheng, Guangyu Shen, Xiangzhe Xu, Yingqi Liu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)
[paper] [code]
- BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense
Siyuan Cheng, Guanhong Tao, Yingqi Liu, Shengwei An, Xiangzhe Xu, Shiwei Feng, Guangyu Shen, Kaiyuan Zhang, Qiuling Xu, Shiqing Ma, Xiangyu Zhang
Proceedings of the 30th Network and Distributed System Security Symposium (NDSS 2023)
[paper] [code]
- FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning
Kaiyuan Zhang, Guanhong Tao, Qiuling Xu, Siyuan Cheng, Shengwei An, Yingqi Liu, Shiwei Feng, Guangyu Shen, Pin-Yu Chen, Shiqing Ma, Xiangyu Zhang
Proceedings of the Eleventh International Conference on Learning Representations (ICLR 2023)
Best Paper Award 🏆 at the ECCV 2022 Workshop on Adversarial Robustness in the Real World (AROW 2022)
[paper] [code] [slides] [media coverage]